renaming #298
This PR would address my use case, and I'm happy for it to be merged. However, I'm not sure about the value of using regex over exact module name matching as in my pull request. While the more general approach is likely better, regex can be more error-prone:
While regex allows greater flexibility, it can introduce unexpected behaviors if not handled properly. String matching feels simpler and less error-prone for this specific use case, but I'm open to discussing further if there are compelling benefits to the regex approach. |
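To make the concern concrete, here is a minimal sketch in plain Python (standard `re` only, not nnsight's implementation). The module paths and both helper functions are hypothetical and exist only for illustration. It shows an unanchored pattern renaming more than intended, and an end-anchored pattern skipping the paths of child modules:

```python
import re

# Hypothetical module paths, chosen only to illustrate the pitfalls.
paths = [
    ".transformer.h.0.attn",
    ".transformer.h.0.attn.c_attn",
    ".transformer.h.0.cross_attn",
]


def rename_unanchored(path: str) -> str:
    """Replace every occurrence of 'attn', wherever it appears in the path."""
    return re.sub(r"attn", "attention", path)


def rename_anchored(path: str) -> str:
    """Replace '.attn' only when it is the final component of the path."""
    return re.sub(r"\.attn$", ".attention", path)


unanchored = [rename_unanchored(p) for p in paths]
anchored = [rename_anchored(p) for p in paths]

for original, loose, strict in zip(paths, unanchored, anchored):
    print(f"{original!r}: unanchored={loose!r}, anchored={strict!r}")
```

The unanchored pattern also rewrites `c_attn` and `cross_attn`, while the anchored one leaves the path of the child module under `attn` untouched, so neither behaves like a plain "rename this module" instruction without extra care.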
@Butanium Hmm, one thing I could do to make at least one of the cases you mentioned easier is remove the leading '.'. Idk, I just imagine renaming every attention module for 405b would take 191 different lines vs. one with regex. I could also automatically add '$' to the end of any key that doesn't already have one. I could log every time a layer is renamed like
I could add a flag to the nnsight.CONFIG.APP like |
You wouldn't need 191 lines for llama if the renaming happens at the module name level instead of the model path level. |
@Butanium Fair, I see what you're saying! So you would do: rename = {'attn': 'attention'} and just rename all the modules that have that name? That makes sense. I will do that! |
Yeah that's basically what I've done in #255 |
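The name-level approach agreed on above can be sketched as follows. This is an assumption-laden illustration in plain Python, not the code from either pull request: `rename_components`, the layer count, and the path layout are all hypothetical. The point is that a single dictionary entry covers every layer, because matching happens on each path component and not on the full path:

```python
from typing import Dict, List


def rename_components(path: str, rename: Dict[str, str]) -> str:
    """Rename each dot-separated component that exactly matches a key."""
    return ".".join(rename.get(part, part) for part in path.split("."))


# One entry is enough, regardless of how many layers the model has.
rename = {"attn": "attention"}

# Hypothetical layer count and path layout, for illustration only.
num_layers = 4
paths: List[str] = [f"model.layers.{i}.attn" for i in range(num_layers)]
paths += [f"model.layers.{i}.mlp" for i in range(num_layers)]

renamed = [rename_components(p, rename) for p in paths]

# Count how many paths were actually changed by the single entry.
changed = sum(1 for old, new in zip(paths, renamed) if old != new)
print(changed, renamed[0])
```

Because the lookup is an exact dictionary match per component, names such as `cross_attn` would be left alone unless they are listed explicitly.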
Looks good! Do you want me to do some tests locally?
@Butanium If you could add a couple pytests to either |
Will do that properly after all my deadlines (end of December) |
@JadenFiotto-Kaufman I'm not sure how to proceed to write the test, as your tests rely on a |
Feature to pass a dictionary of (regex string -> new name) to rename Envoys. Example renaming all 'attn' Envoys to 'attention':