Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

renaming #298

Merged
merged 3 commits into from
Dec 7, 2024
Merged

renaming #298

merged 3 commits into from
Dec 7, 2024

Conversation

JadenFiotto-Kaufman
Copy link
Member

Feature to pass a dictionary of (regex string -> new name) to rename Envoys. Example renaming all 'attn' Envoys to 'attention':

import nnsight

rename = {
    r"\.transformer\.h\.\d+.attn": "attention"
}

model = nnsight.LanguageModel("openai-community/gpt2", dispatch=False, device_map = "auto", rename=rename)

with model.trace("Hello"):
    
    value= model.transformer.h[2].attention.output.save()

@Butanium
Copy link
Contributor

Butanium commented Dec 4, 2024

This PR would address my use case, and I'm happy for it to be merged. However, I'm not sure about the value of using regex over exact module name matching as in my pull request. While the more general approach is likely better, regex can be more error-prone:

  1. Risk of unintended deletions: If not implemented carefully, renaming with regex can inadvertently delete modules. While this doesn’t break the model entirely, it can lead to issues such as missing envoys for some modules:
    from nnsight import LanguageModel
    rename = {".*attn": "self_attn"}
    broken_model = LanguageModel("gpt2", rename=rename)
    print(broken_model)
    """
            ...
            (self_attn): GPT2SdpaAttention(
              (self_attn): Dropout(p=0.1, inplace=False)  # missing envoy for attn modules
            )
            ...
    """
    with broken_model.trace("hello"):
        out = broken_model.lm_head.output.save()  # works correctly
        out_broken = broken_model.transformer.h[9].self_attn.c_attn.output.save()  
        # fails with "no attribute 'output'" for conv layer
  2. Regex complexity and pitfalls: Regex introduces more room for error compared to string matching. For example, I initially made two mistakes in my implementation (omitting \. and $), but now it works as intended:
    attention_names = [r".*\.attn$", r".*\.self_attention$", r".*\.attention$"]
    model_names = [r".*\.transformer$", r".*\.gpt_neox$"]
    layer_names = [r".*\.h$"]
    ln_names = [r".*\.final_layer_norm$", r".*\.ln_f$"]
    lm_head_names = [r".*\.embed_out$"]
    
    llm_rename_dict = {}
    for name in attention_names:
        llm_rename_dict[name] = "self_attn"
    for name in model_names:
        llm_rename_dict[name] = "model"
    for name in layer_names:
        llm_rename_dict[name] = "layers"
    for name in ln_names:
        llm_rename_dict[name] = "ln_final"
    for name in lm_head_names:
        llm_rename_dict[name] = "lm_head"
    
    patched_model = LanguageModel("gpt2", rename=llm_rename_dict)
    print(patched_model)

While regex allows greater flexibility, it can introduce unexpected behaviors if not handled properly. String matching feels simpler and less error-prone for this specific use case, but I'm open to discussing further if there are compelling benefits to the regex approach.

@JadenFiotto-Kaufman
Copy link
Member Author

@Butanium Hmm one thing I could do to make at least one of the cases you mentioned easier is remove the starting '.'? Idk I just imagine renaming every attention for 405b would 191 different lines vs one with regex.

I could also auto add '$' to the end of the keys if there wasnt one already.

I could log every time a layer is renamed like

Renaming:
(model.layers.0.self_attn) -> (model.layers.0.attn)
(model.layers.1.self_attn) -> (model.layers.1.attn)
...

I could add a flag to the nnsight.CONFIG.APP like nnsight.CONFIG.APP.REGEX_RENAMING which could be off by default?

@Butanium
Copy link
Contributor

Butanium commented Dec 5, 2024

You wouldn't need 191 lines for llama if the renaming happens at the module name level instead of the model path level.
Can you think of a use case for which you'd want to rename modules differently depending on their path rather than their name?

@JadenFiotto-Kaufman
Copy link
Member Author

@Butanium Fair I see what youre saying! So you would do:

rename = {
    'attn': 'attention'
  }

And just rename all ones that have that name?

That makes sense. I will do that!

@Butanium
Copy link
Contributor

Butanium commented Dec 5, 2024

Yeah that's basically what I've done in #255

Copy link
Contributor

@Butanium Butanium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Do you want me to do some tests locally?

@JadenFiotto-Kaufman
Copy link
Member Author

@Butanium If you could add a couple pytests to either /tests/test_lm.py or /tests/test_tiny.py that would be great! And yeah in my PR, we actually dont want to total rename the Envoy.path That should be consistent under the hood regardless of naming. That way when you use it remotely theres no need to convert back.

@JadenFiotto-Kaufman JadenFiotto-Kaufman merged commit 8e4f8e9 into 0.4 Dec 7, 2024
1 check passed
@Butanium
Copy link
Contributor

Butanium commented Dec 7, 2024

Will do that properly after all my deadlines (end of December)

@Butanium
Copy link
Contributor

@JadenFiotto-Kaufman I'm not sure how to proceed to write the test, as your tests rely on a tiny_model / gpt2 argument that is given to the test function. Is it ok to make a test that creates a model inside its defintion?

@JadenFiotto-Kaufman JadenFiotto-Kaufman deleted the aliasing branch December 22, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants