renaming #298

JadenFiotto-Kaufman · 2024-12-02T00:52:37Z

Feature to pass a dictionary of (regex string -> new name) to rename Envoys. Example renaming all 'attn' Envoys to 'attention':

import nnsight

rename = {
    r"\.transformer\.h\.\d+.attn": "attention"
}

model = nnsight.LanguageModel("openai-community/gpt2", dispatch=False, device_map = "auto", rename=rename)

with model.trace("Hello"):
    
    value= model.transformer.h[2].attention.output.save()

Butanium · 2024-12-04T15:35:30Z

This PR would address my use case, and I'm happy for it to be merged. However, I'm not sure about the value of using regex over exact module name matching as in my pull request. While the more general approach is likely better, regex can be more error-prone:

Risk of unintended deletions: If not implemented carefully, renaming with regex can inadvertently delete modules. While this doesn’t break the model entirely, it can lead to issues such as missing envoys for some modules:

from nnsight import LanguageModel
rename = {".*attn": "self_attn"}
broken_model = LanguageModel("gpt2", rename=rename)
print(broken_model)
"""
        ...
        (self_attn): GPT2SdpaAttention(
          (self_attn): Dropout(p=0.1, inplace=False)  # missing envoy for attn modules
        )
        ...
"""
with broken_model.trace("hello"):
    out = broken_model.lm_head.output.save()  # works correctly
    out_broken = broken_model.transformer.h[9].self_attn.c_attn.output.save()  
    # fails with "no attribute 'output'" for conv layer

Regex complexity and pitfalls: Regex introduces more room for error compared to string matching. For example, I initially made two mistakes in my implementation (omitting \. and $), but now it works as intended:

attention_names = [r".*\.attn$", r".*\.self_attention$", r".*\.attention$"]
model_names = [r".*\.transformer$", r".*\.gpt_neox$"]
layer_names = [r".*\.h$"]
ln_names = [r".*\.final_layer_norm$", r".*\.ln_f$"]
lm_head_names = [r".*\.embed_out$"]

llm_rename_dict = {}
for name in attention_names:
    llm_rename_dict[name] = "self_attn"
for name in model_names:
    llm_rename_dict[name] = "model"
for name in layer_names:
    llm_rename_dict[name] = "layers"
for name in ln_names:
    llm_rename_dict[name] = "ln_final"
for name in lm_head_names:
    llm_rename_dict[name] = "lm_head"

patched_model = LanguageModel("gpt2", rename=llm_rename_dict)
print(patched_model)

While regex allows greater flexibility, it can introduce unexpected behaviors if not handled properly. String matching feels simpler and less error-prone for this specific use case, but I'm open to discussing further if there are compelling benefits to the regex approach.

JadenFiotto-Kaufman · 2024-12-05T18:46:56Z

@Butanium Hmm one thing I could do to make at least one of the cases you mentioned easier is remove the starting '.'? Idk I just imagine renaming every attention for 405b would 191 different lines vs one with regex.

I could also auto add '$' to the end of the keys if there wasnt one already.

I could log every time a layer is renamed like

Renaming:
(model.layers.0.self_attn) -> (model.layers.0.attn)
(model.layers.1.self_attn) -> (model.layers.1.attn)
...

I could add a flag to the nnsight.CONFIG.APP like nnsight.CONFIG.APP.REGEX_RENAMING which could be off by default?

Butanium · 2024-12-05T20:39:27Z

You wouldn't need 191 lines for llama if the renaming happens at the module name level instead of the model path level.
Can you think of a use case for which you'd want to rename modules differently depending on their path rather than their name?

JadenFiotto-Kaufman · 2024-12-05T21:46:14Z

@Butanium Fair I see what youre saying! So you would do:

rename = {
    'attn': 'attention'
  }

And just rename all ones that have that name?

That makes sense. I will do that!

Butanium · 2024-12-05T21:55:15Z

Yeah that's basically what I've done in #255

Butanium

Looks good! Do you want me to do some tests locally?

JadenFiotto-Kaufman · 2024-12-07T15:47:54Z

@Butanium If you could add a couple pytests to either /tests/test_lm.py or /tests/test_tiny.py that would be great! And yeah in my PR, we actually dont want to total rename the Envoy.path That should be consistent under the hood regardless of naming. That way when you use it remotely theres no need to convert back.

Butanium · 2024-12-07T16:31:37Z

Will do that properly after all my deadlines (end of December)

Butanium · 2024-12-18T19:06:36Z

@JadenFiotto-Kaufman I'm not sure how to proceed to write the test, as your tests rely on a tiny_model / gpt2 argument that is given to the test function. Is it ok to make a test that creates a model inside its defintion?

renaming

e50782c

JadenFiotto-Kaufman requested review from cadentj and AdamBelfki3 December 2, 2024 00:52

JadenFiotto-Kaufman mentioned this pull request Dec 2, 2024

Add a rename_modules_dict arg to Envoy that allows renaming modules #255

Closed

JadenFiotto-Kaufman added 2 commits December 6, 2024 20:52

Merge branch '0.4' into aliasing

9350d53

Updates @clement

42500f5

Butanium approved these changes Dec 7, 2024

View reviewed changes

JadenFiotto-Kaufman merged commit 8e4f8e9 into 0.4 Dec 7, 2024
1 check passed

JadenFiotto-Kaufman deleted the aliasing branch December 22, 2024 18:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

renaming #298

renaming #298

JadenFiotto-Kaufman commented Dec 2, 2024

Butanium commented Dec 4, 2024

JadenFiotto-Kaufman commented Dec 5, 2024

Butanium commented Dec 5, 2024

JadenFiotto-Kaufman commented Dec 5, 2024

Butanium commented Dec 5, 2024

Butanium left a comment

JadenFiotto-Kaufman commented Dec 7, 2024

Butanium commented Dec 7, 2024

Butanium commented Dec 18, 2024

renaming #298

renaming #298

Conversation

JadenFiotto-Kaufman commented Dec 2, 2024

Butanium commented Dec 4, 2024

JadenFiotto-Kaufman commented Dec 5, 2024

Butanium commented Dec 5, 2024

JadenFiotto-Kaufman commented Dec 5, 2024

Butanium commented Dec 5, 2024

Butanium left a comment

Choose a reason for hiding this comment

JadenFiotto-Kaufman commented Dec 7, 2024

Butanium commented Dec 7, 2024

Butanium commented Dec 18, 2024