Change layers_pattern logic #2158
Closed
BenjaminBossan wants to merge 1 commit into huggingface:main from
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
BenjaminBossan (Member, Author)

@githubnemo If you have time to look at this, your opinion would be appreciated (but not high prio).
cyyever pushed a commit to cyyever/peft that referenced this pull request on Sep 4, 2025:
…aCollatorForCompletionOnlyLM (huggingface#2158)
Addresses part of #2155.
Description
So far, the `layers_pattern` argument would only work if there was a prefix to the pattern. As an example, if the module name is `decoder.layer.0.attn.to_q` and we pass `layers_pattern="layer"`, this would match. However, if the module name was `layer.0.attn.to_q`, i.e. without a prefix before `"layer"`, it would not work.

Usually, when we create a model with `AutoModelForFoo.from_pretrained`, the `"layer"` part would never be first. However, if we load a model directly, e.g. through `LlamaModel.from_pretrained`, there is actually no prefix. As a consequence, we get no match there.

With this PR, the prefix is made optional, so that the second pattern also matches.
Status
I'm not sure yet if this should be merged, as it is technically backwards incompatible. Users can still target the desired modules by carefully crafting a regex for `target_modules` so that it only matches the desired layer indices. However, this is tedious, and `layers_pattern` was introduced to avoid having to do this.
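As an illustration of that tedious workaround, a hand-crafted `target_modules` regex restricted to layers 0-3 might look like this (the module names and layer range are made up for the example):

```python
import re

# Hypothetical hand-crafted regex: target to_q only in layers 0-3 by
# encoding the layer indices directly, instead of using layers_pattern.
target_modules = r".*\.layer\.[0-3]\.attn\.to_q"

names = ["decoder.layer.0.attn.to_q", "decoder.layer.12.attn.to_q"]
matched = [n for n in names if re.fullmatch(target_modules, n)]
print(matched)  # -> ['decoder.layer.0.attn.to_q']
```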