Skip to content

Change layers_pattern logic#2158

Closed
BenjaminBossan wants to merge 1 commit intohuggingface:mainfrom
BenjaminBossan:change-layers-pattern-logic
Closed

Change layers_pattern logic#2158
BenjaminBossan wants to merge 1 commit intohuggingface:mainfrom
BenjaminBossan:change-layers-pattern-logic

Conversation

@BenjaminBossan
Copy link
Member

@BenjaminBossan BenjaminBossan commented Oct 17, 2024

Addresses part of #2155.

Description

So far, the layers_pattern argument would only work if there was a prefix to the pattern. As an example, if the module name is:

decoder.layer.0.attn.to_q

and we pass layers_pattern="layer", this would match. However, if the module name was:

layer.0.attn.to_q

i.e. without prefix before "layer", it would not work.

Usually, when we create a model with AutoModelForFoo.from_pretrained, the "layer" part would never be first. However, if we load a model directly, e.g. through LlamaModel.from_pretrained, there is actually no prefix. As a consequence, we get no match there.

With this PR, the prefix is made optional, so that the second pattern also matches.

Status

I'm not sure yet if this should be merged, as it is technically backwards incompatible. Users can still target the desired modules by carefully crafting a regex for target_modules so that it only matches the desired layer indices. However, this is tedious and layers_pattern was introduced to avoid having to do this.

Addreses part of huggingface#2155.

Description

So far, the layers_pattern argument would only work if there was a
prefix to the pattern. As an example, if the module name is:

decoder.layer.0.attn.to_q

and we pass layers_pattern="layer", this would match. However, if the
module name was:

layer.0.attn.to_q

it would not work.

Usually, when we create a model with AutoModelForFoo.from_pretrained,
the "layer" part would never be first. However, if we load a model
directly, e.g. through LlamaModel.from_pretrained, there is actually no
prefix. As a consequence, we get no match there.

With this PR, the prefix is made optional, so that the second pattern
also matches.

Status

I'm not sure yet if this should be merged, as it is technically
backwards incompatible. Users can still target the desired modules by
carefully crafting a regex for target_modules so that it only matches
the desired layer indices. However, this is tedious and layers_pattern
was introduced to avoid having to do this.
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan
Copy link
Member Author

@githubnemo If you have time to look at this, your opinion would be appreciated (but not high prio).

@github-actions
Copy link

github-actions bot commented Jan 7, 2025

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@github-actions github-actions bot closed this Jan 16, 2025
@github-actions github-actions bot closed this Jan 25, 2025
@github-actions github-actions bot closed this Feb 4, 2025
cyyever pushed a commit to cyyever/peft that referenced this pull request Sep 4, 2025
…aCollatorForCompletionOnlyLM (huggingface#2158)

* feat: Add info to batch in DataCollatorForCompletionOnlyLM

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: formatting

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* feat: Add info to batch in DataCollatorForCompletionOnlyLM

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: formatting

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: max_length_k to int

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix:Added comments

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* test cases

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* test cases

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* test cases

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* feat: Add info to batch in DataCollatorForCompletionOnlyLM

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* fix: formatting

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* feat: Add info to batch in DataCollatorForCompletionOnlyLM

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* test cases

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* test cases

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* test cases

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* unit test changes

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>

* style

* add test

* remove test

---------

Signed-off-by: Abhishek <maurya.abhishek@ibm.com>
Co-authored-by: Quentin Gallouédec <quentin.gallouedec@huggingface.co>
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants