WIP: Support transformers weight conversion #3071

Draft
githubnemo wants to merge 1 commit into huggingface:main from githubnemo:feature/weight-conversion

Conversation

@githubnemo
Collaborator

Continuation of PR #2995.
Background: huggingface/transformers#42491 and huggingface/transformers#43261.

This change implements conversion operations for converting some existing PEFT checkpoints, mainly dealing with the fusing of MoE layers in transformers v5.
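As a rough illustration of what "fusing" can mean at the checkpoint level, the sketch below regroups per-expert state-dict entries under a single key, ordered by expert index. The key layout (`<prefix>.experts.<index>.<rest>`) and the function name `group_expert_entries` are assumptions made for this example; this is not the conversion code from this PR or from transformers.

```python
import re
from collections import defaultdict

# Hypothetical key layout: "<prefix>.experts.<index>.<rest>".
_EXPERT_KEY = re.compile(r"^(?P<prefix>.*\.experts)\.(?P<idx>\d+)\.(?P<rest>.+)$")


def group_expert_entries(state_dict):
    """Collect per-expert entries into one list per fused key.

    Keys that do not match the per-expert pattern are passed through unchanged.
    """
    grouped = defaultdict(dict)
    fused = {}
    for key, value in state_dict.items():
        match = _EXPERT_KEY.match(key)
        if match is None:
            fused[key] = value
            continue
        fused_key = f"{match['prefix']}.{match['rest']}"
        grouped[fused_key][int(match["idx"])] = value

    for fused_key, by_index in grouped.items():
        # Require a gap-free range of expert indices so the order is well defined.
        if sorted(by_index) != list(range(len(by_index))):
            raise ValueError(f"missing expert indices for {fused_key!r}")
        fused[fused_key] = [by_index[i] for i in range(len(by_index))]
    return fused


# Usage with placeholder values standing in for tensors.
checkpoint = {
    "m.experts.1.up.lora_A.weight": "expert-1",
    "m.experts.0.up.lora_A.weight": "expert-0",
    "m.attn.q.lora_A.weight": "attn",
}
converted = group_expert_entries(checkpoint)
```

With real tensors, the per-key list would typically be combined into one tensor (for example with `torch.stack`). How the fused layout has to look for a given model is decided by the actual conversion rules, which this sketch does not try to reproduce.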

The code added here is currently a copy of the code that exists in transformers. Once PEFT v0.19 is released, the transformers copy is supposed to be gated so that transformers uses the code in this PR instead.

The copying makes testing a bit difficult, since transformers currently has no routing that depends on the PEFT version. Older transformers versions therefore need to be patched to force the use of the PEFT implementation of the conversion. As soon as the routing is implemented in transformers, we can conditionally disable the patching.
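The conditional patching described above could look roughly like the following sketch. Everything named here is hypothetical: `maybe_patch_conversion`, the attribute being patched, and the `routing_since` threshold are placeholders, since the transformers version that will contain the routing is not known yet.

```python
from contextlib import contextmanager
from types import SimpleNamespace
from unittest import mock


def parse_version(version):
    """Parse the leading numeric components, e.g. "5.0.0.dev0" -> (5, 0, 0)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


@contextmanager
def maybe_patch_conversion(module, attr_name, replacement, installed_version, routing_since):
    """Patch `module.attr_name` only if the installed version lacks the routing.

    `routing_since` is a placeholder for the first version with the routing.
    Yields True if the patch was applied, False otherwise.
    """
    if parse_version(installed_version) >= parse_version(routing_since):
        yield False
        return
    with mock.patch.object(module, attr_name, replacement):
        yield True


# Usage with a stand-in object instead of the real transformers module.
fake_module = SimpleNamespace(convert=lambda: "transformers implementation")
with maybe_patch_conversion(
    fake_module, "convert", lambda: "peft implementation", "5.0.0", "5.1.0"
) as patched:
    result_inside = fake_module.convert()
result_outside = fake_module.convert()
```

Using a context manager keeps the patch scoped to the tests that need it and restores the original attribute afterwards, so removing the patching later only means changing the version check.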

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
