
modular_model_converter bugfix on assignments #35642

Merged
merged 3 commits into huggingface:main on Jan 21, 2025

Conversation

nikosanto13 (Contributor)

What does this PR do?

This PR improves the logic of the modular_model_converter.py script so that assignments from modular files are kept.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker, @Cyrilvallez

Additional details

Besides the changes to the utils/modular_model_converter.py script (see the sketch after this list):

  • Regenerated modeling_*.py files, which resolved some bugs that affected docstrings, e.g. IJEPA's _IMAGE_CLASS checkpoint and EXPECTED_OUTPUT, and Phi's checkpoint (which was inherited from Mistral, since the corresponding modular file lacked a checkpoint definition)
  • Had to copy-paste the Mistral docstring into modular_starcoder2.py. Apparently this is needed because the forward pass of the modular-defined Starcoder2Model is decorated with STARCODER2_INPUTS_DOCSTRING. Copy-pasting somewhat contradicts the "modular" logic, but if I'm not mistaken there was no easier solution.
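
As a rough sketch of the idea (hypothetical code, not the converter's actual implementation; the helper name should_keep_assignment is made up and the pattern list is abbreviated):

import re

# Sketch: keep an assignment when its name matches any regex pattern,
# instead of only when it appears verbatim in a hard-coded list.
ASSIGNMENTS_REGEX_TO_KEEP = [r"_CHECKPOINT", r"_EXPECTED"]

def should_keep_assignment(name: str) -> bool:
    return any(re.search(pattern, name) for pattern in ASSIGNMENTS_REGEX_TO_KEEP)

assert should_keep_assignment("_IMAGE_CLASS_CHECKPOINT")  # variant name now kept
assert should_keep_assignment("_EXPECTED_OUTPUT_SHAPE")
assert not should_keep_assignment("logger")               # unrelated names still dropped

This is why variant names like IJEPA's _IMAGE_CLASS_CHECKPOINT, which an exact-match list misses, are now preserved.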

@nikosanto13 force-pushed the modular-bugfix-assignments branch 2 times, most recently from 7be3ec8 to 2580dc6 on January 13, 2025 15:36
@Cyrilvallez (Member)

Hey! Thanks for the contribution! However, the merging rule for assignments was specifically chosen to avoid having to redefine the big docstrings (what you did with Starcoder2) or very common variables while we think of a better solution for automatic docstrings.
That said, using regex patterns instead of hard matching for ASSIGNMENTS_TO_KEEP may be a good idea, as I think it's not always exactly "_CHECKPOINT_FOR_DOC" for older models.
Is there a specific reason why you need this change, BTW? 🤗

@nikosanto13 (Contributor, Author) commented Jan 16, 2025

@Cyrilvallez hey, thanks for the update. Well, I observed the erroneous documentation in the Phi and IJEPA modular files and figured it should be fixed with the regex patterns (also to avoid any potential issues in the future). The fix had the byproduct of having to redefine the docstring (since the docstring variable name now matches the regex pattern).

I guess if the documentation errors are less of an issue, then this could be skipped.

@Cyrilvallez (Member)

I see! Super cool initiative! Let's do it then!

@Cyrilvallez (Member) left a comment

See the comment for some guidance on how to proceed! Basically, I am against adding the DOCSTRING pattern as explained (it should not be needed, see comments, and will make our lives harder while figuring out a better way for docstrings), but the rest is very nice!

Comment on lines 524 to 523
ASSIGNMENTS_REGEX_TO_KEEP = [
r"_CHECKPOINT",
r"_EXPECTED",
r"DOCSTRING",
]

@Cyrilvallez (Member)

Suggested change
ASSIGNMENTS_REGEX_TO_KEEP = [
r"_CHECKPOINT",
r"_EXPECTED",
r"DOCSTRING",
]
# Similar to the above list, but for regex patterns
ASSIGNMENTS_REGEX_TO_KEEP = [
r"_CHECKPOINT",
r"_EXPECTED",
]

Here let's remove the docstring part; as I explained, I am against it as it forces us to redefine annoyingly long docstrings! For Emu3, it is actually an issue because of a very slight oversight: switching

class Emu3TextModel(LlamaModel, Emu3PreTrainedModel):
    def __init__(self, config: Emu3Config):
        super().__init__(config)
        self.layers = nn.ModuleList(
            [Emu3DecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
        )

in modular_emu3.py to

class Emu3TextModel(LlamaModel, Emu3PreTrainedModel):
    def __init__(self, config: Emu3Config):
        super().__init__(config)
        self.layers = nn.ModuleList(
            [Emu3DecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
        )

    @add_start_docstrings_to_model_forward(EMU3_TEXT_INPUTS_DOCSTRING)
    def forward(self, **super_kwargs):
        super().forward(**super_kwargs)

automatically takes care of the docstring based on current rules!

@nikosanto13 (Contributor, Author)

Ah, now I see. Thanks for clarifying!

utils/modular_model_converter.py (outdated review thread, resolved)
src/transformers/models/starcoder2/modular_starcoder2.py (outdated review thread, resolved)
@nikosanto13 force-pushed the modular-bugfix-assignments branch 2 times, most recently from 8d51d2f to 7d1294a on January 17, 2025 21:47
@nikosanto13 force-pushed the modular-bugfix-assignments branch from 7d1294a to 74ce50a on January 17, 2025 22:00
…cstring assignment, remove verbatim assignments in modular converter
@nikosanto13 force-pushed the modular-bugfix-assignments branch from 74ce50a to eeea23a on January 17, 2025 22:05
@nikosanto13 (Contributor, Author)

@Cyrilvallez thanks for your help. I didn't like my change for the docstring either (nor did I know about the future plans for automatic docstrings), glad to get feedback there.

@ArthurZucker (Collaborator) left a comment

LGTM, cc @Cyrilvallez

@@ -640,8 +642,7 @@ def forward(
)


# Image classification docstring
_IMAGE_CLASS_CHECKPOINT = "google/ijepa-base-patch16-224"
_IMAGE_CLASS_CHECKPOINT = "facebook/ijepa_vith14_1k"
@ArthurZucker (Collaborator)

Regarding _IMAGE_CLASS_CHECKPOINT = "google/ijepa-base-patch16-224": the modular should be changed as well.

@nikosanto13 (Contributor, Author)

The modular is correct, but the assignment wasn't passing the filter because the name wasn't strictly matching ASSIGNMENTS_TO_KEEP (is this what you meant?)

Although now I see that there is a leftover _IMAGE_CLASS_CHECKPOINT = "google/ijepa-base-patch16-224" in configuration_ijepa.py

@ArthurZucker (Collaborator) commented Jan 21, 2025

Ah yeah, I had one wrong in my modular.

}
ASSIGNMENTS_REGEX_TO_KEEP = [
r"_CHECKPOINT",
r"_EXPECTED",
@ArthurZucker (Collaborator)

Might want to add _FOR_DOC?

@nikosanto13 (Contributor, Author)

_CONFIG_FOR_DOC is already handled by VARIABLES_AT_THE_BEGINNING, but I suppose I should add it to cover a few edge cases (like _TOKENIZER_FOR_DOC, although there are currently no assignments in modular files with that name on the LHS)?
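
For illustration (a hypothetical snippet, not the converter's code), adding r"_FOR_DOC" would make the whole *_FOR_DOC family match:

import re

# With r"_FOR_DOC" added, every *_FOR_DOC name is kept, including
# _TOKENIZER_FOR_DOC, which the other two patterns alone would miss.
patterns = [r"_CHECKPOINT", r"_EXPECTED", r"_FOR_DOC"]
for name in ["_CONFIG_FOR_DOC", "_TOKENIZER_FOR_DOC", "_CHECKPOINT_FOR_DOC"]:
    assert any(re.search(p, name) for p in patterns)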

@ArthurZucker (Collaborator)

yep

@qubvel removed their request for review on January 20, 2025 18:12
@ArthurZucker merged commit 920f34a into huggingface:main on Jan 21, 2025
14 checks passed
@nikosanto13 (Contributor, Author)

@ArthurZucker I'm frequently using the modeling files of speech SSL models (wav2vec2, hubert, wavlm, etc.) and I see that they don't have modular files yet (despite a lot of duplication). Would you welcome a PR on that?

@ArthurZucker (Collaborator)

Yeah for sure! 🤗

@Cyrilvallez (Member)

If needed, you can get more details about modular from #35737 (I rewrote the doc recently) 🤗

@nikosanto13 (Contributor, Author)

awesome, I'll definitely use it
