modular_model_converter bugfix on assignments #35642
Conversation
Force-pushed from 7be3ec8 to 2580dc6
Hey! Thanks for the contribution! However, the merging rule for assignments was specifically chosen to avoid having to redefine the big docstrings (what you did with Starcoder2) or very common variables while we think of a better solution for automatic docstrings.
@Cyrilvallez hey, thanks for the update. Well, I observed the erroneous documentation in the phi and ijepa modular files, and I figured this should be fixed with the regex patterns (also to avoid any potential issues in the future). The fix had the byproduct of having to redefine the docstring (since the docstring variable name now matches the regex pattern). I guess if the errors in the documentation are less of an issue, then it could be skipped.
I see! Super cool initiative! Let's do it then!
See the comment for some guidance on how to proceed! Basically, I am against adding the DOCSTRING pattern as explained (it should not be needed, see comments, and will make our lives harder while figuring out a better way for docstrings), but the rest is very nice!
utils/modular_model_converter.py
Outdated
ASSIGNMENTS_REGEX_TO_KEEP = [
    r"_CHECKPOINT",
    r"_EXPECTED",
    r"DOCSTRING",
]
Suggested change:

-ASSIGNMENTS_REGEX_TO_KEEP = [
-    r"_CHECKPOINT",
-    r"_EXPECTED",
-    r"DOCSTRING",
-]
+# Similar to the above list, but for regex patterns
+ASSIGNMENTS_REGEX_TO_KEEP = [
+    r"_CHECKPOINT",
+    r"_EXPECTED",
+]
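For context, a keep-list of regex patterns (as in the suggestion above) could be applied roughly as follows. This is an illustrative sketch, not the converter's actual API: the helper name `should_keep_assignment` is hypothetical, and only the list contents come from the suggestion.

```python
import re

# Keep-list of regex patterns, per the suggestion above (DOCSTRING removed).
ASSIGNMENTS_REGEX_TO_KEEP = [r"_CHECKPOINT", r"_EXPECTED"]

def should_keep_assignment(name: str) -> bool:
    """Illustrative helper: keep a top-level assignment from the modular file
    if any pattern matches anywhere in its name (re.search, rather than a
    strict full-name comparison against a fixed list)."""
    return any(re.search(pattern, name) for pattern in ASSIGNMENTS_REGEX_TO_KEEP)

print(should_keep_assignment("_IMAGE_CLASS_CHECKPOINT"))  # True
print(should_keep_assignment("_EXPECTED_OUTPUT_SHAPE"))   # True
print(should_keep_assignment("PHI_INPUTS_DOCSTRING"))     # False
```

Matching anywhere in the name is what lets one short pattern cover variants like `_IMAGE_CLASS_CHECKPOINT` without enumerating each exact variable name.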
Here let's remove the docstring part; as I explained, I am against it because it forces us to redefine annoyingly long docstrings! For Emu3, it is actually an issue because of a very slight oversight: switching
class Emu3TextModel(LlamaModel, Emu3PreTrainedModel):
def __init__(self, config: Emu3Config):
super().__init__(config)
self.layers = nn.ModuleList(
[Emu3DecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
)
in modular_emu3.py
to
class Emu3TextModel(LlamaModel, Emu3PreTrainedModel):
def __init__(self, config: Emu3Config):
super().__init__(config)
self.layers = nn.ModuleList(
[Emu3DecoderLayer(config, layer_idx) for layer_idx in range(config.num_hidden_layers)]
)
@add_start_docstrings_to_model_forward(EMU3_TEXT_INPUTS_DOCSTRING)
def forward(self, **super_kwargs):
return super().forward(**super_kwargs)
automatically takes care of the docstring based on current rules!
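Conceptually, the decorator in the snippet above injects the shared docstring into the wrapped forward at definition time. A minimal sketch of that mechanism follows; it is not the actual transformers implementation (the real utility also formats a template), and the docstring contents are a stand-in:

```python
# Minimal sketch of a docstring-injecting decorator in the spirit of
# add_start_docstrings_to_model_forward: it prepends a shared docstring
# to the decorated function at definition time.
def add_start_docstrings_to_model_forward(docstring):
    def decorator(fn):
        fn.__doc__ = docstring + (fn.__doc__ or "")
        return fn
    return decorator

# Illustrative stand-in for EMU3_TEXT_INPUTS_DOCSTRING.
EMU3_TEXT_INPUTS_DOCSTRING = "Args:\n    input_ids (torch.LongTensor): ...\n"

@add_start_docstrings_to_model_forward(EMU3_TEXT_INPUTS_DOCSTRING)
def forward(self, **super_kwargs):
    """Existing docstring, if any."""
    return None

print(forward.__doc__.startswith("Args:"))  # True
```

Because the docstring is attached by the decorator rather than written inline, the modular file only needs the decorated stub, and no large docstring variable has to be redefined.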
ah now I see, ty for clarifying
Force-pushed from 8d51d2f to 7d1294a
…strings, expected outputs etc.
Force-pushed from 7d1294a to 74ce50a
…cstring assingment, remove verbatim assignments in modular converter
Force-pushed from 74ce50a to eeea23a
@Cyrilvallez thanks for your help. I didn't like my change for the docstring either (nor did I know about the future plans for automatic docstrings), glad to get feedback there.
LGTM, cc @Cyrilvallez
@@ -640,8 +642,7 @@ def forward(
 )

 # Image classification docstring
-_IMAGE_CLASS_CHECKPOINT = "google/ijepa-base-patch16-224"
+_IMAGE_CLASS_CHECKPOINT = "facebook/ijepa_vith14_1k"
_IMAGE_CLASS_CHECKPOINT = "google/ijepa-base-patch16-224"
modular should be changed as well
modular is correct, but the assignment wasn't passing because the name wasn't strictly matching ASSIGNMENTS_TO_KEEP (is this what you meant?)
Although now I see that there is a leftover _IMAGE_CLASS_CHECKPOINT = "google/ijepa-base-patch16-224"
in configuration_ijepa.py
Ah yeah had one wrong in my modular
utils/modular_model_converter.py
Outdated
}
ASSIGNMENTS_REGEX_TO_KEEP = [
    r"_CHECKPOINT",
    r"_EXPECTED",
Might want to add _FOR_DOC?
_CONFIG_FOR_DOC is already handled by VARIABLES_AT_THE_BEGINNING, but I suppose I should add it to cover a few edge cases (like _TOKENIZER_FOR_DOC, although there are no assignments in modular files with that name on the lhs)?
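To illustrate the edge cases discussed above: since the keep-list patterns are applied with a regex search rather than exact-name matching, a single `_FOR_DOC` pattern would catch every doc-related assignment name containing that suffix. A quick, purely illustrative check:

```python
import re

# One _FOR_DOC pattern covers several doc-related names at once,
# since re.search matches anywhere inside the assignment name.
pattern = r"_FOR_DOC"
names = ["_CONFIG_FOR_DOC", "_TOKENIZER_FOR_DOC", "EXPECTED_OUTPUT"]
matched = [name for name in names if re.search(pattern, name)]
print(matched)  # ['_CONFIG_FOR_DOC', '_TOKENIZER_FOR_DOC']
```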
yep
…e in ijepa's configuration
@ArthurZucker I'm frequently using the modeling files of speech SSL models (wav2vec2, hubert, wavlm, etc.) and I see that they don't have modular files yet (but have large duplication). Would you welcome a PR on that?
Yeah for sure! 🤗
If needed you can get more details about modular with #35737 (I rewrote the doc recently) 🤗
awesome, I'll definitely use it
What does this PR do?
This PR improves the logic of the modular_model_converter.py script in order to keep the assignments from modular files.
Who can review?
@ArthurZucker, @Cyrilvallez
Additional details
Besides the changes on the utils/modular_model_converter.py script:

- The modeling_*.py files were regenerated, which resolved some bugs that affected docstrings, e.g. IJEPA's _IMAGE_CLASS checkpoint and EXPECTED_OUTPUT, and phi's checkpoint (which was inherited from Mistral, since the corresponding modular file was lacking a checkpoint definition).
- Starcoder2Model is decorated with STARCODER2_INPUTS_DOCSTRING. Copy-pasting is somewhat of a contradiction to the "modular" logic, but if I'm not mistaken there wasn't an easier solution.