Fix: take into account meta device #34134
Conversation
tibor-reiss: cc @SunMarc for initial review
SunMarc: Thanks for adding this @tibor-reiss! Can you also add a test so that this doesn't happen in the future?
tibor-reiss: @SunMarc I was not sure where to add the test, so please feel free to move it around. Additionally, the TINY_* models are float32, so I went with the models from the issue's example (they are quite small); let me know if there is a better way.
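A hedged sketch of what such a test could look like, assuming the meta-device path is exercised via `low_cpu_mem_usage=True` (which initializes the model on the meta device before the checkpoint weights are materialized); the test name and its exact placement in `tests/utils/test_modeling_utils.py` are placeholders, not the merged test:

```python
from transformers import AutoModelForCausalLM

def test_loading_from_meta_device():  # hypothetical name/placement
    # low_cpu_mem_usage=True builds the model skeleton on the meta device
    # first, then materializes the checkpoint weights into it.
    model = AutoModelForCausalLM.from_pretrained(
        "fxmarty/tiny-llama-fast-tokenizer",  # small float16 checkpoint
        low_cpu_mem_usage=True,
    )
    # After loading, no parameter should be left on the meta device.
    assert all(p.device.type != "meta" for p in model.parameters())
```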
SunMarc: Thanks for iterating! I left a few suggestions. Does this solve your issue @fxmarty-amd? =)
SunMarc: Friendly ping @fxmarty-amd
SunMarc: Looks good! Sorry for being so late on my review and thanks for the contribution! 🤗
Commit: Update test parameters (Co-authored-by: Marc Sun <[email protected]>)
Force-pushed from 6e25b7d to b782bfb
Squashed commits:
- Do not load for meta device
- Make some minor improvements
- Add test
- Update tests/utils/test_modeling_utils.py: Update test parameters (Co-authored-by: Marc Sun <[email protected]>)
- Make the test simpler

Co-authored-by: Marc Sun <[email protected]>
Fixes #34091
The models given in the issue's example have different `torch_dtype` values, which results in different handling in `check_support_param_buffer_assignment`:

- `fxmarty/small-llama-testing`: `torch_dtype=float32`
- `fxmarty/tiny-llama-fast-tokenizer`: `torch_dtype=float16`
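A minimal sketch, not the exact upstream diff, of the kind of guard this fix adds: when the model's parameters still live on the meta device, assignment-based loading cannot work (meta tensors carry no storage), so the helper should bail out before the dtype comparison. The function name mirrors the real helper; the body is illustrative only.

```python
def check_support_param_buffer_assignment(model_to_load, state_dict, start_prefix=""):
    """Illustrative sketch: decide whether checkpoint tensors can be loaded
    via direct assignment instead of copying."""
    first_key = next(iter(model_to_load.state_dict().keys()), None)
    if first_key is None:
        return False

    param = model_to_load.state_dict()[first_key]
    # Meta tensors have no real storage, so assignment would leave the model
    # without usable weights; fall back to the copying code path.
    if param.device.type == "meta":
        return False

    checkpoint_key = start_prefix + first_key
    if checkpoint_key in state_dict:
        # Assignment is only safe when the dtypes already match; this is why
        # the float32 and float16 checkpoints above took different paths.
        return state_dict[checkpoint_key].dtype == param.dtype
    return False
```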
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@LysandreJik @fxmarty @muellerzr