Convert Gemma 2 to HuggingFace #1324
You can check my commit for the conversion script here.
That is fantastic. Thanks.
@hxssgaa I made a quick test of the script, trying to convert a 2B Gemma2 model. However, I am seeing this error:
@hxssgaa I understand this is because it uses the settings from the base.yml file. However, it was not obvious to me how to get the script to rely either on the structure from the loaded model, or on the model yml files. I also see the script refers to convert_maxtext_to_hf.py. Is that a helper file?
Hi @peregilk, I just ran another test of the conversion script for gemma2-2b and didn't hit the issue you are seeing. The converted checkpoint exactly matches the official huggingface gemma2-2b-it. Please use the correct yml setting for conversion; your script should look like:
It's a typo. I already fixed it in the latest commit; the file should be gemma2_orbax_to_hf.py instead.
@hxssgaa Thanks for answering me, and sorry for asking basic questions here. Do you first save/convert the checkpoint locally to disk? I still don't think the example command is exactly correct, but if the checkpoint is stored locally and no specific yml file is required, this is probably just a typo.
@peregilk, no need to save the ckpt locally; you can just point the maxtext_checkpoint to the google bucket checkpoint location. Sorry for the confusion here. I have changed the ckpt conversion command to be similar to JAX_PLATFORMS=cpu python MaxText/gemma2_orbax_to_hf.py MaxText/configs/base.yml
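For reference, a sketch of what a full invocation along these lines might look like. The base command is from this thread; the key=value overrides (`model_name`, `load_parameters_path`, `hf_model_path`) are assumptions following MaxText's usual base.yml override style, not confirmed flags of gemma2_orbax_to_hf.py — check the script's argument handling before running.

```shell
# Hypothetical invocation sketch. Only the first line up to base.yml is
# confirmed by the thread; the overrides below are assumed placeholders.
JAX_PLATFORMS=cpu python MaxText/gemma2_orbax_to_hf.py MaxText/configs/base.yml \
  model_name=gemma2-2b \
  load_parameters_path=gs://your-bucket/path/to/maxtext/checkpoint \
  hf_model_path=/tmp/gemma2-2b-hf
```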
Awesome @hxssgaa. I actually tried something similar, but a small typo in my earlier script prevented it from picking up the correct yaml. Now it works. I can also confirm that I have tried one model "all the way": I get exactly the same MMLU scores as on the original
Has anyone tried converting gemma 3 (4b) to huggingface yet? I have now done gemma 3 model from Kaggle --> maxtext format (orbax) --> continued pretraining --> (Would now like to convert to hf, but there seems to be no script available, and I am trying to do it myself with no luck yet)
@hxssgaa any chance to develop similar code to convert a Gemma 3 checkpoint to HF?
I created such a conversion script based on https://github.com/AI-Hypercomputer/maxtext/blob/f6ebc1662cb944bd7748fb350bba164b13479b68/MaxText/gemma2_orbax_to_hf.py and a bunch of trial and error with gemini 2.5 pro in Cursor. I was then able to run some benchmarks with the converted model, and tested that it would start GRPO finetuning with Unsloth. I can share the script maybe this evening when I am finished with work.
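The core of such a conversion script (as in the gemma2 version it is based on) is restoring the Orbax pytree, renaming and reshaping each parameter to its Hugging Face counterpart, and saving the result. A toy, self-contained sketch of just the key-renaming step — the parameter names below are illustrative assumptions, not the real Gemma checkpoint keys:

```python
# Toy sketch of the renaming step a MaxText -> HF conversion script performs.
# The key names are made up for illustration; a real script enumerates every
# layer and also transposes/reshapes the weight arrays as needed.

def rename_maxtext_keys(params: dict) -> dict:
    """Map hypothetical MaxText-style keys to HF-style keys."""
    # Hypothetical mapping rules, applied as substring replacements.
    rules = [
        ("params/decoder/layers_", "model.layers."),
        ("/self_attention/query/kernel", ".self_attn.q_proj.weight"),
        ("/mlp/wi_0/kernel", ".mlp.gate_proj.weight"),
    ]
    out = {}
    for key, value in params.items():
        new_key = key
        for old, new in rules:
            new_key = new_key.replace(old, new)
        out[new_key] = value
    return out

# Example: one attention weight under the assumed naming.
converted = rename_maxtext_keys(
    {"params/decoder/layers_0/self_attention/query/kernel": "W_q"}
)
print(converted)  # {'model.layers.0.self_attn.q_proj.weight': 'W_q'}
```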
Great @R4ZZ3 |
Hi @salrowili, the file can now be found here:
@gagika can you please take a look |
Are there any scripts for converting Gemma-2 models to HuggingFace? I see there are Llama and Mistral scripts.