The README.md file says conv_mode=phi was used for training, while the train_qwen2_base.sh file sets conv_mode=qwen2_base. Is this a typo in the README file?
To evaluate models trained with the Qwen2 model, what should the conv_mode argument be set to: qwen2_base or qwen2_instruct? I am assuming it is qwen2_instruct.
Finally, what should one consider when evaluating the model on more benchmarks, such as SugarCrepe? Is it just the conv_mode parameter, or something else as well?
Update:
This is a typo in the README file.
It turns out you need to use qwen2_base for evaluation.
Moreover, I found a bigger bug in the script files: the Phi-2 model has a maximum sequence length of 2048, not 3072.
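For anyone who wants to guard against this kind of mismatch, here is a rough sketch of a check that compares the length requested in a script against the model's positional limit. The function name check_max_length is hypothetical and not part of this repo, and the stand-in config below only mimics the max_position_embeddings field that Hugging Face configs expose; with transformers installed, one would presumably load the real config via AutoConfig.from_pretrained instead.

```python
from types import SimpleNamespace


def check_max_length(config, requested_len):
    """Return a sequence length that does not exceed the model's positional limit.

    `config` is any object that may carry a `max_position_embeddings`
    attribute (as Hugging Face model configs do). Hypothetical helper.
    """
    limit = getattr(config, "max_position_embeddings", None)
    if limit is None:
        # No limit advertised by the config; leave the request unchanged.
        return requested_len
    if requested_len > limit:
        print(f"warning: requested {requested_len} > model limit {limit}; clamping")
        return limit
    return requested_len


# Stand-in for the Phi-2 config (assumption: the limit is 2048, as noted above).
phi2_like = SimpleNamespace(max_position_embeddings=2048)
safe_len = check_max_length(phi2_like, 3072)
```

Running a check like this before training would flag the 3072 value in the script rather than silently exceeding what the model supports.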
Thanks,
Krish
Hi,
Thanks for the great work!
I have a couple of questions.