-
Notifications
You must be signed in to change notification settings - Fork 16
Update recipe for Phi4-mini #95
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
@kimishpatel for review |
config = kwargs.get("config", None) | ||
config = kwargs.get("config") or AutoConfig.from_pretrained(model_name_or_path) | ||
|
||
if hasattr(config, "rope_scaling") and config.rope_scaling is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need such changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, config object can be passed via from_pretrained but can't via optimum-cli. For phi-4 particular, the default config is using long rope which has data-dependent code that causes export issue, we need to override that config to use the default rope instead. Because there is no easy way to pass it via optimum-cli (unless we pass the entire json), it's better we always load the default config here no matter the export workflow is triggered via optimum-cli or from_pretrained.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just want to minimize the customization that we have to do here. Ideally I want to structure it such that this repo can be simplified to a bunch of scripts
) | ||
def test_phi4_text_generation(self): | ||
def test_phi4_text_generation_with_custom_sdpa_and_kv_cache_8da4w_8we(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should you not make a separate test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This model is large (3.2B IIUC), the previous recipe for Phi-4 is not running in CI due to OOM, so it doesn't matter.
This PR is required in order to export phi4-mini via optimum-cli and benchmark it in pytorch/executorch