Update recipe for Phi4-mini #95


Merged
1 commit merged on Jun 26, 2025

Conversation

@guangy10 (Collaborator) commented Jun 26, 2025

This PR is required to export Phi4-mini via optimum-cli and benchmark it in pytorch/executorch.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@guangy10 guangy10 marked this pull request as ready for review June 26, 2025 02:00
@guangy10 guangy10 requested a review from echarlaix June 26, 2025 02:00
@guangy10 (Collaborator, Author)

@kimishpatel for review

@guangy10 guangy10 merged commit a394262 into huggingface:main Jun 26, 2025
110 of 111 checks passed
@guangy10 guangy10 deleted the update_phi4 branch June 26, 2025 02:34
-    config = kwargs.get("config", None)
+    config = kwargs.get("config") or AutoConfig.from_pretrained(model_name_or_path)

if hasattr(config, "rope_scaling") and config.rope_scaling is not None:
Contributor


why do we need such changes?

Collaborator Author


Yeah, a config object can be passed via from_pretrained but not via optimum-cli. For Phi-4 in particular, the default config uses long RoPE, which has data-dependent code that causes export issues, so we need to override that config to use the default RoPE instead. Because there is no easy way to pass a config via optimum-cli (unless we pass the entire JSON), it's better to always load the default config here, regardless of whether the export workflow is triggered via optimum-cli or from_pretrained.
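A minimal, illustrative sketch of the config handling described above (not the exact PR code). In the real recipe the default config would come from transformers.AutoConfig.from_pretrained; here a plain stand-in object keeps the sketch self-contained, and the function name prepare_export_config is hypothetical.

```python
from types import SimpleNamespace

def prepare_export_config(default_config, **kwargs):
    # Prefer an explicitly passed config (the from_pretrained path);
    # otherwise fall back to the model's default config (the optimum-cli
    # path, where no config object can be passed through).
    config = kwargs.get("config") or default_config

    # Phi-4-mini's default long-RoPE scaling has data-dependent code that
    # breaks export, so drop it to fall back to the default RoPE.
    if getattr(config, "rope_scaling", None) is not None:
        config.rope_scaling = None
    return config

# Stand-in for Phi-4-mini's default config with long-RoPE scaling enabled.
phi4_default = SimpleNamespace(rope_scaling={"rope_type": "longrope"})
cfg = prepare_export_config(phi4_default)
print(cfg.rope_scaling)  # None
```

The key design point is that the override happens inside the recipe itself, so both entry points (optimum-cli and from_pretrained) get an export-friendly config without any extra user-facing flags.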

Contributor


I just want to minimize the customization we have to do here. Ideally I'd structure it such that this repo can be simplified to a bunch of scripts.

 )
-    def test_phi4_text_generation(self):
+    def test_phi4_text_generation_with_custom_sdpa_and_kv_cache_8da4w_8we(self):
Contributor


should you not make a separate test?

Collaborator Author


This model is large (3.2B parameters, IIUC); the previous Phi-4 recipe wasn't running in CI due to OOM anyway, so it doesn't matter.

3 participants