Add tests for HFTransformers recipe with static cache #2179

Merged
t-vi merged 1 commit into main from kaelan/inplace-tests
Jun 5, 2025

Conversation

KaelanDt
Contributor

@KaelanDt KaelanDt commented Jun 3, 2025

Before submitting
  • Was this discussed/approved via a GitHub issue? (not needed for typos and docs improvements)
  • Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?

What does this PR do?

Partially fixes #2067 (backward to come in a subsequent PR) by adding a generate test with static cache on Qwen2.5 and Llama3.
The two tests take 1:40 in total to run on an L4 GPU.

PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃

@KaelanDt KaelanDt force-pushed the kaelan/inplace-tests branch from 858d468 to d255e79 Compare June 3, 2025 11:17
@KaelanDt KaelanDt requested a review from Borda June 3, 2025 14:05
@t-vi
Collaborator

t-vi commented Jun 3, 2025

On our CI it's more like 4 minutes:

178.00s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[Qwen/Qwen2.5-3B]
64.62s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[unsloth/Llama-3.2-1B]

Is there any way to reduce this further?

@KaelanDt
Contributor Author

KaelanDt commented Jun 3, 2025

In fact, there is a smaller version of Qwen2.5, so I added that.

Better now, @t-vi:

45.52s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[unsloth/Llama-3.2-1B]
42.06s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[Qwen/Qwen2.5-1.5B]

@github-actions github-actions bot added the documentation and ci labels Jun 5, 2025
@KaelanDt KaelanDt force-pushed the kaelan/inplace-tests branch from 161b347 to d6f8751 Compare June 5, 2025 10:37
@github-actions github-actions bot removed the documentation and ci labels Jun 5, 2025
@KaelanDt
Contributor Author

KaelanDt commented Jun 5, 2025

18.20s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[Qwen2ForCausalLM-Qwen2Config]
17.94s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[LlamaForCausalLM-LlamaConfig]
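
For comparing the runs reported in this thread, here is a minimal sketch that sums the `call` durations of each run. It assumes the lines have the shape pytest prints under `--durations`; the helper names `total_call_seconds` and `as_minutes` are hypothetical and not part of the thunder test suite.

```python
import re

# Assumed line shape, as pasted in this thread:
# "<seconds>s <phase>     <test id>"
DURATION_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)s\s+(setup|call|teardown)\s+(\S+)\s*$")


def total_call_seconds(report: str) -> float:
    """Sum the durations of all 'call' phases in a durations report."""
    total = 0.0
    for line in report.splitlines():
        match = DURATION_LINE.match(line)
        if match and match.group(2) == "call":
            total += float(match.group(1))
    return round(total, 2)


def as_minutes(seconds: float) -> str:
    """Format seconds as M:SS, rounded to the nearest second."""
    whole = round(seconds)
    return f"{whole // 60}:{whole % 60:02d}"


# Timings copied verbatim from the three runs reported above.
RUNS = {
    "3B and 1B checkpoints": """
178.00s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[Qwen/Qwen2.5-3B]
64.62s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[unsloth/Llama-3.2-1B]
""",
    "1.5B and 1B checkpoints": """
45.52s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[unsloth/Llama-3.2-1B]
42.06s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[Qwen/Qwen2.5-1.5B]
""",
    "config-parametrized": """
18.20s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[Qwen2ForCausalLM-Qwen2Config]
17.94s call     thunder/tests/test_recipes.py::test_recipe_model_with_cache[LlamaForCausalLM-LlamaConfig]
""",
}

if __name__ == "__main__":
    for name, report in RUNS.items():
        seconds = total_call_seconds(report)
        print(f"{name}: {seconds:.2f}s ({as_minutes(seconds)})")
```

By this sum, the first CI run totals about 4:03, the run with the smaller Qwen2.5 checkpoint about 1:28, and the final parametrization about 0:36.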

Collaborator

@t-vi t-vi left a comment

Supergood, thank you @KaelanDt @Borda

@t-vi t-vi enabled auto-merge (squash) June 5, 2025 11:22
@t-vi t-vi merged commit 45feb17 into main Jun 5, 2025
49 checks passed
@t-vi t-vi deleted the kaelan/inplace-tests branch June 5, 2025 11:22
Development

Successfully merging this pull request may close these issues.

Check inplace correctness for recipes
3 participants