[Feature][Config] Add quantized Llama 405B inference + job config #1391

Open
wizeng23 opened this issue Feb 6, 2025 · 2 comments · May be fixed by #1477
Labels: enhancement (New feature or request), Feature, good first issue (Good for newcomers)

Comments

@wizeng23
Contributor

wizeng23 commented Feb 6, 2025

Feature request

Possible quantized model to use: https://huggingface.co/neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w8a8
It may be able to fit on 8x A100 GPUs (40 GB or 80 GB) on GCP.
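
A rough back-of-the-envelope check of the memory question above, as a sketch only. It assumes a nominal 405e9 parameters stored at 1 byte each (8-bit weights) and counts weights alone; KV cache, activations, and runtime overhead are ignored, and the actual checkpoint size may differ. The function name and constants are illustrative, not part of any existing config.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


NUM_PARAMS = 405e9    # assumption: nominal parameter count
BYTES_PER_PARAM = 1   # assumption: 8-bit weights (w8a8)
NUM_GPUS = 8

weights_gb = weight_memory_gb(NUM_PARAMS, BYTES_PER_PARAM)

# Compare the weight footprint against the aggregate memory of each A100 variant.
headroom_gb = {}
for per_gpu_gb in (40, 80):
    total_gb = NUM_GPUS * per_gpu_gb
    headroom_gb[per_gpu_gb] = total_gb - weights_gb
    status = "fits" if headroom_gb[per_gpu_gb] > 0 else "does not fit"
    print(
        f"{NUM_GPUS}x A100 {per_gpu_gb}GB: {total_gb} GB total, "
        f"weights ~{weights_gb:.0f} GB, headroom {headroom_gb[per_gpu_gb]:.0f} GB ({status})"
    )
```

Under these assumptions the weights alone come to about 405 GB, which exceeds the 320 GB available on 8x 40 GB GPUs and leaves roughly 235 GB of headroom on 8x 80 GB GPUs, so the 80 GB variant looks like the safer target.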

Motivation / references

This is one of the best-performing open-weight models, and it should be possible to host locally when quantized. Possible use cases include serving as an LLM judge.

Your contribution

N/A

wizeng23 added the enhancement, Feature, good first issue, and triage labels Feb 6, 2025
wizeng23 self-assigned this Feb 6, 2025
wizeng23 changed the title from "[Feature] Add quantized Llama 405B inference + job config" to "[Feature][Config] Add quantized Llama 405B inference + job config" Feb 7, 2025
wizeng23 removed their assignment Feb 7, 2025
taenin removed the triage label Feb 12, 2025
@devampatel03

Hi @wizeng23! I would like to work on this. Can you assign this issue to me?

@wizeng23
Contributor Author

Done! Appreciate the help with this :) Please let me know if you have any questions!

devampatel03 linked a pull request Feb 24, 2025 that will close this issue