tloen / llama-int8 Public

forked from meta-llama/llama

Fork 105
Star 1.1k

Issues: tloen/llama-int8

14 Open 3 Closed

Issues list

Does this support llama2 as well?

#21 opened Feb 16, 2024 by YaoJiayi

Producing nan Tensors

#20 opened Jun 15, 2023 by Bryan-Lavender

65B on multiple GPUs : CUDA out of memory with 4 x GPU RTX A5000 (24GB) / 96GB in total

#18 opened Mar 14, 2023 by scampion

LLaMA 13B works on a single RTX 4080 16GB

#17 opened Mar 13, 2023 by kcchu

Further detail needed - installing bitsandbytes from source

#16 opened Mar 13, 2023 by chrisbward

Getting error on generation in Windows

#12 opened Mar 10, 2023 by elephantpanda

Can 65B run on 4*32G GPU?

#11 opened Mar 8, 2023 by zhongtao93

Is it possible to save the smaller weights so it doesn't have to convert them each time?

#10 opened Mar 7, 2023 by spullara

Systematic comparison of original models to int8 inferencing

#9 opened Mar 7, 2023 by innokean

When a single A100 80G, memory is about 96G, Error loading 65B

#8 opened Mar 6, 2023 by dpyneo

Any chance to share quantized int8 7B and 13B models?

#6 opened Mar 6, 2023 by progressionnetwork

Does 8GB able to run smallest llama model?

#5 opened Mar 6, 2023 by lucasjinreal

Tracking issue for Mac support

#4 opened Mar 5, 2023 by pannous

13B - load is successful on T4, but forward pass fails

#2 opened Mar 5, 2023 by deep-diver
