Issues: vllm-project/llm-compressor
Issues list
Extend e2e tests to add asym support for W8A8-Int8
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1344
opened Apr 10, 2025 by
dsikka
Extend timings to lm-eval testing
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1340
opened Apr 9, 2025 by
dsikka
[Request] Update vision model examples
enhancement
New feature or request
#1327
opened Apr 5, 2025 by
T145
Recipe.model_dump() output is not valid input for Recipe.model_validate(...)
bug
#1319
opened Apr 2, 2025 by
rahul-tuli
[Gemma3] - FP8 Dynamic KeyError: 'vision_model.encoder.layers.0.mlp.fc1.weight_scale'
bug
Something isn't working
#1306
opened Apr 1, 2025 by
m4r1k
[Gemma3] - oneshot doesn't output preprocessor_config.json & processor_config.json
bug
Something isn't working
#1305
opened Apr 1, 2025 by
m4r1k
Can't quantize kv cache: observer = self.k_observers[layer_idx] list index out of range
bug
Something isn't working
#1295
opened Mar 28, 2025 by
DreamGenX
Very slow saving of models due to irrelevant stats computation
enhancement
New feature or request
#1293
opened Mar 28, 2025 by
eldarkurtic
Has anyone successfully quantized Deepseek-R1 to w8a8?
question
Further information is requested
#1274
opened Mar 21, 2025 by
taishan1994
prebuilt docker images?
question
Further information is requested
#1252
opened Mar 13, 2025 by
shensimeteor
Hope to support tensor parallelism
enhancement
New feature or request
#1249
opened Mar 13, 2025 by
Arcmoon-Hu
Support gemma3
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1248
opened Mar 13, 2025 by
zf-inworld
Uses 1.4TB of CPU memory but only 300MB of GPU memory
bug
Something isn't working
#1240
opened Mar 11, 2025 by
mmdbhs
Cuda out of memory while loading the quantized model
bug
Something isn't working
vllm
Using vLLM
#1234
opened Mar 7, 2025 by
Bakht-Ullah
Quantization Memory Requirements
question
Further information is requested
#1228
opened Mar 5, 2025 by
sneha5gsm
Fail when invalid arguments are provided in the recipe for a modifier
bug
Something isn't working
good first issue
A good first issue for users wanting to contribute
#1226
opened Mar 5, 2025 by
dsikka
Fail when faulty recipes are provided during oneshot
bug
Something isn't working
good first issue
A good first issue for users wanting to contribute
#1225
opened Mar 5, 2025 by
dsikka
Expand tests for ModuleSparsificationInfo
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1224
opened Mar 5, 2025 by
dsikka
Update to use loguru
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1223
opened Mar 5, 2025 by
dsikka
Update observers to make MSE the default
enhancement
New feature or request
good first issue
A good first issue for users wanting to contribute
#1222
opened Mar 5, 2025 by
dsikka
Incomplete warning message during oneshot post process - WARNING - Optimized model not saved. To save, please provide
bug
Something isn't working
good first issue
A good first issue for users wanting to contribute
#1219
opened Mar 3, 2025 by
dsikka
Lazy Loading of Weights for Large Model Quantization
enhancement
New feature or request
#1216
opened Mar 1, 2025 by
zjnyly