[BUG] vllm support for QQQ format checkpoints #1501

Open
jmkuebler opened this issue Apr 4, 2025 · 2 comments
Labels: bug (Something isn't working)

jmkuebler commented Apr 4, 2025

Describe the bug

I created a model with QuantizeConfig(bits=4, group_size=128, quant_method="qqq", format="qqq") and can load it successfully with the built-in loader:

GPTQModel.load(self_made_qqq_model)

However, I cannot load the checkpoint in vLLM. Even when loading through your library directly,

GPTQModel.load(self_made_qqq_model, backend="vllm")

I get:

BACKEND.VLLM backend only supports FORMAT.GPTQ: actual = qqq
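
For completeness, a minimal repro sketch. The checkpoint path is a placeholder, and the QQQ quantization step (done with the config above) is assumed to have already produced the checkpoint; the imports assume the standard gptqmodel package layout:

```python
from gptqmodel import GPTQModel

# Placeholder path to a checkpoint quantized with
# QuantizeConfig(bits=4, group_size=128, quant_method="qqq", format="qqq").
self_made_qqq_model = "path/to/self_made_qqq_model"

# Loading with the default (native) backend works.
model = GPTQModel.load(self_made_qqq_model)

# Loading through the vLLM backend fails with:
#   BACKEND.VLLM backend only supports FORMAT.GPTQ: actual = qqq
model = GPTQModel.load(self_made_qqq_model, backend="vllm")
```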

@Qubitium was this explicitly tested? https://github.com/ModelCloud/GPTQModel?tab=readme-ov-file#quantization-support

For reference: support was added in #1402.

jmkuebler added the bug label on Apr 4, 2025
jmkuebler (Author) commented:

My version of GPTQModel is at the latest commit ca9d634db6a933cfae2c2d8e8be3fe78f76b802d (at the time of opening the issue).

Qubitium (Collaborator) commented Apr 4, 2025

@jmkuebler Thanks for the bug report. Yes, vLLM loading of QQQ checkpoints via the GPTQModel config has not been added yet, but we will do it soon.

We plan to add one or two more quantization algorithms to GPTQModel very soon and will add the appropriate vLLM loading hooks at that time.
