[Question] Exllama V2 kernel may have accuracy issue for group_size 16 #1515


Open

wenhuach21 opened this issue Apr 8, 2025 · 2 comments

Labels: bug (Something isn't working)

Comments

wenhuach21 commented Apr 8, 2025

It seems both the GPTQModel and AutoGPTQ versions of this kernel have an accuracy issue with group_size 16. I tested via AutoRound, but I am not entirely sure; could you run a test on your side?
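One rough end-to-end way to check this is to compare logits between the suspect kernel and a trusted reference on the same checkpoint. A minimal sketch, assuming GPTQModel's public `GPTQModel.load(..., backend=...)` API and `BACKEND` enum from its README; the model path is a placeholder for any group_size=16 GPTQ checkpoint:

```python
# Sketch: compare logits between the Exllama V2 kernel and the Torch
# reference kernel on the same group_size=16 checkpoint.
# BACKEND members and load() signature are assumed from GPTQModel's README.
import torch
from transformers import AutoTokenizer
from gptqmodel import GPTQModel, BACKEND

MODEL_ID = "path/to/gptq-model-group_size-16"  # placeholder checkpoint

def forward_logits(backend):
    model = GPTQModel.load(MODEL_ID, backend=backend)  # assumed to load onto CUDA
    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    ids = tok("The quick brown fox jumps over the lazy dog",
              return_tensors="pt").input_ids.cuda()
    with torch.no_grad():
        out = model(input_ids=ids).logits.float().cpu()
    del model
    torch.cuda.empty_cache()  # free VRAM before loading the second backend
    return out

ref = forward_logits(BACKEND.TORCH)       # slow but trusted reference
exl = forward_logits(BACKEND.EXLLAMA_V2)  # kernel under suspicion
print("max abs logits diff:", (ref - exl).abs().max().item())
```

A large gap here (well beyond fp16 noise) would corroborate a kernel bug rather than ordinary quantization error.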

wenhuach21 added the bug label on Apr 8, 2025
Qubitium (Collaborator) commented Apr 8, 2025

@wenhuach21 We have https://github.com/ModelCloud/GPTQModel/blob/main/tests/test_kernel_output.py and https://github.com/ModelCloud/GPTQModel/blob/main/tests/test_kernel_output_ipex.py to test and compare output quality across the different kernels. I will check whether group size 16 is covered.
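For intuition about what such a kernel test checks, here is a self-contained sketch of the reference computation (plain PyTorch, no GPTQModel internals assumed): quantize a weight per-group at group_size=16, dequantize, and do the matmul in floating point. A correct kernel should reproduce this dequantized reference to near fp16 precision, so any larger deviation points at the kernel itself rather than at quantization error:

```python
# Sketch of the reference a group_size=16 kernel test compares against:
# per-group 4-bit quantization, dequantization, then a fp matmul.
# Pure PyTorch; not GPTQModel's actual test code.
import torch

torch.manual_seed(0)
out_f, in_f, gs, maxq = 128, 256, 16, 2**4 - 1  # 4-bit, group_size=16

w = torch.randn(out_f, in_f)
wg = w.reshape(out_f, in_f // gs, gs)

# Asymmetric per-group quantization (scale + zero point), GPTQ-style.
wmin, wmax = wg.amin(-1, keepdim=True), wg.amax(-1, keepdim=True)
scale = (wmax - wmin) / maxq
zero = torch.round(-wmin / scale)
q = torch.clamp(torch.round(wg / scale) + zero, 0, maxq)
w_dq = ((q - zero) * scale).reshape(out_f, in_f)  # what a correct kernel computes from

x = torch.randn(8, in_f)
ref = x @ w_dq.T  # a kernel's output should match this almost exactly
print("quantization error vs fp weight:", (x @ w.T - ref).abs().max().item())
```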

Qubitium (Collaborator) commented

For now, main has disabled Exllama V2 kernel selection for group_size == 16 until we can get to the bottom of this.
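Until a fix lands, users on older releases can sidestep the kernel by pinning a different backend explicitly. A sketch assuming GPTQModel's `backend=` load argument and `BACKEND` enum; the model path is a placeholder:

```python
from gptqmodel import GPTQModel, BACKEND

# Workaround sketch: for group_size == 16 checkpoints, explicitly pick a
# kernel other than Exllama V2 (BACKEND members assumed from the public API).
model = GPTQModel.load("path/to/gptq-model-group_size-16", backend=BACKEND.TRITON)
```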
