[Question] Exllama V2 kernel may have accuracy issue for group_size 16 #1515


Open

wenhuach21 opened this issue Apr 8, 2025 · 2 comments

Labels: bug (Something isn't working)

Comments

wenhuach21 commented Apr 8, 2025

It seems both the GPTQModel and AutoGPTQ versions of this kernel have an accuracy issue with group_size 16. I tested via AutoRound, but I am not entirely sure; could you run a test on your side?
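One rough end-to-end way to check this is to compare logits between the suspect kernel and a trusted reference on the same checkpoint. A minimal sketch, assuming GPTQModel's public `GPTQModel.load(..., backend=...)` API and `BACKEND` enum from its README; the model path is a placeholder for any group_size=16 GPTQ checkpoint:

```python
# Sketch: compare logits between the Exllama V2 kernel and the Torch
# reference kernel on the same group_size=16 checkpoint.
# BACKEND members and load() signature are assumed from GPTQModel's README.
import torch
from transformers import AutoTokenizer
from gptqmodel import GPTQModel, BACKEND

MODEL_ID = "path/to/gptq-model-group_size-16"  # placeholder checkpoint

def forward_logits(backend):
    model = GPTQModel.load(MODEL_ID, backend=backend)  # assumed to load onto CUDA
    tok = AutoTokenizer.from_pretrained(MODEL_ID)
    ids = tok("The quick brown fox jumps over the lazy dog",
              return_tensors="pt").input_ids.cuda()
    with torch.no_grad():
        out = model(input_ids=ids).logits.float().cpu()
    del model
    torch.cuda.empty_cache()  # free VRAM before loading the second backend
    return out

ref = forward_logits(BACKEND.TORCH)       # slow but trusted reference
exl = forward_logits(BACKEND.EXLLAMA_V2)  # kernel under suspicion
print("max abs logits diff:", (ref - exl).abs().max().item())
```

A large gap here (well beyond fp16 noise) would corroborate a kernel bug rather than ordinary quantization error.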

wenhuach21 added the bug label on Apr 8, 2025
Qubitium (Collaborator) commented Apr 8, 2025

@wenhuach21 We have https://github.com/ModelCloud/GPTQModel/blob/main/tests/test_kernel_output.py and https://github.com/ModelCloud/GPTQModel/blob/main/tests/test_kernel_output_ipex.py to test and compare output quality across the different kernels. I will check whether group size 16 is covered.
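For intuition about what such a kernel test checks, here is a self-contained sketch of the reference computation (plain PyTorch, no GPTQModel internals assumed): quantize a weight per-group at group_size=16, dequantize, and do the matmul in floating point. A correct kernel should reproduce this dequantized reference to near fp16 precision, so any larger deviation points at the kernel itself rather than at quantization error:

```python
# Sketch of the reference a group_size=16 kernel test compares against:
# per-group 4-bit quantization, dequantization, then a fp matmul.
# Pure PyTorch; not GPTQModel's actual test code.
import torch

torch.manual_seed(0)
out_f, in_f, gs, maxq = 128, 256, 16, 2**4 - 1  # 4-bit, group_size=16

w = torch.randn(out_f, in_f)
wg = w.reshape(out_f, in_f // gs, gs)

# Asymmetric per-group quantization (scale + zero point), GPTQ-style.
wmin, wmax = wg.amin(-1, keepdim=True), wg.amax(-1, keepdim=True)
scale = (wmax - wmin) / maxq
zero = torch.round(-wmin / scale)
q = torch.clamp(torch.round(wg / scale) + zero, 0, maxq)
w_dq = ((q - zero) * scale).reshape(out_f, in_f)  # what a correct kernel computes from

x = torch.randn(8, in_f)
ref = x @ w_dq.T  # a kernel's output should match this almost exactly
print("quantization error vs fp weight:", (x @ w.T - ref).abs().max().item())
```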

Qubitium (Collaborator) commented

For now, main has disabled Exllama V2 kernel selection for group_size == 16 until we can get to the bottom of this.
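Until a fix lands, users on older releases can sidestep the kernel by pinning a different backend explicitly. A sketch assuming GPTQModel's `backend=` load argument and `BACKEND` enum; the model path is a placeholder:

```python
from gptqmodel import GPTQModel, BACKEND

# Workaround sketch: for group_size == 16 checkpoints, explicitly pick a
# kernel other than Exllama V2 (BACKEND members assumed from the public API).
model = GPTQModel.load("path/to/gptq-model-group_size-16", backend=BACKEND.TRITON)
```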
