Hi @shahizat, as the error message suggests, under 3-bit quantization the 409 groups cannot be divided evenly between 2 GPUs, so this configuration is not supported.
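To make the arithmetic behind the error concrete, here is a minimal Python sketch of the divisibility check. It is not MLC-LLM's actual code: the helper name `groups_split_evenly` is hypothetical, the group count 409 is taken directly from the error message rather than recomputed, and the group size of 32 in the last line is only an illustrative smaller value.

```python
def groups_split_evenly(num_groups: int, num_gpus: int) -> bool:
    """Return True if every GPU would receive the same number of groups.

    Hypothetical helper for illustration, not part of MLC-LLM.
    """
    return num_groups % num_gpus == 0


# Values copied from the error message in this issue.
linear_dim = 16384
group_size = 40
reported_groups = 409

# 409 is odd, so it cannot be halved across 2 GPUs.
print(groups_split_evenly(reported_groups, 2))

# 16384 is also not a multiple of 40, so the groups do not tile the dimension exactly.
print(linear_dim % group_size)

# Assumption: with an illustrative group size of 32, the same dimension
# gives 512 groups, which do split evenly across 2 GPUs.
print(groups_split_evenly(linear_dim // 32, 2))
```

Since 409 is prime, no GPU count other than 1 (or 409) divides it, which is in line with the two suggestions in the error message: use fewer GPUs or a quantization with a smaller group size.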
Greetings to all,
🐛 Bug
To Reproduce
Steps to reproduce the behavior:
Output error: ValueError: The linear dimension 16384 has 409 groups under group size 40. The groups cannot be evenly distributed on 2 GPUs.
Possible solutions: reduce the number of GPUs, or use quantization with a smaller group size.
Is it possible to run a 3-bit version of the MLC-LLM model using multiple GPUs?
Thanks in advance!