I've been exploring the OmniQuant repository and am impressed with the quantization techniques it provides for Large Language Models (LLMs). I noticed that the pre-trained models on Hugging Face are distributed in the .pth and .bin file formats.

I was wondering why these models are not also available in the GGUF format, which is widely used by llama.cpp and similar runtimes for running large models efficiently on local hardware. Is there a specific reason for this choice of file formats? Am I missing something here?

I'm sure there's a reason for it; I'm probably just missing something.
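In case it helps frame the question, below is a minimal sketch of the first step anyone attempting such a conversion would take: inspecting what the checkpoint actually contains. It assumes the checkpoint is a standard pickled PyTorch state dict; the file name `omniquant_model.pth` is a placeholder, not a file shipped by the repository.

```python
# A minimal sketch, assuming the checkpoint is a standard pickled PyTorch
# state dict. "omniquant_model.pth" is a placeholder file name, not an
# actual file from the repository.
import torch

# .pth and .bin checkpoints are both saved with torch.save, so torch.load
# handles either extension; map_location avoids needing a GPU.
state_dict = torch.load("omniquant_model.pth", map_location="cpu")

# List the stored tensors: their names, shapes, and dtypes are exactly
# what a format converter (e.g. a GGUF writer) would have to remap into
# its own naming and layout scheme.
for name, value in state_dict.items():
    if torch.is_tensor(value):
        print(name, tuple(value.shape), value.dtype)
```

My guess is that OmniQuant's low-bit weights may be packed in custom layouts that don't map one-to-one onto GGUF's quantization types, but I'd appreciate confirmation from the maintainers.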