[Feature] Log/info/Save/Restore quantization steps #1410

Open
mratsim opened this issue May 4, 2025 · 0 comments
Labels
enhancement New feature or request

mratsim commented May 4, 2025

Inspired by trying to push through #1409 and by GPTQ experiments.

It's incredibly frustrating to fix one step of a quantization workflow only to have a new failure force you to redo hours of quantization.

Some stories:

  1. AWQ quantization is done on CPU, then on GPU. I tried to work around [AWQ] Insane memory requirement: over 900GB for 32B model #1409 by increasing swap space, but:
    1. When the process is killed by the Linux OOM killer, the whole terminal is killed with it, so there is no trace of which sample #ID led to the OOM. If the run was left unattended, I have to guess because no log is available anymore.
    2. After the CPU OOM is fixed, I hit a GPU OOM and have to redo 40 min of CPU computation (on an overclocked Ryzen 9 9950X). It would be much more user-friendly to do a GPU test run at the very beginning, ensuring ahead of time that all hardware requirements are met instead of surprising people after they have committed time (see the first sketch after this list).
  2. GPTQ quantization can fail due to numerical instability, e.g. when no solution is found for the Hessian.
    1. Those failures should definitely be logged to disk.
    2. It would save a lot of time to be able to restart GPTQ from the layers that already converged, and to use more samples, longer sequence lengths, or a different dampening fraction for the follow-up layers that failed (see the second sketch after this list).
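
For the first story, here is a minimal sketch of what per-sample progress logging plus an upfront GPU memory check could look like. It assumes a PyTorch-based pipeline; `calibration_samples`, `quantize_sample`, and the byte estimate are hypothetical placeholders, not existing API:

```python
import logging
import torch

logging.basicConfig(
    filename="quantization_progress.log",
    level=logging.INFO,
    format="%(asctime)s %(message)s",
)
log = logging.getLogger(__name__)


def preflight_gpu_check(bytes_needed: int, device: str = "cuda:0") -> None:
    """Fail fast if the GPU cannot fit the model, before hours of CPU work."""
    free, _total = torch.cuda.mem_get_info(torch.device(device))
    if bytes_needed > free:
        raise RuntimeError(
            f"{device} has {free / 2**30:.1f} GiB free, "
            f"but ~{bytes_needed / 2**30:.1f} GiB are needed."
        )


def run_calibration(calibration_samples, quantize_sample):
    for idx, sample in enumerate(calibration_samples):
        # Write the sample id to disk *before* processing it, so an OOM kill
        # still leaves a trace of which sample was in flight.
        log.info("processing calibration sample %d", idx)
        for handler in logging.getLogger().handlers:
            handler.flush()
        quantize_sample(sample)
```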
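
For the second story, a minimal sketch of per-layer save/restore (again plain PyTorch; `quantize_layer`, `damp_percent`, and the checkpoint layout are hypothetical, not the library's actual internals):

```python
import os
import torch

CHECKPOINT_DIR = "gptq_layer_checkpoints"


def quantize_with_resume(layers, quantize_layer, damp_percent=0.01):
    """Quantize layer by layer, persisting each converged layer to disk.

    On a later run, already-checkpointed layers are restored instead of
    recomputed, and a layer whose Hessian could not be inverted is retried
    with a larger dampening fraction.
    """
    os.makedirs(CHECKPOINT_DIR, exist_ok=True)
    for name, layer in layers.items():
        ckpt = os.path.join(CHECKPOINT_DIR, f"{name}.pt")
        if os.path.exists(ckpt):
            # This layer already converged in a previous run: restore and skip.
            layer.load_state_dict(torch.load(ckpt))
            continue
        try:
            quantize_layer(layer, damp_percent=damp_percent)
        except torch.linalg.LinAlgError:
            # Hessian inversion failed: retry this layer with more dampening
            # instead of aborting the whole run.
            quantize_layer(layer, damp_percent=damp_percent * 10)
        torch.save(layer.state_dict(), ckpt)
```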
@mratsim mratsim added the enhancement New feature or request label May 4, 2025