Is your feature request related to a problem? Please describe.
Currently, QuaRot, QServe (QoQ), and SpinQuant each implement Hadamard rotation in a slightly different way. fms-mo needs a concise way to incorporate rotation into the framework, so that the technique can be extended to generic models without hacking the Hugging Face model definitions.
Describe the solution you'd like
Introduce new RotQuantizers that work with the existing QLinear and QBmm, which should greatly simplify the rotation implementation.
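To make the idea concrete, here is a rough, dependency-free sketch of what a rotating quantizer could look like. It is not fms-mo code: the `RotQuantizer` class, its `n_bits` and `rotate` arguments, and the `qlinear` helper are all hypothetical, and plain Python lists stand in for tensors. The sketch relies on the fact that a normalized Hadamard matrix H is orthogonal, so rotating both the activation and each weight row leaves their dot product unchanged while spreading outliers across channels before quantization.

```python
import math


def fwht(vec):
    """Normalized fast Walsh-Hadamard transform (length must be a power of two)."""
    n = len(vec)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(vec)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)  # makes the transform orthogonal
    return [v * scale for v in out]


def fake_quant(vec, n_bits=4):
    """Symmetric per-vector fake quantization (quantize, then dequantize)."""
    qmax = 2 ** (n_bits - 1) - 1
    amax = max(abs(v) for v in vec)
    if amax == 0.0:
        return list(vec)
    step = amax / qmax
    return [max(-qmax, min(qmax, round(v / step))) * step for v in vec]


class RotQuantizer:
    """Hypothetical quantizer: rotate along the last dimension, then quantize."""

    def __init__(self, n_bits=4, rotate=True):
        self.n_bits = n_bits
        self.rotate = rotate

    def __call__(self, vec):
        rotated = fwht(vec) if self.rotate else list(vec)
        return fake_quant(rotated, self.n_bits)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def qlinear(x, weight_rows, act_quantizer, weight_quantizer):
    """Stand-in for a quantized linear layer: y_j = <Q(x H), Q(w_j H)>."""
    xq = act_quantizer(x)
    return [dot(xq, weight_quantizer(w)) for w in weight_rows]


if __name__ == "__main__":
    x = [8.0, 0.0, 0.0, 0.0]      # a single large outlier channel
    print(fwht(x))                # the rotation spreads it evenly across channels
    w = [[0.1, 0.2, -0.3, 0.4], [1.0, 0.0, -1.0, 0.5]]
    print(qlinear(x, w, RotQuantizer(8), RotQuantizer(8)))
```

The point of this shape is that the rotation lives entirely inside the quantizer, so the layer only needs to be handed a different quantizer object and the model definition stays untouched. Whether the real QLinear and QBmm can accept such a drop-in is an assumption of this sketch.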
Describe alternatives you've considered
It might be possible to reuse the existing code from QuaRot, QServe, and SpinQuant, but those implementations are more complicated than necessary.
Additional context
This work should cover MagR as well.