[mlir-gen] Introduce basic support for quantization ops (2/n). #1089
This patch extends mixed-precision support and adds basic infrastructure for defining and creating quantization kernels, to be refined as we get more clarity on the kinds of primitive operations required. Adding a runtime test is out of scope for this patch.
General question, shouldn't we migrate to lighthouse's gen?
That's indeed a good question. Might be a good example to dogfood how we facilitate downstream projects interacting with lighthouse.
For now, I'd duplicate, because we don't know what we can build with this work. Once we're happy with the result, we should bring this up to the lighthouse and help talk about quantization upstream.
General structure seems fine
I assume that the sequence of quant ops has been tested and is correct
I'd like to see at least two back-and-forth execution tests:
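For illustration only, one way to read "back-and-forth" is a quantize-then-dequantize round trip whose error is bounded by half a quantization step. The sketch below assumes symmetric per-tensor int8 quantization in plain Python; the function names, the scale, and the input values are hypothetical and do not come from this patch or from mlir-gen.

```python
def quantize(values, scale, zero_point=0, qmin=-128, qmax=127):
    """Map floats to integers: q = clamp(round(x / scale) + zero_point)."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point=0):
    """Map integers back to floats: x = (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qvalues]


def round_trip_error(values, scale, zero_point=0):
    """Largest absolute difference after quantize -> dequantize."""
    restored = dequantize(quantize(values, scale, zero_point), scale, zero_point)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical inputs that all fit in the representable range [-1, 1].
inputs = [-1.0, -0.25, 0.0, 0.3, 0.99]
scale = 1.0 / 127

# For in-range values, rounding loses at most half a quantization step.
assert round_trip_error(inputs, scale) <= scale / 2 + 1e-12
```

An execution test for generated kernels could compare the kernel output against a reference like this one, within the same half-step tolerance.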
Test failures are due to NFS migration and will be fixed in time.
…untime test.
- Adds '-print-input' flag to print input arguments for visual inspection.
- Refactored and updated the corresponding APIs.