MIGraphX EP Add FP4 support #176
base: rocm7.1_internal_testing
@CharlieL7 adding you on this for visibility. It looks like I need to define fp4 tensor types in OnnxRT, but let me know whether the MIGraphX type side is valid. I'm currently using int8 as the input type.
5746ba9 to be835ef
62db153 to be835ef
onnx/onnx#6318 and onnx/onnx#6283 added FP4 support to ONNX. This change introduces the FP4 type in ORT and adds type support to one relevant operator (`Cast`) as a proof-of-concept for the type integration into ORT. More op support will be added on a need-basis.

This change took inspiration from the following PRs: microsoft#14731 microsoft#22228 microsoft#20362

Some notes:
1) Only `tensor` type gets support for FP4 initially. Secondary types like `seq(tensor)`, `sparse_tensor`, `optional` do not get support (so as to not introduce unnecessary bloat to the framework without a solid use-case)
2) Flatbuffer related files receive no updates in this PR

Be able to run FP4 models with ORT
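To make the `Cast` proof-of-concept concrete, here is a minimal sketch of what converting between float and a 4-bit E2M1 value could look like. It assumes the FP4 type is the E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit), that out-of-range inputs saturate, and that ties round to the code with an even mantissa bit. The names `decode_fp4` and `encode_fp4` are hypothetical and are not ORT or MIGraphX APIs; NaN inputs are not handled.

```python
import math

# Magnitudes representable by E2M1, indexed by the low 3 bits of the code
# (2 exponent bits followed by 1 mantissa bit). Assumed layout, see lead-in.
E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]


def decode_fp4(code: int) -> float:
    """Convert a 4-bit code (0..15) to its float value."""
    sign = -1.0 if code & 0x8 else 1.0
    return sign * E2M1_MAGNITUDES[code & 0x7]


def encode_fp4(value: float) -> int:
    """Convert a float to the nearest 4-bit code, saturating at +/-6.0."""
    sign_bit = 0x8 if math.copysign(1.0, value) < 0 else 0x0
    magnitude = min(abs(value), E2M1_MAGNITUDES[-1])  # saturate large inputs
    # Pick the closest magnitude; on a tie prefer the even index,
    # i.e. the code whose mantissa bit is 0.
    index = min(
        range(len(E2M1_MAGNITUDES)),
        key=lambda i: (abs(E2M1_MAGNITUDES[i] - magnitude), i & 1),
    )
    return sign_bit | index


if __name__ == "__main__":
    for x in (0.3, 2.4, -100.0):
        code = encode_fp4(x)
        print(f"{x} -> code {code:#06b} -> {decode_fp4(code)}")
```

With only 16 codes available, every float input collapses onto one of eight magnitudes plus a sign, which is why a lookup table is enough for a sketch like this.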
…enabled in ORT (microsoft#25940)

### Description
As title

### Motivation and Context
Follow-up fixes to microsoft#25767
I branched off the upstream v1.23.0 tag for ROCm 7.1, but that tag doesn't contain the fp4 type changes. I'm cherry-picking those changes from mainline before adding the MIGraphX-side changes, since we require the tensor types.
Do this so that MIGraphX can take in fp4 types from input/output tensors and then use that to perform an inference via the MIGraphX API.
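Since FP4 values are smaller than a byte, passing them through input/output tensors implies some packed storage. The sketch below shows one possible layout, assuming two 4-bit codes per byte with the first element in the low nibble. The layout and the helper names `pack_fp4` and `unpack_fp4` are assumptions for illustration; the source does not show how ORT or MIGraphX actually store these buffers.

```python
def pack_fp4(codes):
    """Pack a sequence of 4-bit codes into bytes, two codes per byte.

    Assumed layout: even-indexed elements go in the low nibble,
    odd-indexed elements in the high nibble. An odd-length input
    leaves the final high nibble as zero padding.
    """
    packed = bytearray((len(codes) + 1) // 2)
    for i, code in enumerate(codes):
        shift = 4 * (i % 2)
        packed[i // 2] |= (code & 0xF) << shift
    return bytes(packed)


def unpack_fp4(packed, count):
    """Recover `count` 4-bit codes from a packed byte buffer."""
    return [(packed[i // 2] >> (4 * (i % 2))) & 0xF for i in range(count)]


if __name__ == "__main__":
    codes = [0x1, 0x2, 0xF]
    buf = pack_fp4(codes)
    print(buf.hex(), unpack_fp4(buf, len(codes)))
```

The element count has to travel alongside the buffer, because an odd-length tensor and its zero-padded even-length neighbor produce the same bytes.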
EP changes: 1a87ee2. The other two commits are cherry-picks from OnnxRT mainline.
Description
Enable the fp4 datatype, as described by the spec here, in the MIGraphX EP.
Motivation and Context
We need to flesh out the tensor types and spec for fp4 in onnxruntime itself (not just the EP), since that is what allows the MIGraphX EP to accept fp4 tensor types.