Open
Description
Problem Description
Issue with normal values
We have an FP32 tensor defined as follows:
torch.tensor([
0.3463, -0.0499, 0.3794, -2.3764, 0.3960, 0.5989, -0.6250, -0.4239,
-0.6192, 0.2761, -0.3746, 0.1462, 0.5216, -0.5194, -0.8497, -0.8237,
0.5145, 0.3508, 0.0298, -0.3462, 0.3917, 0.1096, 1.4494, 1.7009,
-0.5473, -2.0851, 1.8099, 0.3587, 1.4622, -2.3143, 0.3412, -0.8931
])
With a UE8M0 scale of 126 (a scale factor of 2^(126-127) = 0.5), the packed FP4 output generated by _mxfp4_quant_op is:
[129, 226, 34, 171, 26, 25, 162, 187, 18, 144, 2, 85, 234, 22, 229, 193]
However, torchao produces a different result, which matches the MI355 instruction v_cvt_scalef32_pk_fp4_f32:
[129, 226, 34, 170, 26, 25, 162, 187, 18, 144, 2, 85, 234, 22, 229, 193]
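The two outputs differ only in the fourth byte (171 = 0xAB vs. 170 = 0xAA). Specifically, -0.6250 / 0.5 = -1.25 lies exactly halfway between the FP4 values -1 and -1.5. torchao and the hardware instruction round it to -1, which is consistent with round-to-nearest-even, whereas _mxfp4_quant_op rounds it to -1.5.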
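The tie case can be checked with a small pure-Python reference. This is only a sketch, not the aiter or torchao code. It assumes the FP4 format is E2M1, that ties round to the even mantissa, that out-of-range magnitudes saturate to 6, and that the first element of each pair goes into the low nibble (inferred from the bytes listed above). The helper names `fp4_code_rne` and `quantize_pack` are hypothetical, and the inputs are Python doubles rather than FP32, which does not affect any tie in this example.

```python
import math

# Magnitudes representable in FP4 E2M1, indexed by the 3-bit code (exp, mantissa).
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def fp4_code_rne(x: float) -> int:
    """Return the 4-bit E2M1 code nearest to x, breaking ties toward an even mantissa."""
    sign = 0x8 if math.copysign(1.0, x) < 0 else 0x0
    mag = min(abs(x), 6.0)  # assumption: saturate instead of overflowing
    best = 0
    for code in range(1, 8):
        d_best = abs(mag - FP4_E2M1_MAGNITUDES[best])
        d_new = abs(mag - FP4_E2M1_MAGNITUDES[code])
        # On an exact tie, prefer the code whose mantissa bit (the LSB) is 0.
        if d_new < d_best or (d_new == d_best and code % 2 == 0):
            best = code
    return sign | best


def quantize_pack(values, ue8m0_scale: int):
    """Scale by 2^(ue8m0_scale - 127), quantize to FP4, and pack two codes per byte."""
    scale = 2.0 ** (ue8m0_scale - 127)
    codes = [fp4_code_rne(v / scale) for v in values]
    # Assumption: even-indexed element in the low nibble, odd-indexed in the high nibble.
    return [codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2)]


X = [
    0.3463, -0.0499, 0.3794, -2.3764, 0.3960, 0.5989, -0.6250, -0.4239,
    -0.6192, 0.2761, -0.3746, 0.1462, 0.5216, -0.5194, -0.8497, -0.8237,
    0.5145, 0.3508, 0.0298, -0.3462, 0.3917, 0.1096, 1.4494, 1.7009,
    -0.5473, -2.0851, 1.8099, 0.3587, 1.4622, -2.3143, 0.3412, -0.8931,
]

packed = quantize_pack(X, 126)
print(packed)
# [129, 226, 34, 170, 26, 25, 162, 187, 18, 144, 2, 85, 234, 22, 229, 193]
```

Under these assumptions the sketch reproduces the torchao / v_cvt_scalef32_pk_fp4_f32 bytes, including 170 in the fourth position.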
Issue with denormal values
Denormal values cannot be rounded up in the current aiter implementation.
For example:
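FP32 inputs: 0x3F000000 (0.5) and 0x3F000003 (slightly above 0.5)
UE8M0 scale: 128 (a scale factor of 2)
After scaling, the first input is exactly 0.25, halfway between 0 and the smallest FP4 denormal 0.5, and the second is slightly above 0.25.
torchao and v_cvt_scalef32_pk_fp4_f32 round 0x3F000000 to 0 and 0x3F000003 to 0.5, while the current aiter implementation rounds both to 0.5.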
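The denormal case can be reproduced from the raw bit patterns with the standard library alone. As before, this is a hedged sketch rather than the aiter or torchao implementation: it assumes E2M1 with ties resolved toward the even mantissa, and `f32_from_bits` and `fp4_value_rne` are hypothetical helper names.

```python
import struct

# Non-negative magnitudes representable in FP4 E2M1, indexed by 3-bit code.
FP4_E2M1_MAGNITUDES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)


def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fp4_value_rne(x: float) -> float:
    """Nearest E2M1 magnitude for a non-negative x, ties toward the even code."""
    best = 0
    for code in range(1, 8):
        d_best = abs(x - FP4_E2M1_MAGNITUDES[best])
        d_new = abs(x - FP4_E2M1_MAGNITUDES[code])
        if d_new < d_best or (d_new == d_best and code % 2 == 0):
            best = code
    return FP4_E2M1_MAGNITUDES[best]


scale = 2.0 ** (128 - 127)  # UE8M0 value 128 decodes to a scale factor of 2

results = {}
for bits in (0x3F000000, 0x3F000003):
    scaled = f32_from_bits(bits) / scale
    results[bits] = fp4_value_rne(scaled)
    print(f"{bits:#010x}: quantized to {results[bits]}")

# 0x3F000000 scales to exactly 0.25 (a tie) and quantizes to 0.0;
# 0x3F000003 scales to just above 0.25 and quantizes to 0.5.
```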
Operating System
Ubuntu 22.04
CPU
AMD EPYC 73F3 16-Core Processor
GPU
AMD Instinct MI355X
ROCm Version
ROCm 7.0
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
Additional Information
Possible fix: