
Commit 3006cf8

Merge pull request #367 from Xilinx/jrickert.qdq_int4

Bump Quantize/DequantizeLinear to opset 21, adding int4/uint4 support

2 parents a43efb0 + 823d11b

12 files changed: +261 -139 lines changed

docs/Dialects/onnx.md

Lines changed: 44 additions & 22 deletions

@@ -2180,15 +2180,18 @@ Effects: `MemoryEffects::Effect{}`
 
 _ONNX DequantizeLinear operation_
 
-The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor.
-The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` must have same shape, and can be either a scalar
-for per-tensor / per layer quantization, or a 1-D tensor for per-axis quantization.
-`x_zero_point` and `x` must have same type. `x` and `y` must have same shape. In the case of dequantizing int32,
-there's no zero point (zero point is supposed to be 0).
-`zero-point` is usually not used in the case of float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz quantization,
-but the dequantization formula remains the same for consistency and 'x_scale' still determines the output type.
+The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the
+full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point`
+must have the same shape, determining the quantization's granularity: a scalar for per-tensor/per-layer quantization,
+a 1-D tensor for per-axis quantization, or a tensor whose rank matches the input's for blocked quantization.
+See QuantizeLinear for details on quantization granularity.
 
-Traits: `AlwaysSpeculatableImplTrait`, `OpVersionTrait<19>`
+`x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. In the case of dequantizing
+`int32`, there's no zero point (the zero point is assumed to be 0).
+`x_zero_point` is usually not used in the case of quantization to float8 types, but the dequantization formula remains
+the same for consistency, and `x_scale` still determines the output type.
+
+Traits: `AlwaysSpeculatableImplTrait`, `OpVersionTrait<21>`
 
 Interfaces: `ConditionallySpeculatable`, `NoMemoryEffect (MemoryEffectOpInterface)`, `ShapeHelperOpInterface`, `ShapeInferenceOpInterface`
 
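The dequantization formula `y = (x - x_zero_point) * x_scale` with per-axis granularity can be sketched in NumPy as follows; the function name and the broadcast-reshaping helper are illustrative, not part of the onnx-mlir API.

```python
import numpy as np

# Sketch of `y = (x - x_zero_point) * x_scale` with per-axis granularity.
def dequantize_linear(x, x_scale, x_zero_point, axis=1):
    # Reshape the 1-D scale/zero point so they broadcast along `axis`.
    bshape = [1] * x.ndim
    bshape[axis] = -1
    scale = x_scale.reshape(bshape)
    zp = x_zero_point.astype(np.float32).reshape(bshape)
    # Cast to float before subtracting to avoid unsigned wraparound.
    return (x.astype(np.float32) - zp) * scale

x = np.array([[130, 2], [4, 132]], dtype=np.uint8)
y = dequantize_linear(x,
                      x_scale=np.array([0.5, 2.0], dtype=np.float32),
                      x_zero_point=np.array([128, 0], dtype=np.uint8))
# Column 0 uses scale 0.5 / zero point 128; column 1 uses scale 2.0 / zero point 0.
```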
@@ -2199,15 +2202,16 @@ Effects: `MemoryEffects::Effect{}`
 <table>
 <tr><th>Attribute</th><th>MLIR Type</th><th>Description</th></tr>
 <tr><td><code>axis</code></td><td>::mlir::IntegerAttr</td><td>64-bit signed integer attribute</td></tr>
+<tr><td><code>block_size</code></td><td>::mlir::IntegerAttr</td><td>64-bit signed integer attribute</td></tr>
 </table>
 
 #### Operands:
 
 | Operand | Description |
 | :-----: | ----------- |
-| `x` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit unsigned integer values or tensor of 32-bit signless integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values
+| `x` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit signless integer values or tensor of 16-bit unsigned integer values or tensor of 32-bit signless integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values or tensor of 4-bit unsigned integer values or tensor of 4-bit signless integer values
 | `x_scale` | tensor of 32-bit float values or tensor of 16-bit float values or tensor of bfloat16 type values
-| `x_zero_point` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit unsigned integer values or tensor of 32-bit signless integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values or none type
+| `x_zero_point` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit signless integer values or tensor of 16-bit unsigned integer values or tensor of 32-bit signless integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values or tensor of 4-bit unsigned integer values or tensor of 4-bit signless integer values or none type
 
 #### Results:
 
@@ -6810,17 +6814,33 @@ Effects: `MemoryEffects::Effect{}`
 
 _ONNX QuantizeLinear operation_
 
-The linear quantization operator. It consumes a high precision tensor, a scale, and a zero point to compute the low precision / quantized tensor.
-The scale factor and zero point must have same shape, and can be either a scalar for per-tensor / per layer quantization, or a 1-D tensor for per-axis quantization.
-The quantization formula is `y = saturate ((x / y_scale) + y_zero_point)`.
-For saturation, it saturates to [0, 255] if it's uint8, or [-128, 127] if it's int8.
-For (x / y_scale), it's rounding to the nearest even. Refer to https://en.wikipedia.org/wiki/Rounding for details.
-'y_zero_point' and 'y' must have same type.
-'y_zero_point' is usually not used for quantization to float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz,
-but the quantization formula remains the same for consistency and
-the type of the attribute 'y_zero_point' still determines the quantization type.
+The linear quantization operator consumes a high-precision tensor, a scale, and a zero point to compute the
+low-precision/quantized tensor. The scale factor and zero point must have the same shape, determining the quantization
+granularity. The quantization formula is `y = saturate((x / y_scale) + y_zero_point)`.
 
-Traits: `AlwaysSpeculatableImplTrait`, `OpVersionTrait<19>`
+Saturation is done according to:
+- uint16: [0, 65535]
+- int16: [-32768, 32767]
+- uint8: [0, 255]
+- int8: [-128, 127]
+- uint4: [0, 15]
+- int4: [-8, 7]
+
+For `(x / y_scale)`, the result is rounded to the nearest even value. Refer to https://en.wikipedia.org/wiki/Rounding for details.
+
+`y_zero_point` and `y` must have the same type. `y_zero_point` is usually not used for quantization to float8 types,
+but the quantization formula remains the same for consistency, and the type of `y_zero_point` still determines the
+quantization type.
+
+There are three supported quantization granularities, determined by the shape of `y_scale`.
+In all cases, `y_zero_point` must have the same shape as `y_scale`.
+- Per-tensor (per-layer) quantization: `y_scale` is a scalar.
+- Per-axis quantization: the scale is a 1-D tensor whose length equals the size of the quantization axis. For an input
+  of shape `(D0, ..., Di, ..., Dn)` and `axis=i`, `y_scale` is a 1-D tensor of length `Di`.
+- Blocked quantization: the scale's shape is identical to the input's shape, except for one dimension, in which
+  blocking is performed. Given `x` of shape `(D0, ..., Di, ..., Dn)`, `axis=i`, and block size `B`, `y_scale` has shape
+  `(D0, ..., ceil(Di/B), ..., Dn)`.
+
+Traits: `AlwaysSpeculatableImplTrait`, `OpVersionTrait<21>`
 
 Interfaces: `ConditionallySpeculatable`, `NoMemoryEffect (MemoryEffectOpInterface)`, `ShapeHelperOpInterface`, `ShapeInferenceOpInterface`
 
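The quantization formula, saturation, and rounding rule above can be sketched in NumPy; `np.rint` rounds halves to the nearest even value, matching the spec's rounding. The function name and explicit bounds parameters are illustrative, not part of any library API.

```python
import numpy as np

# Sketch of `y = saturate(round(x / y_scale) + y_zero_point)`.
def quantize_linear(x, y_scale, y_zero_point, lo, hi):
    # np.rint implements round-half-to-even, per the operator description.
    y = np.rint(x / y_scale) + y_zero_point
    return np.clip(y, lo, hi)

x = np.array([0.5, 1.5, 2.5, -300.0], dtype=np.float32)
y = quantize_linear(x, y_scale=1.0, y_zero_point=0, lo=-128, hi=127)  # int8 bounds
# 0.5 and 2.5 both round to an even neighbor (0 and 2); -300 saturates to -128.
```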
@@ -6831,6 +6851,8 @@ Effects: `MemoryEffects::Effect{}`
 <table>
 <tr><th>Attribute</th><th>MLIR Type</th><th>Description</th></tr>
 <tr><td><code>axis</code></td><td>::mlir::IntegerAttr</td><td>64-bit signed integer attribute</td></tr>
+<tr><td><code>block_size</code></td><td>::mlir::IntegerAttr</td><td>64-bit signed integer attribute</td></tr>
+<tr><td><code>output_dtype</code></td><td>::mlir::IntegerAttr</td><td>64-bit signed integer attribute</td></tr>
 <tr><td><code>saturate</code></td><td>::mlir::IntegerAttr</td><td>64-bit signed integer attribute</td></tr>
 </table>
 
@@ -6840,13 +6862,13 @@ Effects: `MemoryEffects::Effect{}`
 | :-----: | ----------- |
 | `x` | tensor of 32-bit float values or tensor of 16-bit float values or tensor of bfloat16 type values or tensor of 32-bit signless integer values
 | `y_scale` | tensor of 32-bit float values or tensor of 16-bit float values or tensor of bfloat16 type values or tensor of 32-bit signless integer values
-| `y_zero_point` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit unsigned integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values or none type
+| `y_zero_point` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit signless integer values or tensor of 16-bit unsigned integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values or tensor of 4-bit unsigned integer values or tensor of 4-bit signless integer values or none type
 
 #### Results:
 
 | Result | Description |
 | :----: | ----------- |
-| `y` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit unsigned integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values
+| `y` | tensor of 8-bit signless integer values or tensor of 8-bit unsigned integer values or tensor of 16-bit signless integer values or tensor of 16-bit unsigned integer values or tensor of f8E4M3FN type values or tensor of f8E4M3FNUZ type values or tensor of f8E5M2 type values or tensor of f8E5M2FNUZ type values or tensor of 4-bit unsigned integer values or tensor of 4-bit signless integer values
 
 ### `onnx.RMSLayerNormalization` (ONNXRMSLayerNormalizationOp)
 
src/Builder/OpBuildTable.inc

Lines changed: 2 additions & 2 deletions

@@ -50,7 +50,7 @@ op_dialect_version_map_["Col2Im"] = {18};
 op_dialect_version_map_["CumSum"] = {14};
 op_dialect_version_map_["DeformConv"] = {22};
 op_dialect_version_map_["DepthToSpace"] = {13};
-op_dialect_version_map_["DequantizeLinear"] = {19};
+op_dialect_version_map_["DequantizeLinear"] = {21};
 op_dialect_version_map_["Det"] = {22};
 op_dialect_version_map_["DFT"] = {20, 17};
 op_dialect_version_map_["DictVectorizer"] = {1};
@@ -138,7 +138,7 @@ op_dialect_version_map_["Pad"] = {21, 18, 13, 11, 2};
 op_dialect_version_map_["Pow"] = {15};
 op_dialect_version_map_["QLinearConv"] = {10};
 op_dialect_version_map_["QLinearMatMul"] = {10};
-op_dialect_version_map_["QuantizeLinear"] = {19};
+op_dialect_version_map_["QuantizeLinear"] = {21};
 op_dialect_version_map_["RNN"] = {22};
 op_dialect_version_map_["RandomNormal"] = {22};
 op_dialect_version_map_["RandomNormalLike"] = {22};

src/Conversion/ONNXToTOSA/NN/DequantizeLinear.cpp

Lines changed: 7 additions & 3 deletions

@@ -54,13 +54,17 @@ class ONNXDequantizeLinearOpLoweringToTOSA
       return rewriter.notifyMatchFailure(
           loc, "expected zero point to be none or have tensor type");
     }
-
-    if (auto scaleTy = cast<ShapedType>(adaptor.getXScale().getType());
-        !scaleTy.hasStaticShape()) {
+    const auto scaleTy = cast<ShapedType>(adaptor.getXScale().getType());
+    if (!scaleTy.hasStaticShape()) {
       return rewriter.notifyMatchFailure(
           loc, "expected scale to have static shape");
     }
 
+    if (scaleTy.getRank() > 1) {
+      return rewriter.notifyMatchFailure(
+          loc, "block quantization is not yet supported");
+    }
+
     int64_t axis = op.getAxis();
     // See https://github.com/onnx/onnx/issues/6067
     if (axis == 1 && (resultType.getRank() == 1 || resultType.getRank() == 0))

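The new guard rejects scales of rank greater than one, which per the operator description can only arise with blocked quantization. A rough sketch of the rank-based granularity classification this relies on, with a hypothetical function name (the C++ code itself only checks `getRank() > 1`):

```python
# Granularity is determined by the scale tensor's rank relative to the input:
# rank 0 -> per-tensor, rank 1 -> per-axis, rank == input rank -> blocked.
def granularity(scale_rank, input_rank):
    if scale_rank == 0:
        return "per-tensor"
    if scale_rank == 1:
        return "per-axis"
    if scale_rank == input_rank:
        return "blocked"  # rejected by this TOSA lowering for now
    return "invalid"
```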
src/Conversion/ONNXToTOSA/NN/QuantizeLinear.cpp

Lines changed: 14 additions & 3 deletions

@@ -47,13 +47,24 @@ class ONNXQuantizeLinearOpLoweringToTOSA
       return rewriter.notifyMatchFailure(
           loc, "expected zero point to have static shape");
     }
-
-    if (auto zpTy = dyn_cast<ShapedType>(adaptor.getYScale().getType());
-        zpTy && !zpTy.hasStaticShape()) {
+    auto scaleTy = dyn_cast<ShapedType>(adaptor.getYScale().getType());
+    if (scaleTy && !scaleTy.hasStaticShape()) {
       return rewriter.notifyMatchFailure(
           loc, "expected scale to have static shape");
     }
 
+    if (scaleTy.getRank() > 1) {
+      return rewriter.notifyMatchFailure(
+          loc, "block quantization is not yet supported");
+    }
+
+    if (const auto outputDtype =
+            static_cast<onnx::TensorProto_DataType>(op.getOutputDtype());
+        outputDtype != onnx::TensorProto_DataType_UNDEFINED) {
+      return rewriter.notifyMatchFailure(
+          loc, "custom output dtype not yet supported");
+    }
+
     if (!op.getSaturate()) {
       return rewriter.notifyMatchFailure(loc, "Only saturate=1 is supported");
     }

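The saturation bounds listed in the updated QuantizeLinear description follow directly from each type's bit width, which is what makes the int4/uint4 extension in this commit mechanical. A small sketch with a hypothetical helper name:

```python
# Saturation bounds for an n-bit two's-complement or unsigned integer type,
# matching the ranges in the QuantizeLinear description.
def int_range(bits, signed):
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1
```

For example, `int_range(4, True)` yields the int4 bounds `(-8, 7)` that this opset-21 bump introduces.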
src/Dialect/ONNX/ONNXOps.td.inc

Lines changed: 46 additions & 26 deletions

@@ -1874,23 +1874,26 @@ def ONNXDepthToSpaceOp:ONNX_Op<"DepthToSpace",
 }
 
 def ONNXDequantizeLinearOp:ONNX_Op<"DequantizeLinear",
-  [Pure, OpVersionTrait<19>, DeclareOpInterfaceMethods<ShapeInferenceOpInterface>, DeclareOpInterfaceMethods<ShapeHelperOpInterface>]> {
+  [Pure, OpVersionTrait<21>, DeclareOpInterfaceMethods<ShapeInferenceOpInterface>, DeclareOpInterfaceMethods<ShapeHelperOpInterface>]> {
   let hasCanonicalizer = 1;
   let summary = "ONNX DequantizeLinear operation";
   let description = [{
-  The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the full precision tensor.
-  The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point` must have same shape, and can be either a scalar
-  for per-tensor / per layer quantization, or a 1-D tensor for per-axis quantization.
-  `x_zero_point` and `x` must have same type. `x` and `y` must have same shape. In the case of dequantizing int32,
-  there's no zero point (zero point is supposed to be 0).
-  `zero-point` is usually not used in the case of float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz quantization,
-  but the dequantization formula remains the same for consistency and 'x_scale' still determines the output type.
+  The linear dequantization operator. It consumes a quantized tensor, a scale, and a zero point to compute the
+  full-precision tensor. The dequantization formula is `y = (x - x_zero_point) * x_scale`. `x_scale` and `x_zero_point`
+  must have the same shape, determining the quantization's granularity: a scalar for per-tensor/per-layer quantization,
+  a 1-D tensor for per-axis quantization, or a tensor whose rank matches the input's for blocked quantization.
+  See QuantizeLinear for details on quantization granularity.
+
+  `x_zero_point` and `x` must have the same type. `x` and `y` must have the same shape. In the case of dequantizing
+  `int32`, there's no zero point (the zero point is assumed to be 0).
+  `x_zero_point` is usually not used in the case of quantization to float8 types, but the dequantization formula remains
+  the same for consistency, and `x_scale` still determines the output type.
   }];
-  // AMD: Manual addition of uint16
-  let arguments = (ins AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[UI16]>, TensorOf<[I32]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>]>:$x,
+  let arguments = (ins AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[I16]>, TensorOf<[UI16]>, TensorOf<[I32]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>, TensorOf<[UI<4>]>, TensorOf<[I<4>]>]>:$x,
     AnyTypeOf<[TensorOf<[F32]>, TensorOf<[F16]>, TensorOf<[BF16]>]>:$x_scale,
-    AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[UI16]>, TensorOf<[I32]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>, NoneType]>:$x_zero_point,
-    DefaultValuedAttr<SI64Attr, "1">:$axis);
+    AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[I16]>, TensorOf<[UI16]>, TensorOf<[I32]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>, TensorOf<[UI<4>]>, TensorOf<[I<4>]>, NoneType]>:$x_zero_point,
+    DefaultValuedAttr<SI64Attr, "1">:$axis,
+    DefaultValuedAttr<SI64Attr, "0">:$block_size);
   let results = (outs AnyTypeOf<[TensorOf<[F32]>, TensorOf<[F16]>, TensorOf<[BF16]>]>:$y);
   let extraClassDeclaration = [{
     static int getNumberOfOperands() {
@@ -6124,26 +6127,43 @@ def ONNXQLinearMatMulOp:ONNX_Op<"QLinearMatMul",
 }
 
 def ONNXQuantizeLinearOp:ONNX_Op<"QuantizeLinear",
-  [Pure, OpVersionTrait<19>, DeclareOpInterfaceMethods<ShapeInferenceOpInterface>, DeclareOpInterfaceMethods<ShapeHelperOpInterface>]> {
+  [Pure, OpVersionTrait<21>, DeclareOpInterfaceMethods<ShapeInferenceOpInterface>, DeclareOpInterfaceMethods<ShapeHelperOpInterface>]> {
   let summary = "ONNX QuantizeLinear operation";
   let description = [{
-  The linear quantization operator. It consumes a high precision tensor, a scale, and a zero point to compute the low precision / quantized tensor.
-  The scale factor and zero point must have same shape, and can be either a scalar for per-tensor / per layer quantization, or a 1-D tensor for per-axis quantization.
-  The quantization formula is `y = saturate ((x / y_scale) + y_zero_point)`.
-  For saturation, it saturates to [0, 255] if it's uint8, or [-128, 127] if it's int8.
-  For (x / y_scale), it's rounding to the nearest even. Refer to https://en.wikipedia.org/wiki/Rounding for details.
-  'y_zero_point' and 'y' must have same type.
-  'y_zero_point' is usually not used for quantization to float8e4m3fn, float8e4m3fnuz, float8e5m2, float8e5m2fnuz,
-  but the quantization formula remains the same for consistency and
-  the type of the attribute 'y_zero_point' still determines the quantization type.
-  }];
-  // AMD: Manual addition of uint16
+  The linear quantization operator consumes a high-precision tensor, a scale, and a zero point to compute the
+  low-precision/quantized tensor. The scale factor and zero point must have the same shape, determining the quantization
+  granularity. The quantization formula is `y = saturate((x / y_scale) + y_zero_point)`.
+
+  Saturation is done according to:
+  - uint16: [0, 65535]
+  - int16: [-32768, 32767]
+  - uint8: [0, 255]
+  - int8: [-128, 127]
+  - uint4: [0, 15]
+  - int4: [-8, 7]
+
+  For `(x / y_scale)`, the result is rounded to the nearest even value. Refer to https://en.wikipedia.org/wiki/Rounding for details.
+
+  `y_zero_point` and `y` must have the same type. `y_zero_point` is usually not used for quantization to float8 types,
+  but the quantization formula remains the same for consistency, and the type of `y_zero_point` still determines the
+  quantization type.
+
+  There are three supported quantization granularities, determined by the shape of `y_scale`.
+  In all cases, `y_zero_point` must have the same shape as `y_scale`.
+  - Per-tensor (per-layer) quantization: `y_scale` is a scalar.
+  - Per-axis quantization: the scale is a 1-D tensor whose length equals the size of the quantization axis. For an input
+    of shape `(D0, ..., Di, ..., Dn)` and `axis=i`, `y_scale` is a 1-D tensor of length `Di`.
+  - Blocked quantization: the scale's shape is identical to the input's shape, except for one dimension, in which
+    blocking is performed. Given `x` of shape `(D0, ..., Di, ..., Dn)`, `axis=i`, and block size `B`, `y_scale` has shape
+    `(D0, ..., ceil(Di/B), ..., Dn)`.
+  }];
   let arguments = (ins AnyTypeOf<[TensorOf<[F32]>, TensorOf<[F16]>, TensorOf<[BF16]>, TensorOf<[I32]>]>:$x,
     AnyTypeOf<[TensorOf<[F32]>, TensorOf<[F16]>, TensorOf<[BF16]>, TensorOf<[I32]>]>:$y_scale,
-    AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[UI16]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>, NoneType]>:$y_zero_point,
+    AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[I16]>, TensorOf<[UI16]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>, TensorOf<[UI<4>]>, TensorOf<[I<4>]>, NoneType]>:$y_zero_point,
     DefaultValuedAttr<SI64Attr, "1">:$axis,
+    DefaultValuedAttr<SI64Attr, "0">:$block_size,
+    DefaultValuedAttr<SI64Attr, "0">:$output_dtype,
     DefaultValuedAttr<SI64Attr, "1">:$saturate);
-  let results = (outs AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[UI16]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>]>:$y);
+  let results = (outs AnyTypeOf<[TensorOf<[I8]>, TensorOf<[UI8]>, TensorOf<[I16]>, TensorOf<[UI16]>, TensorOf<[F8E4M3FN]>, TensorOf<[F8E4M3FNUZ]>, TensorOf<[F8E5M2]>, TensorOf<[F8E5M2FNUZ]>, TensorOf<[UI<4>]>, TensorOf<[I<4>]>]>:$y);
   let extraClassDeclaration = [{
     static int getNumberOfOperands() {
       return 3;

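The blocked-quantization scale shape `(D0, ..., ceil(Di/B), ..., Dn)` described in the QuantizeLinear docs above can be computed as in this sketch (the function name is hypothetical; it only illustrates the shape rule the new `block_size` attribute implies):

```python
import math

# y_scale has x's shape with the `axis` dimension Di replaced by ceil(Di / B).
def blocked_scale_shape(x_shape, axis, block_size):
    shape = list(x_shape)
    shape[axis] = math.ceil(shape[axis] / block_size)
    return tuple(shape)

# x of shape (2, 10, 3) blocked along axis 1 with block size 4
# needs ceil(10/4) = 3 scales along that axis.
```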