Describe the bug
Running int4 quantization on an ONNX model with a Conformer architecture reports an error at the end of quantization, and the output ONNX file does not load.
Steps/Code to reproduce bug
Run quantization with the script below:
import os

from modelopt.onnx.quantization import quantize


def run_quantization():
    input_path = 'model.onnx'
    output_path = 'model_int4.onnx'
    # Create the output directory if it doesn't exist. dirname() returns ''
    # for a bare filename, and os.makedirs('') raises, so fall back to '.'.
    os.makedirs(os.path.dirname(output_path) or '.', exist_ok=True)
    quantize(
        input_path,
        quantize_mode='int4',
        use_external_data_format=True,
        output_path=output_path,
        verbose=True,
    )


if __name__ == "__main__":
    run_quantization()
Error
INFO:root:Quantized onnx model is saved as xxx
WARNING:root:ONNX model checker failed, check your deployment status.
WARNING:root:Unrecognized attribute: block_size for operator DequantizeLinear
==> Context: Bad node spec for node. Name: onnx::MatMul_4725_DequantizeLinear OpType: DequantizeLinear
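One way to narrow down the checker warning above: in the standard ONNX operator set, the `block_size` attribute on DequantizeLinear was introduced in opset 21, so a model that declares a lower default opset would be rejected by the checker for exactly this reason. Whether that is the cause here is an assumption. The sketch below reports the declared opset and the DequantizeLinear nodes that carry `block_size`. The helper names are hypothetical, and `model_int4.onnx` is the output path from the script above.

```python
def default_opset(model):
    """Return the opset version of the default ONNX domain, or None if absent."""
    for opset in model.opset_import:
        if opset.domain in ("", "ai.onnx"):
            return opset.version
    return None


def blockwise_dq_nodes(model):
    """Return names of DequantizeLinear nodes that carry a `block_size` attribute."""
    return [
        node.name
        for node in model.graph.node
        if node.op_type == "DequantizeLinear"
        and any(attr.name == "block_size" for attr in node.attribute)
    ]


def report(path):
    """Print the declared opset and the number of block-wise DequantizeLinear nodes."""
    import onnx  # imported lazily so the helpers above can be used without onnx

    # Only the graph structure is needed, so skip the external weight files.
    model = onnx.load(path, load_external_data=False)
    print("default opset:", default_opset(model))
    print("DequantizeLinear nodes with block_size:", len(blockwise_dq_nodes(model)))


# Example call (requires the onnx package and the quantized file):
# report("model_int4.onnx")
```

If the reported opset is below 21 and the node count is non-zero, the checker message is at least consistent with an opset mismatch rather than a corrupted file.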
Expected behavior
The output model should be a valid ONNX model and should load and run as expected.
System information
Container used (if applicable): ?
OS (e.g., Ubuntu 22.04, CentOS 7, Windows 10): Ubuntu 22.04.5 LTS
The warning is not critical. The quantized model should compile and run successfully on both the TensorRT and DML backends. I have tested with TensorRT 10.8 and 10.9, and was able to generate the engine successfully from the quantized ONNX model.
If you’re encountering any issues during deployment, please share the specific error messages so we can help troubleshoot further.
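To try the TensorRT path yourself, a minimal sketch follows. The reply does not say which tool was used to generate the engine, so using `trtexec` (the command-line tool shipped with TensorRT) is an assumption, as are the helper name and the engine filename.

```python
import shlex


def trtexec_command(onnx_path, engine_path):
    """Build a trtexec argument list that compiles an ONNX file into an engine."""
    return ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]


cmd = trtexec_command("model_int4.onnx", "model_int4.engine")
print(shlex.join(cmd))  # shell-ready form of the command

# To actually build the engine (requires TensorRT to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

If the build fails, the trtexec output is the kind of specific error message that would help with troubleshooting.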
System information
[TensorRT-LLM] TensorRT-LLM version: 0.17.0.post1