Add asymmetric support for Int8Tensor + SmoothQuant#3900
Conversation
Summary: This PR adds support for asymmetric activation quantization in Int8Tensor and SmoothQuant. Test Plan: ``` pytest test/quantization/quantize_/workflows/int8/test_int8_tensor.py pytest test/prototype/test_smoothquant.py ``` Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3900
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit dd40b73 with merge base 5906856.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
ghstack-source-id: 5163aa0 Pull Request resolved: #3900
Summary: This PR adds support for asymmetric quantization in Int8Tensor by adding new optional tensor attributes, `zero_point` and `act_zero_point`, for the weight and activation respectively. Also adds support for asymmetric quantization in our SmoothQuant implementation. Test Plan: ``` pytest test/quantization/quantize_/workflows/int8/test_int8_tensor.py pytest test/prototype/test_smoothquant.py ``` Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]
ghstack-source-id: a146822 Pull Request resolved: #3900
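For context, a minimal sketch of what asymmetric int8 quantization with a `zero_point` looks like (this is a per-tensor reference for illustration, not the Int8Tensor implementation from this PR; the function names are hypothetical):

```python
import torch

def asymmetric_quantize_int8(x: torch.Tensor):
    # Per-tensor asymmetric int8 quantization: map [x.min(), x.max()] onto [-128, 127].
    qmin, qmax = -128, 127
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min).clamp(min=1e-8) / (qmax - qmin)
    # zero_point shifts the range so that x_min maps to qmin.
    zero_point = torch.round(qmin - x_min / scale).clamp(qmin, qmax)
    qdata = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax).to(torch.int8)
    return qdata, scale, zero_point

def dequantize_int8(qdata, scale, zero_point):
    # Reverse mapping: subtract zero_point before scaling back to float.
    return (qdata.to(torch.float32) - zero_point) * scale
```

Unlike symmetric quantization (where `zero_point` is implicitly 0), the asymmetric form can represent one-sided activation distributions (e.g. post-ReLU) without wasting half of the int8 range.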
```
tensor: torch.Tensor,
quant_kwargs: QuantizeTensorKwargs,
scale: Optional[torch.Tensor] = None,
zero_point: Optional[torch.Tensor] = None,
```
nit: weight_zero_point for clear naming?
I think we should keep this consistent with `scale`. `act_zero_point` is used to denote the activation zero point.
```diff
 int8 quantized tensor with plain layout.

-Currently only Symmetric quantization is supported.
+Supports both symmetric and asymmetric quantization.
```
Maybe drop this docstring? The description inside the Tensor Attributes section seems sufficient to me.
sounds good, will remove
```
act_quant_kwargs: flags for dynamic activation quantization
"""

tensor_data_names = ["qdata", "scale"]
```
shouldn't `zero_point` be required for `MappingType.ASYMMETRIC` here?
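The reviewer's point could be enforced with a small validation check along these lines (a sketch only; the class and function names here are hypothetical, not torchao's actual API):

```python
import torch
from enum import Enum

class MappingType(Enum):
    SYMMETRIC = "symmetric"
    ASYMMETRIC = "asymmetric"

def validate_zero_point(mapping_type: MappingType, zero_point):
    # Asymmetric mapping needs a zero_point tensor; symmetric mapping
    # implies a zero_point of 0 (so it may be absent or all-zero).
    if mapping_type == MappingType.ASYMMETRIC and zero_point is None:
        raise ValueError("zero_point is required for MappingType.ASYMMETRIC")
    if mapping_type == MappingType.SYMMETRIC and zero_point is not None:
        if not torch.all(zero_point == 0):
            raise ValueError("symmetric quantization expects zero_point == 0")
```

Guarding this at construction time surfaces a misconfigured tensor immediately rather than at dequantize time.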
torchao/quantization/quant_api.py (Outdated)

```
)
assert config.version == 2, f"Unexpected version: {config.version}"

# TODO: Symmetric/Asymmetric choice for weight quantization
```
@jcaip has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Removed TODO comment regarding symmetric/asymmetric weight quantization.
Removed mention of symmetric quantization support from docstring.
CC @cyxlily
Stack from ghstack (oldest at bottom):
Summary:
This PR adds support for asymmetric quantization in
Int8Tensor by adding new optional tensor attributes,
`zero_point` and `act_zero_point`, for the weight and activation respectively.
Also adds support for asymmetric quantization in our SmoothQuant
implementation.
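To illustrate how an activation zero point enters the linear computation, here is a dequantize-then-matmul reference with asymmetric per-tensor activations and symmetric weights (a sketch under those assumptions, not the kernel this PR dispatches to):

```python
import torch

def int8_linear_ref(q_act, act_scale, act_zero_point, q_weight, w_scale):
    # Reference linear: asymmetric activation (subtract act_zero_point),
    # symmetric weight (weight zero_point assumed to be 0).
    a = (q_act.to(torch.float32) - act_zero_point) * act_scale
    w = q_weight.to(torch.float32) * w_scale
    return a @ w.t()
```

Optimized int8 kernels typically fold the `act_zero_point * sum(q_weight)` correction term into the accumulator instead of dequantizing first, but the numerics match this reference.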
Test Plan:
Reviewers:
Subscribers:
Tasks:
Tags:
Differential Revision: D94258324