
Add asymmetric support for Int8Tensor + SmoothQuant#3900

Merged
jcaip merged 4 commits into main from gh/jcaip/12/head
Feb 26, 2026

Conversation

@jcaip
Contributor

@jcaip jcaip commented Feb 17, 2026

Stack from ghstack (oldest at bottom):

Summary:

This PR adds support for asymmetric quantization in
Int8Tensor by adding new optional tensor attributes,
`zero_point` and `act_zero_point`, for the weight and activation
respectively.

It also adds support for asymmetric quantization in our SmoothQuant
implementation.
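
To illustrate the role of a `zero_point`, here is a minimal sketch of per-tensor asymmetric int8 quantization in plain PyTorch. The helper names are made up for this example; this is not the torchao `Int8Tensor` implementation.

```python
import torch

def asymmetric_quantize_int8(x: torch.Tensor):
    """Quantize a float tensor to int8 with a scale and a zero point."""
    qmin, qmax = -128, 127
    x_min, x_max = x.min(), x.max()
    # Asymmetric mapping: stretch [x_min, x_max] over the full int8 range.
    # The zero point shifts the integer grid so that x_min lands near qmin.
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = torch.round(qmin - x_min / scale).clamp(qmin, qmax)
    qdata = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax).to(torch.int8)
    return qdata, scale, zero_point.to(torch.int8)

def dequantize_int8(qdata, scale, zero_point):
    # Undo the shift before rescaling; cast first to avoid int8 overflow.
    return (qdata.to(torch.float32) - zero_point.to(torch.float32)) * scale

torch.manual_seed(0)
x = torch.randn(4, 8)
qdata, scale, zero_point = asymmetric_quantize_int8(x)
x_hat = dequantize_int8(qdata, scale, zero_point)
# Round-trip error is bounded by the quantization step size.
assert (x - x_hat).abs().max() <= 2 * scale
```

Symmetric quantization is the special case where the zero point is pinned at 0, which is why `zero_point` (and `act_zero_point` for dynamically quantized activations) can remain optional attributes.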

Test Plan:

```
pytest test/quantization/quantize_/workflows/int8/test_int8_tensor.py
pytest test/prototype/test_smoothquant.py
```

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D94258324
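
For context, the core SmoothQuant trick that this asymmetric support plugs into can be sketched as follows (plain PyTorch with illustrative names, not the torchao implementation): a per-input-channel smoothing factor migrates activation outliers into the weight, leaving the matmul mathematically unchanged while making both factors easier to quantize.

```python
import torch

def smoothing_factor(x_absmax: torch.Tensor, w_absmax: torch.Tensor,
                     alpha: float = 0.5) -> torch.Tensor:
    """Per-input-channel factor s = x_absmax^alpha / w_absmax^(1 - alpha)."""
    eps = 1e-5
    return (x_absmax.clamp(min=eps) ** alpha) / (w_absmax.clamp(min=eps) ** (1 - alpha))

torch.manual_seed(0)
x = torch.randn(16, 64)   # activations: (batch, in_features)
w = torch.randn(32, 64)   # weight: (out_features, in_features)
s = smoothing_factor(x.abs().amax(dim=0), w.abs().amax(dim=0))

# Folding s into the weight leaves the linear layer's output unchanged:
# (x / s) @ (w * s)^T == x @ w^T, channel by channel.
y_ref = x @ w.t()
y_smoothed = (x / s) @ (w * s).t()
assert torch.allclose(y_ref, y_smoothed, atol=1e-4)
```

After smoothing, the scaled activations `x / s` have a tighter dynamic range, and quantizing them asymmetrically (with an activation zero point) captures any remaining offset in their distribution.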

@pytorch-bot

pytorch-bot bot commented Feb 17, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3900

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit dd40b73 with merge base 5906856:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jcaip added a commit that referenced this pull request Feb 17, 2026
ghstack-source-id: 5163aa0
Pull Request resolved: #3900
@meta-cla meta-cla bot added the CLA Signed label Feb 17, 2026
jcaip added a commit that referenced this pull request Feb 17, 2026
ghstack-source-id: a146822
Pull Request resolved: #3900
@jcaip jcaip added the module: inference label Feb 17, 2026
```python
tensor: torch.Tensor,
quant_kwargs: QuantizeTensorKwargs,
scale: Optional[torch.Tensor] = None,
zero_point: Optional[torch.Tensor] = None,
```
Contributor
nit: `weight_zero_point` for clear naming?

Contributor Author
I think we should keep this consistent with `scale`; `act_zero_point` is used to denote the activation zero point.

Contributor

ok sounds good to me.

```diff
 int8 quantized tensor with plain layout.

-Currently only Symmetric quantization is supported.
+Supports both symmetric and asymmetric quantization.
```
Contributor
Maybe drop this docstring? The description inside the Tensor Attributes section seems sufficient to me.

Contributor Author

sounds good, will remove

```python
act_quant_kwargs: flags for dynamic activation quantization
"""

tensor_data_names = ["qdata", "scale"]
```


shouldn't `zero_point` be required for `MappingType.ASYMMETRIC` here?
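
A minimal sketch of the invariant being asked about, with hypothetical names (the real `MappingType` lives in torchao; it is re-declared here only to keep the example self-contained, and the check itself is illustrative, not the merged code):

```python
from enum import Enum, auto
from typing import Optional

import torch

class MappingType(Enum):
    SYMMETRIC = auto()
    ASYMMETRIC = auto()

def check_zero_point(mapping_type: MappingType,
                     zero_point: Optional[torch.Tensor]) -> None:
    # Asymmetric quantization shifts the integer grid, so a zero point
    # is mandatory; symmetric quantization centers the grid at zero,
    # so a zero point should not be present.
    if mapping_type is MappingType.ASYMMETRIC and zero_point is None:
        raise ValueError("zero_point is required for MappingType.ASYMMETRIC")
    if mapping_type is MappingType.SYMMETRIC and zero_point is not None:
        raise ValueError("zero_point must be None for MappingType.SYMMETRIC")
```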

```python
)
assert config.version == 2, f"Unexpected version: {config.version}"

# TODO: Symmentric/Asymmetric choice for weight quantization
```


nit: Symmentric -> Symmetric

@hossein1387 hossein1387 left a comment

other than the typo, LGTM.

@jcaip jcaip changed the base branch from gh/jcaip/12/base to main February 24, 2026 19:58
@jcaip
Contributor Author

jcaip commented Feb 24, 2026

@jcaip has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Removed TODO comment regarding symmetric/asymmetric weight quantization.
Removed mention of symmetric quantization support from docstring.
@Xia-Weiwen
Collaborator

CC @cyxlily

@jcaip jcaip merged commit 8d65522 into main Feb 26, 2026
34 of 36 checks passed