feat: INT8 scaled matmul Triton kernel #3558

namgyu-youn · 2025-12-31T13:18:33Z

Summary:
Introduce a Triton kernel for INT8 scaled matrix multiplication.
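
For readers unfamiliar with the operation, here is a minimal pure-Python sketch of what an INT8 scaled matmul is generally understood to compute: an integer matmul with a wide accumulator, followed by a floating-point scale. The function name `int8_scaled_matmul_ref` and the per-row scale layout are assumptions for illustration, not the kernel's actual signature.

```python
def int8_scaled_matmul_ref(a, b, row_scales):
    """Reference semantics only (hypothetical helper, not the PR's API).

    a:          M x K nested list of ints in [-128, 127]
    b:          K x N nested list of ints in [-128, 127]
    row_scales: M floats, one per output row (assumed layout)
    """
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "inner dimensions must match"
    assert len(row_scales) == m, "one scale per output row"

    out = []
    for i in range(m):
        out_row = []
        for j in range(n):
            acc = 0  # stands in for the int32 accumulator a GPU kernel would use
            for p in range(k):
                acc += a[i][p] * b[p][j]
            # Scaling happens after the integer accumulation is complete.
            out_row.append(acc * row_scales[i])
        out.append(out_row)
    return out


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(int8_scaled_matmul_ref(a, b, [0.5, 2.0]))
```

A reference like this is mainly useful as a ground truth when checking a fused kernel on small shapes, including the M = 1 case.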

Motivation: The existing kernel (torch._int_mm) crashes on inputs with small dimensions in vLLM (failure log: https://gist.github.com/vkuzo/5bf389079442bb9851ef315cdcb797b4).

Microbenchmark results show the Triton kernel is ~2.6x faster than int_scaled_matmul (full log: https://gist.github.com/namgyu-youn/aa2fc5d444fdc4b52b35db555087e2ce).

Test plan:

pytest -sv test/kernel/test_int8mm_triton.py

Future work:

- Add autotuning for dynamic block size selection
- Add swizzling (GROUP_M) for L2 cache optimization
- Add split-K for K-dimension parallelization (improves decode with small M)
- Replace int_scaled_matmul in Int8Tensor after e2e validation
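
The swizzling and split-K items can be illustrated on the host side without a GPU. The sketch below is an assumption about how these might look, not code from this PR: `grouped_tile_order` follows the grouped program-ID ordering used in Triton's matmul tutorial, and `split_k_dot` is a hypothetical helper showing that partial sums over K chunks add up to the full dot product.

```python
def cdiv(a, b):
    """Ceiling division, as used for tile counts."""
    return (a + b - 1) // b


def grouped_tile_order(pid, M, N, BLOCK_M, BLOCK_N, GROUP_M):
    """Map a linear program id to (pid_m, pid_n) in grouped order.

    Consecutive program ids walk down GROUP_M row-tiles before moving to the
    next column-tile, so nearby programs reuse the same tiles of B.
    """
    num_pid_m = cdiv(M, BLOCK_M)
    num_pid_n = cdiv(N, BLOCK_N)
    num_pid_in_group = GROUP_M * num_pid_n
    group_id = pid // num_pid_in_group
    first_pid_m = group_id * GROUP_M
    # The last group may be shorter than GROUP_M rows.
    group_size_m = min(num_pid_m - first_pid_m, GROUP_M)
    pid_in_group = pid % num_pid_in_group
    pid_m = first_pid_m + (pid_in_group % group_size_m)
    pid_n = pid_in_group // group_size_m
    return pid_m, pid_n


def split_k_dot(a_row, b_col, split_k):
    """Compute one output element as a sum of split_k partial dot products."""
    k = len(a_row)
    chunk = cdiv(k, split_k)
    partials = []
    for s in range(split_k):
        lo, hi = s * chunk, min((s + 1) * chunk, k)
        # Each partial would be produced by a separate program on the GPU.
        partials.append(sum(a_row[p] * b_col[p] for p in range(lo, hi)))
    return sum(partials)


if __name__ == "__main__":
    order = [grouped_tile_order(pid, 80, 48, 16, 16, 2) for pid in range(15)]
    print(order)
    print(split_k_dot([1, 2, 3, 4, 5], [1, 1, 1, 1, 1], 2))
```

On a GPU the split-K partials would need a reduction step (for example atomic adds or a second pass), which this host-side sketch does not model.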

pytorch-bot · 2025-12-31T13:18:37Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3558

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

B200 runners are down due to network issues

This comment was automatically generated by Dr. CI and updates every 15 minutes.

namgyu-youn · 2025-12-31T13:19:14Z

@pytorchbot label "topic: new feature"

feat: int8 matmul triton kernel

4421dc3

meta-cla bot added the CLA Signed label Dec 31, 2025

namgyu-youn changed the title ~~feat: add INT8 scaled matmul Triton kernel~~ feat: INT8 scaled matmul Triton kernel Dec 31, 2025

pytorch-bot bot added the topic: new feature label Dec 31, 2025
