remove torchao/prototype/moe_quant #3554
Conversation
Stack from ghstack (oldest at bottom):
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3554
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@vkuzo is there any plan to add the new design for MoE quant?
Yes
Thanks for the info. But I suppose item 2 only works for fp8/fp4; what about int4 and int8? Besides, most of the MoE model definitions in HF/transformers are not based on grouped_mm. Is there any plan to extend the scope of torch._grouped_mm adoption in HF/transformers?
int4 and int8 could be supported as well; whether the kernel lives in core, torchao, or somewhere else can be decided case by case in the short term.
Yes, the MoE authoring story is very fragmented. Long term, we want PyTorch core to have the right primitives to make MoE authoring easy, and for torchao to have a story for easily quantizing them. Short term, we may have to have case-by-case workarounds. We do plan to work on adoption of grouped_mm.
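
To make the "based on grouped_mm" distinction concrete, here is a minimal sketch, assuming a recent PyTorch build where the private torch._grouped_mm op is available. It is not code from this PR or from torchao: the shapes, the offs layout, and the loop_moe/grouped_moe helper names are illustrative assumptions, and the private op's signature and hardware requirements may differ across builds.

```python
# Illustrative sketch only: contrasts the per-expert-loop style used by most
# HF/transformers MoE blocks with a grouped-matmul formulation.
# torch._grouped_mm is a private PyTorch op; its availability, signature,
# supported dtypes, and hardware requirements vary by build.
import torch

def loop_moe(x: torch.Tensor, w: torch.Tensor, offs: torch.Tensor) -> torch.Tensor:
    # x: (total_tokens, dim), tokens already routed and sorted by expert.
    # w: (num_experts, dim, hidden_dim) stacked expert weights.
    # offs: (num_experts,) cumulative end index of each expert's token slice.
    outs, start = [], 0
    for e in range(w.shape[0]):
        end = int(offs[e])
        outs.append(x[start:end] @ w[e])  # one small matmul per expert
        start = end
    return torch.cat(outs)

def grouped_moe(x: torch.Tensor, w: torch.Tensor, offs: torch.Tensor) -> torch.Tensor:
    # Same math as loop_moe, expressed as a single grouped matmul over all
    # experts. This is the authoring style the reply above refers to.
    return torch._grouped_mm(x, w, offs=offs)

# Guarded demo: the grouped op may also require a recent GPU (e.g. Hopper-class).
if torch.cuda.is_available() and hasattr(torch, "_grouped_mm"):
    E, T, D, H = 4, 32, 64, 128  # experts, routed tokens, model dim, expert dim
    x = torch.randn(T, D, device="cuda", dtype=torch.bfloat16)
    w = torch.randn(E, D, H, device="cuda", dtype=torch.bfloat16)
    offs = torch.tensor([8, 16, 24, 32], device="cuda", dtype=torch.int32)
    torch.testing.assert_close(
        loop_moe(x, w, offs), grouped_moe(x, w, offs), rtol=1e-2, atol=1e-2
    )
```

Once most MoE definitions route through a single grouped op like this, a quantization API only needs to intercept that one op rather than patch each model's hand-written expert loop, which is the adoption work mentioned above.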
Summary:
This is not used; removing it.
Test Plan: CI
Reviewers:
Subscribers:
Tasks:
Tags: