Pull requests: pytorch/ao

Label glossary: CLA Signed is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed. module: training and module: inference cover the quantize_ API training and inference flows. module: not user facing keeps a PR out of the release notes. topic: new feature marks PRs that add a new feature, topic: improvement marks improvements that don't fit the other categories, and topic: performance marks PRs that improve the performance of a feature.

Expand Triton autotune configs for MoE FP8 kernels to improve AMD GPU performance
#3952, opened Feb 25, 2026 by brucechanglongxu

Fix operator precedence bug in is_Navi4() GPU detection
#3951, opened Feb 25, 2026 by brucechanglongxu
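
A hypothetical illustration of the kind of precedence pitfall the #3951 title describes (the actual torchao is_Navi4() implementation may differ; gfx1200 and gfx1201 are only assumed here as the Navi4-class gcnArchName values reported on ROCm builds):

    import torch

    def is_navi4_buggy() -> bool:
        arch = torch.cuda.get_device_properties(0).gcnArchName
        # Bug: `in` binds tighter than `or`, so this groups as
        #   ("gfx1200" in arch) or "gfx1201"
        # and the bare non-empty string "gfx1201" is always truthy,
        # so every device is reported as Navi4.
        return "gfx1200" in arch or "gfx1201"

    def is_navi4_fixed() -> bool:
        arch = torch.cuda.get_device_properties(0).gcnArchName
        # Fix: spell out (and parenthesize) each membership test.
        return ("gfx1200" in arch) or ("gfx1201" in arch)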

Add better debug print for failed prepare
#3950, opened Feb 25, 2026 by JakeStevens
Labels: CLA Signed, fb-exported, meta-exported

[mxfp8 training] remove mxfp8 from MXLinear and MXLinearConfig
#3949, opened Feb 25, 2026 by danielvegamyhre
Labels: CLA Signed, module: training, mx

[mxfp8 training] unified MXFP8TrainingConfig and MXFP8TrainingTensor
#3948, opened Feb 25, 2026 by danielvegamyhre
Labels: CLA Signed, module: training, moe, mx

Add FA4 fp8 backend to low precision attention api
#3947 (Draft), opened Feb 25, 2026 by howardzhang-cv
Labels: CLA Signed, topic: new feature

Add RunningAbsMaxSmoothQuantObserver for memory-efficient calibration
#3946, opened Feb 25, 2026 by jcaip
Labels: CLA Signed, fb-exported, meta-exported
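
A minimal sketch of what a running abs-max observer for SmoothQuant-style calibration can look like, assuming the goal named in the #3946 title (keep a per-channel running maximum of |activation| rather than caching calibration activations); this is illustrative only and not the torchao class:

    import torch

    class RunningAbsMaxObserver(torch.nn.Module):
        """Tracks a per-channel running max of |x| across calibration batches."""

        def __init__(self, num_channels: int):
            super().__init__()
            self.register_buffer("abs_max", torch.zeros(num_channels))

        @torch.no_grad()
        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (..., num_channels); reduce |x| over all leading dimensions,
            # then fold the result into the running maximum. Memory stays at
            # one vector of size num_channels regardless of calibration size.
            per_channel = x.abs().reshape(-1, x.shape[-1]).amax(dim=0)
            self.abs_max = torch.maximum(self.abs_max.to(per_channel.device), per_channel)
            return x  # observers pass activations through unchanged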

Use relaxed memory ordering for Triton atomics on AMDGPU.
#3945, opened Feb 25, 2026 by wenchenvincent
Labels: ciflow/rocm, CLA Signed, module: training, moe, topic: improvement, topic: performance
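
For context on #3945: Triton atomics accept a sem= argument, and when an atomic is used only as an unordered accumulator the default acquire-release semantics can be relaxed, which tends to be cheaper on AMD GPUs. A minimal sketch (a token-per-expert histogram, not the actual torchao MoE kernel):

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def count_tokens_per_expert(expert_ids_ptr, counts_ptr, n_tokens, BLOCK: tl.constexpr):
        pid = tl.program_id(0)
        offs = pid * BLOCK + tl.arange(0, BLOCK)
        mask = offs < n_tokens
        expert = tl.load(expert_ids_ptr + offs, mask=mask, other=0)
        # Relaxed ordering suffices here: nothing in this kernel depends on
        # the order in which the increments become visible.
        tl.atomic_add(counts_ptr + expert, 1, mask=mask, sem="relaxed")

    def tokens_per_expert(expert_ids: torch.Tensor, n_experts: int) -> torch.Tensor:
        counts = torch.zeros(n_experts, device=expert_ids.device, dtype=torch.int32)
        n = expert_ids.numel()
        count_tokens_per_expert[(triton.cdiv(n, 1024),)](expert_ids, counts, n, BLOCK=1024)
        return counts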

[ROCm] Enable float8 utils and base tests on ROCm
#3942, opened Feb 24, 2026 by brucechanglongxu
Labels: module: rocm

[ROCm] Enable affine quantized, quant API, and integration tests on ROCm
#3941, opened Feb 24, 2026 by brucechanglongxu
Labels: module: rocm

[ROCm] Remove redundant ROCm test skips in NF4 and use accelerator-ag…
#3940, opened Feb 24, 2026 by brucechanglongxu
Labels: module: rocm

[ROCm] Enable low-bit optimizer tests on ROCm
#3939, opened Feb 24, 2026 by brucechanglongxu
Labels: module: rocm

lowbit linear weight packing and shared embedding on x86
#3938, opened Feb 24, 2026 by JacobSzwejbka
Labels: CLA Signed, fb-exported, meta-exported

Add MXFP4 support for GPTQ quantization and separate eval script
#3935, opened Feb 23, 2026 by jcaip
Labels: CLA Signed

[WIP] [mxfp8 moe training] add triton kernel for per group padding
#3933, opened Feb 22, 2026 by danielvegamyhre
Labels: CLA Signed

Added benchmark for LLaMA 3 model for attention tests
#3930 (Draft), opened Feb 21, 2026 by howardzhang-cv
Labels: benchmark, CLA Signed, module: not user facing

Added benchmark for single attention layer across different sequence lengths
#3929 (Draft), opened Feb 21, 2026 by howardzhang-cv
Labels: benchmark, CLA Signed, module: not user facing

Add Static Activation Quantization Subclass for Observation
#3925, opened Feb 20, 2026 by namgyu-youn
Labels: CLA Signed, module: inference

[mxfp8 moe training] migrate ep utils to mx_formats
#3922, opened Feb 20, 2026 by danielvegamyhre
Labels: CLA Signed, module: training, moe, mx, topic: improvement

Add MXFP4 support for GPTQ quantization and separate eval script
#3921, opened Feb 19, 2026 by jcaip
Labels: CLA Signed

Add support for flashinfer quantize kernel option for nvfp4
#3912, opened Feb 17, 2026 by jerryzh168
Labels: CLA Signed, module: not user facing

Refactor use_triton_kernel to use nvfp4_quantize_kernel_choice
#3911, opened Feb 17, 2026 by jerryzh168
Labels: CLA Signed, module: not user facing

Add asymmetric support for Int8Tensor + SmoothQuant
#3900, opened Feb 17, 2026 by jcaip
Labels: CLA Signed, module: inference
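
For background on #3900: symmetric int8 quantization uses a single scale centered on zero, while asymmetric (affine) quantization adds a zero point so the full int8 range covers an arbitrary [min, max] interval, which suits the shifted activation ranges SmoothQuant produces. A minimal per-tensor sketch, not the torchao Int8Tensor implementation:

    import torch

    def asymmetric_int8_quantize(x: torch.Tensor):
        qmin, qmax = -128, 127
        lo, hi = x.amin(), x.amax()
        # Affine mapping: x ~= (q - zero_point) * scale, with lo -> qmin and hi -> qmax.
        scale = torch.clamp(hi - lo, min=1e-8) / (qmax - qmin)
        zero_point = torch.round(qmin - lo / scale)
        q = torch.clamp(torch.round(x / scale) + zero_point, qmin, qmax).to(torch.int8)
        return q, scale, zero_point

    def asymmetric_int8_dequantize(q, scale, zero_point):
        return (q.to(torch.float32) - zero_point) * scale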

refactor GPTQ observer for dynamic quant
#3899, opened Feb 17, 2026 by jcaip
Labels: CLA Signed

[mxfp8 moe training] move mxfp8 grouped mm code into mx_formats
#3898, opened Feb 17, 2026 by danielvegamyhre
Labels: CLA Signed, module: training, moe, mx, topic: improvement