Pull requests: NVIDIA/TransformerEngine
Pull requests list
[PyTorch] Explicitly specify quantized tensor usages needed for linear op backward
bug
#1646
opened Apr 4, 2025 by
timmoon10
7 of 13 tasks
Enable fp8 primary weights for sub-channel recipe
#1641
opened Apr 3, 2025 by
kunlunl
7 of 13 tasks
Add adam bf16 state with original fp32 kernel
#1640
opened Apr 3, 2025 by
BestJuly
1 of 13 tasks
Support FP8 primary weight in FSDP training
#1630
opened Apr 1, 2025 by
shjwudp
1 of 13 tasks
[PyTorch] Debug checkpointing with te.Sequential
bug
#1629
opened Apr 1, 2025 by
timmoon10
8 of 13 tasks
Improved performance of mxfp8 cast kernels
2.2.0
performance
#1628
opened Mar 31, 2025 by
Oleg-Goncharov
6 of 13 tasks
[Pytorch] NVIDIA-DL-Framework-Inspect support – part 1 – core
#1614
opened Mar 25, 2025 by
pggPL
7 tasks done
[Pytorch] NVIDIA-DL-Framework-Inspect support – part 2 – features
#1613
opened Mar 25, 2025 by
pggPL
7 tasks done
[Pytorch] NVIDIA-DL-Framework-Inspect support – part 3 – tests
#1612
opened Mar 25, 2025 by
pggPL
7 of 13 tasks
[Pytorch] NVIDIA-DL-Framework-Inspect support – part 4 – documentation
#1611
opened Mar 25, 2025 by
pggPL
7 tasks done
[JAX] Unbalanced Context Parallelism with THD format
#1565
opened Mar 12, 2025 by
zlsh80826
8 of 13 tasks
Enable AttnFuncWithCPAndKVP2P to support mla
#1561
opened Mar 12, 2025 by
SuperCB
3 of 13 tasks
Blockwise scaling linear quantization recipe
#1559
opened Mar 11, 2025 by
kwyss-nvidia
8 of 13 tasks
change softmax_lse correction of CP to FP32
#1546
opened Mar 7, 2025 by
xrennvidia
6 of 13 tasks