[DeepSeek MoE] current workstream planning #1125

Open
lessw2020 opened this issue Apr 21, 2025 · 1 comment
lessw2020 commented Apr 21, 2025

Making an issue to track expected work for DeepSeek experimental:

1 - integrate DeepGEMM support (contiguous) as an additional inference option. This uses groupwise/blockwise fp8 quantization. Completed (#1124).
1A - add Triton contiguous group GEMM (AMD compat). Completed (#1154).
2 - refactor token processing to avoid code duplication. PR #1127.
3 - add proper training loop support. Initial working PR landed (see train_ds_real.py).
4 - need basic unit tests for check-ins.
5 - review AMD port for Symmetric Memory (PR merged into PyTorch core, still need to verify that it runs on AMD).
6 - finalize which group GEMMs we want to support long term (torch bf16 + DeepSeek for fp8? AMD?).
updates:
- fix for the torch.group_gemm hang (#1166), so this has full training support now.
- torch._scaled_mm with wrappers via torchAO has been added, so fp8 rowwise is now available for ds inference (#1142).

7 - implement stats tracking for experts (exflow optimization) and subsequent more efficient expert placement. Update: initial token tracking is in place, but only for topk==1. It needs to be expanded to topk==6.

8 - large scale training runs to prove out everything.
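
For item 7, a rough sketch of what per-expert token counting could look like when expanded from topk==1 to topk==6. This is not the tracking code in the repo; the function name `expert_token_counts` and the shapes are assumptions made for illustration. It only uses `torch.topk` and `torch.bincount`.

```python
import torch


def expert_token_counts(router_logits: torch.Tensor, num_experts: int, top_k: int) -> torch.Tensor:
    """Count how many (token, slot) assignments each expert receives.

    router_logits: (num_tokens, num_experts) routing scores (hypothetical input).
    Returns a (num_experts,) tensor of integer counts.
    """
    # Indices of the top_k experts chosen for each token: (num_tokens, top_k).
    _, topk_idx = torch.topk(router_logits, k=top_k, dim=-1)
    # Flatten so every selected slot counts once, then histogram by expert id.
    return torch.bincount(topk_idx.reshape(-1), minlength=num_experts)


torch.manual_seed(0)
logits = torch.randn(16, 8)  # 16 tokens, 8 experts (toy sizes)
counts_k1 = expert_token_counts(logits, num_experts=8, top_k=1)
counts_k6 = expert_token_counts(logits, num_experts=8, top_k=6)
```

With top_k=1 the counts sum to the number of tokens; with top_k=6 they sum to six times that, which is the load signal an expert placement pass would presumably want to see.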

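For the group GEMM options in items 1, 1A and 6, a plain PyTorch reference may be useful as a correctness baseline in unit tests (item 4). The sketch below assumes "contiguous" means tokens are already sorted by expert, so each expert owns one contiguous block of rows. The name `grouped_gemm_reference` and its arguments are hypothetical, and this is not the DeepGEMM, Triton, or torch group GEMM API, just a slow loop with the same intended result.

```python
import torch


def grouped_gemm_reference(x: torch.Tensor, weights: torch.Tensor, group_sizes: list[int]) -> torch.Tensor:
    """Reference grouped GEMM over a contiguous, expert-sorted token layout.

    x:           (total_tokens, in_dim), rows sorted by expert (assumed layout).
    weights:     (num_experts, in_dim, out_dim), one matrix per expert.
    group_sizes: number of rows owned by each expert; must sum to total_tokens.
    """
    assert sum(group_sizes) == x.shape[0]
    out = x.new_empty((x.shape[0], weights.shape[-1]))
    start = 0
    for expert, size in enumerate(group_sizes):
        end = start + size
        if size > 0:  # experts with no routed tokens are skipped
            out[start:end] = x[start:end] @ weights[expert]
        start = end
    return out


torch.manual_seed(0)
tokens = torch.randn(8, 4)
expert_weights = torch.randn(3, 4, 2)
result = grouped_gemm_reference(tokens, expert_weights, [3, 0, 5])
```

An optimized kernel can then be compared against `result` with `torch.allclose`, using looser tolerances for fp8 paths.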
@lessw2020 lessw2020 added the enhancement New feature or request label Apr 21, 2025
@lessw2020 lessw2020 self-assigned this Apr 21, 2025
@lessw2020

Item 2 = PR #1127
