Fetch from nvidia Megatron-LM #5

Open: wants to merge 5,006 commits into base: load-iter

Conversation

RaymondLi0
No description provided.

BestJuly and others added 30 commits April 24, 2025 19:53
feat(MoE): FP8 Support for Multi-Token-Prediction

See merge request ADLR/megatron-lm!2950
Fix checkpoint directory bug in distill nightly test

Closes #446

See merge request ADLR/megatron-lm!3096
[dist ckpt] Re-attempt !2493 + fixing merge conflicts

See merge request ADLR/megatron-lm!2637
ci: Control which checks per test to run

See merge request ADLR/megatron-lm!3175
Fix the sync issue in `TemporalAsyncWorker`

See merge request ADLR/megatron-lm!3155
Co-authored-by: Chenhan Yu
Co-authored-by: Chen-Han Yu
Co-authored-by: Ye Yu
Add ModelOpt speculative decoding finetune

See merge request ADLR/megatron-lm!2971
Co-authored-by: yaoyu-33
Co-authored-by: Mcore Bot
Co-authored-by: Chenhan Yu
Moe fix for Llama4

See merge request ADLR/megatron-lm!3083
[custom FSDP] Support EP + FSDP training for DeepSeek-v3

See merge request ADLR/megatron-lm!2910
Fix extra tokens in returned generation

Closes dl/JoC/nemo-ci#2075

See merge request ADLR/megatron-lm!3178
Update current scaling supported TE version to 2.2.0.dev0

See merge request ADLR/megatron-lm!3160
Co-authored-by: Shanmugam Ramasamy
Co-authored-by: Mcore Bot
Co-authored-by: Vijay Korthikanti
Co-authored-by: root
Separate chunk allocator

See merge request ADLR/megatron-lm!3121
Revert inference_context.is_decode_only() to inference_context.sequence_len_offset > 0

See merge request ADLR/megatron-lm!3180
[BUG FIX]: Fix indices-to-multihot-fusion throwing an exception when topk/num_local_experts is not a power of 2.

See merge request ADLR/megatron-lm!3058
Co-authored-by: Zhiyu Li
Refactor Inference Process Groups by replacing global ones with optional local ones for better parallelism flexibility

See merge request ADLR/megatron-lm!3015
Update te patch to include 1626

See merge request ADLR/megatron-lm!3179
Matthieu Le and others added 30 commits May 29, 2025 02:34
Update dataset helper for online video decoding

See merge request ADLR/megatron-lm!3367
Do not use eval on arbitrary user input.

See merge request ADLR/megatron-lm!3365
tests: Update frozen-checkpoints

See merge request ADLR/megatron-lm!3363
Consolidate eval methods across train and generation

See merge request ADLR/megatron-lm!3375
ci: Auto-restart on nan

See merge request ADLR/megatron-lm!3388
perf(mla, experimental): MLA RoPE fusion and YARN embedding cache

Closes #429

See merge request ADLR/megatron-lm!2949
Fix custom FSDP float8 tensor set_item

See merge request ADLR/megatron-lm!3280
ci: Move queue blocker

See merge request ADLR/megatron-lm!3401
ci: Improve error-handling of missing logs

See merge request ADLR/megatron-lm!3400
ci: Control job concurrency

See merge request ADLR/megatron-lm!3408
ci: Catch missing logs

See merge request ADLR/megatron-lm!3412
ci: Remove tests from A100

See merge request ADLR/megatron-lm!3411
Add an option to skip counting zeros in grad of ChainedOptimizer

See merge request ADLR/megatron-lm!3393
Add an interface to set high priority stream groups

See merge request ADLR/megatron-lm!3326
Co-authored-by: Chen-Han Yu
Co-authored-by: Chenhan Yu
Llama4 inference

See merge request ADLR/megatron-lm!3241