Hugging Face Issue Labeler · Workflow runs · huggingface/trl · GitHub

Hugging Face Issue Labeler


issue_auto_labeller.yml

725 workflow runs

why kl = nan when grpo train? Hugging Face Issue Labeler #676: Issue #4040 opened by uilstong

30s

About "or None" and "defaults to None" Hugging Face Issue Labeler #675: Issue #4036 opened by qgallouedec

22s

CI fails: ValueError: zero-size array to reduction operation maximum which has no identity Hugging Face Issue Labeler #674: Issue #4035 opened by albertvillanova

33s

SFTTrainer with PEFT model Hugging Face Issue Labeler #673: Issue #4029 opened by lylaiyy

39s

Abnormal results during DPO training Hugging Face Issue Labeler #672: Issue #4023 opened by wjjwyj

28s

Does the reward function trainer support multimodal models now? Hugging Face Issue Labeler #671: Issue #4021 opened by hlp2020

21s

No warning for unsupported int4 quantization Hugging Face Issue Labeler #670: Issue #4018 opened by MRiabov

39s

ParallelismConfig not applied in GRPOTrainer: Trainer._prepare_context_parallel_inputs expects dict[torch.Tensor] but receives list[dict[list]] Hugging Face Issue Labeler #669: Issue #4016 opened by JdRion

22s

Does the grpo vllm colocate training now support the InternVL3 8B model? Hugging Face Issue Labeler #668: Issue #4015 opened by DearFishi

23s

current_gradient_accumulation_steps is undefined when eval_on_start==True Hugging Face Issue Labeler #667: Issue #4010 opened by konstantinjdobler

31s

Can not from trl import DataCollatorForLanguageModeling Hugging Face Issue Labeler #666: Issue #4009 opened by lylaiyy

29s

Training Step of GRPO in Wandb. Hugging Face Issue Labeler #665: Issue #4004 opened by mandyyyyii

25s

DPO trainer with video content Hugging Face Issue Labeler #664: Issue #4002 opened by GabrieleGiudic

25s

scale_rewards malfunctioned in GRPOTrainer Hugging Face Issue Labeler #663: Issue #3991 opened by Peter-Chou

25s

accelerator.sync_gradients Hugging Face Issue Labeler #662: Issue #3988 opened by AriesJin

28s

Feature Request: Save/Load Precomputed Ref Log-Probabilities in DPOTrainer Hugging Face Issue Labeler #661: Issue #3985 opened by ginkyenglee

31s

GRPOTrainer uses undefined self.current_gradient_accumulation_steps, causing AttributeError during training Hugging Face Issue Labeler #660: Issue #3983 opened by ahatamiz

22s

DPOTrainer uses tokenizer instead of processor for Gemma3 vision Hugging Face Issue Labeler #659: Issue #3982 opened by supreme-gg-gg

26s

Extremely persistent error - ValueError: 'generation_batch_size' and 'steps_per_generation' can not be both configured at the same time Hugging Face Issue Labeler #658: Issue #3980 opened by Rakshith12-pixel

27s

GRPOTrainer vLLM colocate hardcodes MASTER_PORT=12345 so no parallel runs possible Hugging Face Issue Labeler #657: Issue #3979 opened by ruggsea

31s

kto trainer invalid configuration error Hugging Face Issue Labeler #656: Issue #3974 opened by bryanchrist

29s

REQUEST: Dynamic Sampling for GRPO Hugging Face Issue Labeler #655: Issue #3973 opened by wenquanlu

23s

REQUEST：Add estimation of flos metric in GRPO Trainer Hugging Face Issue Labeler #654: Issue #3967 opened by LLMOON

25s

[Question] Why isn't vanilla REINFORCE implemented? Hugging Face Issue Labeler #653: Issue #3966 opened by nityadav

21s

[QST] Colocation and resharding Hugging Face Issue Labeler #652: Issue #3963 opened by jeromeku

26s