Conversation

KaparthyReddy
Add max_eval_batches argument to TrainingArguments

Description

Adds a max_eval_batches parameter to TrainingArguments that allows users to limit the number of batches used during evaluation.

Fixes #31561

Motivation

When working with large evaluation datasets, running evaluation on the entire dataset can be very slow. During development, hyperparameter tuning, or quick iteration, it's often sufficient to evaluate on a subset of the data.

This is similar to PyTorch Lightning's limit_val_batches parameter.

Changes

  • ✅ Added max_eval_batches parameter to TrainingArguments
  • ✅ Implemented batch limiting in Trainer.evaluation_loop
  • ✅ Added test coverage
  • ✅ Added documentation in parameter metadata
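The core of the change in Trainer.evaluation_loop is an early exit once the batch budget is spent. A minimal sketch of that logic (names simplified and hypothetical; the real evaluation_loop also handles metric gathering, mixed precision, and distributed state):

```python
def evaluation_loop(dataloader, max_eval_batches=None):
    """Simplified sketch: iterate over batches, stopping after max_eval_batches."""
    per_batch_losses = []
    for step, batch in enumerate(dataloader):
        # None preserves the old behavior: evaluate on the full dataloader.
        if max_eval_batches is not None and step >= max_eval_batches:
            break
        # Stand-in for the model forward pass and loss computation.
        per_batch_losses.append(sum(batch) / len(batch))
    return per_batch_losses
```

With `max_eval_batches=None` the loop is unchanged, so existing configurations keep their current behavior.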

Usage Example

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./output",
    evaluation_strategy="steps",
    eval_steps=100,
    max_eval_batches=50,  # Only evaluate on 50 batches
)

trainer = Trainer(
    model=model,
    args=training_args,
    eval_dataset=eval_dataset,
)

# Evaluation will stop after 50 batches instead of iterating over the entire dataset
trainer.evaluate()
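Until an argument like this is available, a comparable effect can be approximated by truncating the batch iterable yourself, e.g. with itertools.islice. This is a sketch of the general technique, not a Trainer API; in practice you would subsample eval_dataset before constructing the Trainer:

```python
from itertools import islice

def take_batches(dataloader, max_batches):
    """Yield at most max_batches batches from any iterable of batches."""
    yield from islice(dataloader, max_batches)

batches = [[0, 1], [2, 3], [4, 5], [6, 7]]
limited = list(take_batches(batches, 2))  # -> [[0, 1], [2, 3]]
```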

@Rocketknight1
Member

cc @SunMarc, but we might have another way to handle this already!


Linked issue: Add argument to set number of eval steps in Trainer