[Orbit]Handle iterator exhaustion in Controller.py by LINYV0719 · Pull Request #13595 · tensorflow/models

LINYV0719 · 2025-12-29T13:05:38Z

Description

This PR addresses the TODO in orbit/controller.py to support steps=-1 in Controller.train(), allowing training to run until the underlying dataset is exhausted.

Motivation: Previously, Controller.train required a fixed number of steps. This change allows users to train for a full epoch (or until the dataset runs out) without needing to know the exact dataset size beforehand, which is common when using tf.data.Dataset.

Changes:
- Modified the Controller.train loop condition to accept steps=-1.
- Added a try-except block to catch tf.errors.OutOfRangeError and StopIteration during _train_n_steps. This ensures the loop exits gracefully when the iterator is exhausted instead of crashing.
- Added logic to break the loop if the global_step increment is less than expected (another indicator of exhaustion).
- Added a new test case, test_train_until_exhaustion, in orbit/controller_test.py to verify this behavior using a finite dataset.
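
For reviewers, the control flow described in the list above can be sketched roughly as follows. This is a simplified, pure-Python illustration and not the actual Orbit code: `run_n_steps`, `train`, and the `steps_per_loop` default are hypothetical names chosen for this sketch, and only StopIteration is caught here (the real change also catches tf.errors.OutOfRangeError, which is omitted so the sketch runs without TensorFlow).

```python
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def run_n_steps(iterator: Iterator[T],
                train_step: Callable[[T], None],
                n: int) -> int:
    """Runs up to `n` steps and returns how many actually completed."""
    completed = 0
    for _ in range(n):
        try:
            batch = next(iterator)
        except StopIteration:
            # Iterator exhausted. The real code would also catch
            # tf.errors.OutOfRangeError at this point (assumption: omitted
            # here to keep the sketch free of a TensorFlow dependency).
            break
        train_step(batch)
        completed += 1
    return completed


def train(iterator: Iterator[T],
          train_step: Callable[[T], None],
          steps: int,
          steps_per_loop: int = 2) -> int:
    """Trains for `steps` steps, or until exhaustion when steps == -1."""
    if steps < -1:
        raise ValueError("steps must be -1 or a non-negative integer")
    global_step = 0
    while steps == -1 or global_step < steps:
        if steps == -1:
            wanted = steps_per_loop
        else:
            wanted = min(steps_per_loop, steps - global_step)
        done = run_n_steps(iterator, train_step, wanted)
        global_step += done
        # A smaller-than-expected increment is the second exhaustion signal.
        if done < wanted:
            break
    return global_step


seen = []
total = train(iter(range(5)), seen.append, steps=-1)
print(total)  # 5
```

With a five-element iterator and steps=-1, the loop stops on the short final inner loop rather than raising, which is the behavior test_train_until_exhaustion is described as checking.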

Type of change

For a new feature or function, please create an issue first to discuss it
with us before submitting a pull request.

Note: Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)

Tests

I verified the changes by running the new test case and existing tests.

Test Configuration:

OS: Windows 11
Python Version: 3.10
Command: python -m orbit.controller_test
Result: Passed. Specifically, test_train_until_exhaustion passed with the expected behavior.

Checklist

I have signed the Contributor License Agreement.
I have read guidelines for pull request.
My code follows the coding guidelines.
I have performed a self code review of my own code.
I have commented my code, particularly in hard-to-understand areas.
I have made corresponding changes to the documentation.
My changes generate no new warnings.
I have added tests that prove my fix is effective or that my feature works.

Feat: Handle iterator exhaustion in Controller.train

fa6cffa

LINYV0719 requested a review from a team as a code owner December 29, 2025 13:05

LINYV0719 wants to merge 1 commit into tensorflow:master from LINYV0719:feat/train-until-exhaustion
