Fix #1064: Auto log sample train/val images at training start (with tests) #1067

Open: wants to merge 8 commits into base: main

Conversation

Collaborator

@bw4sz bw4sz commented Jun 9, 2025

Description

This PR implements automatic logging of sample images from training and validation datasets at the start of training, as requested in #1064.

Changes

Implementation (src/deepforest/main.py)

  • Added on_train_start hook that:
    • Samples up to 5 images from training dataset
    • Samples up to 5 images from validation dataset (if available)
    • Uses visualize.plot_annotations() to create annotated images
    • Logs images to all available experiment loggers (Comet, TensorBoard, W&B, etc.)
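The sampling step described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the helper name sample_image_names, its max_images and seed parameters, and the choice to de-duplicate file names first (annotation tables typically hold one row per box, so a file name can repeat) are all assumptions made for the example.

```python
import random


def sample_image_names(image_names, max_images=5, seed=None):
    """Return up to max_images distinct image names (hypothetical helper).

    image_names may contain repeats, e.g. one entry per annotation row.
    An empty input yields an empty list, so the caller can skip logging.
    """
    # De-duplicate and sort so the result is reproducible for a given seed.
    unique_names = sorted(set(image_names))
    if not unique_names:
        return []
    rng = random.Random(seed)
    # Never ask for more samples than there are distinct images.
    return rng.sample(unique_names, min(max_images, len(unique_names)))
```

The same helper would be called once for the training annotations and once for the validation annotations, with the validation call skipped when no validation dataset is configured.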

Tests (tests/test_on_train_start.py)

  • Comprehensive test suite including:
    • Test that training images are logged correctly
    • Test that validation images are logged correctly
    • Test with multiple loggers
    • Test with empty annotations (edge case)
    • Test that correct number of images are sampled
    • Test without any loggers (should not crash)
    • Test that visualize.plot_annotations is called correctly
    • Test with no validation dataset
    • Test that parent class behavior is preserved

Implementation Details

  • The hook checks for non-empty annotations before sampling
  • Images are saved to a temporary directory
  • Each image is logged with metadata including filename, context (train/val), and current step
  • Compatible with all loggers that have a log_image method
  • Gracefully handles edge cases (empty datasets, no logger, etc.)
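A rough sketch of the logging behaviour listed above, assuming a duck-typed logger interface. The function name log_images_to_loggers, the render callback (standing in for the annotated-image step), and the log_image(path, metadata=...) call signature are hypothetical; real logger APIs differ in their arguments, so this shows only the control flow: temporary directory, per-image metadata, and skipping loggers without a log_image method.

```python
import os
import tempfile


def log_images_to_loggers(loggers, image_names, context, step, render):
    """Render each image into a temp dir and send it to capable loggers.

    render(name, out_dir) is a hypothetical callback that writes an
    annotated image and returns its path. Returns the number of
    log_image calls made, which is 0 when no logger is attached.
    """
    calls = 0
    # The directory and its files are removed when the block exits.
    with tempfile.TemporaryDirectory() as tmpdir:
        for name in image_names:
            path = render(name, tmpdir)
            metadata = {"filename": name, "context": context, "step": step}
            for logger in loggers:
                # Skip loggers that do not expose a log_image method.
                log_image = getattr(logger, "log_image", None)
                if not callable(log_image):
                    continue
                # Assumed signature; adapt to the actual logger in use.
                log_image(path, metadata=metadata)
                calls += 1
    return calls
```

Because the loop body never runs for an empty logger list or an empty sample, the "no logger" and "empty dataset" cases fall through without special handling.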

Testing

Run the new tests with:

pytest tests/test_on_train_start.py

The implementation will automatically log images when training starts. Users will see sample images in their experiment tracking tool of choice.

Fixes #1064

Collaborator

jveitchmichaelis commented Jun 10, 2025

This is the one from Cursor? It does have some "tells" - for some reason Claude seems to really like mocking in unit tests (and it loves to make long test suites). I had to make a rule to discourage that.

Can the bot respond to PR comments on GitHub? It would be great if we could give feedback or review issues and have it fix things automatically.

@bw4sz bw4sz self-assigned this Jun 12, 2025
@bw4sz bw4sz force-pushed the fix-1064-auto-log-images-with-tests-1749496102 branch from ce5c09a to 22276eb Compare June 12, 2025 19:20
Collaborator Author

bw4sz commented Jun 17, 2025

Waiting on #1079 to see if there are rebase conflicts.

Contributor

@henrykironde henrykironde left a comment

Looks good, but the tests could use some changes.

@bw4sz bw4sz force-pushed the fix-1064-auto-log-images-with-tests-1749496102 branch from 6dd87ba to 16ce0a4 Compare June 28, 2025 02:37
Collaborator Author

bw4sz commented Jun 28, 2025

@henrykironde this should be ready, passes locally.

Development

Successfully merging this pull request may close these issues.

Auto log a few images from train and val dataset to self.log
3 participants