Integrating Image Tiling into Training #19379

RyanDoesMath · 2025-02-22T17:19:24Z

I wanted to start a discussion here about how I could implement an experiment I want to try for small object detection.
Currently, the best solution for most small object detection tasks is:

1. Perform image tiling on a dataset.
2. Train a model on the tiles.
3. At inference, tile the input image.
4. Run the model on each tile.
5. Perform NMS on the reassembled detections.
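The inference half of this pipeline (tile, detect per tile, shift back to full-image coordinates, NMS) can be sketched in plain Python. This is a minimal illustration, not the Ultralytics or SAHI implementation: `tile_windows`, `nms`, `tiled_detect`, and the `detect_fn` callback are hypothetical names, the 640-pixel tile size and 20% overlap are assumed defaults, and the NMS is a simple greedy, class-agnostic variant.

```python
def tile_windows(width, height, tile=640, overlap=0.2):
    """Return (x0, y0, x1, y1) windows that cover the image with overlapping tiles."""
    stride = max(1, int(tile * (1 - overlap)))

    def starts(size):
        if size <= tile:
            return [0]
        s = list(range(0, size - tile, stride))
        s.append(size - tile)  # last tile sits flush with the image border
        return s

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1, ...) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(dets, iou_thr=0.5):
    """Greedy class-agnostic NMS over (x0, y0, x1, y1, score) tuples."""
    kept = []
    for d in sorted(dets, key=lambda d: d[4], reverse=True):
        if all(iou(d, k) <= iou_thr for k in kept):
            kept.append(d)
    return kept


def tiled_detect(width, height, detect_fn, tile=640, overlap=0.2, iou_thr=0.5):
    """Run detect_fn on every tile window and merge the results.

    detect_fn (hypothetical) receives a window and returns tile-local
    (x0, y0, x1, y1, score) boxes; in practice it would crop the image
    to the window and run the trained model on the crop.
    """
    merged = []
    for (x0, y0, x1, y1) in tile_windows(width, height, tile, overlap):
        for (bx0, by0, bx1, by1, score) in detect_fn((x0, y0, x1, y1)):
            # Shift tile-local coordinates back into the full image.
            merged.append((bx0 + x0, by0 + y0, bx1 + x0, by1 + y0, score))
    return nms(merged, iou_thr)
```

Objects that fall in the overlap between two tiles are detected twice, which is why the final NMS step is needed after reassembly.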

I had a professor who claimed that this kind of ad-hoc training is usually worse than an integrated training approach because the weights can be updated based on the whole process. To test this, I wanted to instead try the following:

1. Leave the images in the dataset whole.
2. Lower batch size from 16 to 4.
3. During training, include image tiling as a preprocessing step just before the forward pass.
4. Perform a forward pass on every tile.
5. Reassemble using NMS.
6. Compute loss based on the reassembled detections.
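To see why the batch size has to drop, it helps to count how many tile-level forward passes a batch of whole images turns into. The sketch below is only arithmetic under assumed values: the 1920x1080 image size, 640-pixel tiles, and 20% overlap are illustrative choices, and `tiles_per_axis` and `effective_batch` are hypothetical helper names.

```python
import math


def tiles_per_axis(size, tile, overlap):
    """Number of overlapping tiles needed to cover one image axis."""
    if size <= tile:
        return 1
    stride = max(1, int(tile * (1 - overlap)))
    return math.ceil((size - tile) / stride) + 1


def effective_batch(batch_images, width, height, tile=640, overlap=0.2):
    """Number of tiles the model actually sees for one batch of whole images."""
    per_image = tiles_per_axis(width, tile, overlap) * tiles_per_axis(height, tile, overlap)
    return batch_images * per_image


# Assumed example: 1920x1080 images need 4 x 2 = 8 tiles each.
print(effective_batch(4, 1920, 1080))   # 32 tiles per step
print(effective_batch(16, 1920, 1080))  # 128 tiles per step
```

Under these assumptions, a batch of 4 whole images already costs as much per step as a batch of 32 pre-tiled images, so activation memory scales with the tile count rather than the nominal batch size.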

In theory this should balloon training time, especially since I don't know how to do the tiling on the GPU. However, I am curious whether it would increase accuracy or make little difference.

Any ideas on how this could be implemented? My first thought was to create a copy of YOLOv11 as an nn.Module class, make my modifications to the process there, and supply it to the Ultralytics training as the model in place of the normal YOLOv11 model. Very open to suggestions.

UltralyticsAssistant · 2025-02-22T17:19:47Z

Maintainer

👋 Hello @RyanDoesMath, thank you for your interest in Ultralytics 🚀! This is an intriguing experiment for small object detection, and we appreciate you sharing your innovative approach here. For integrating image tiling into training, you're certainly delving into a complex but fascinating area of research.

We recommend checking out the Ultralytics Docs for valuable insights into the model and how various customization options can be implemented. Specifically, you might find the Python Usage Examples helpful as a starting point for modifying the training process. The Tips for Best Training Results could also provide some guidance on optimizing your experimental setup.

If you're aiming to modify the YOLO11 architecture or preprocessing pipeline, forking the repository and creating a custom nn.Module model as you suggested does seem like a sound starting point. The source code in this repository is modular and designed to support such customizations.

If this relates to a 🐛 bug or unexpected issues while implementing your changes, please provide a minimum reproducible example to help us better understand the situation and assist more effectively.

Additionally, you might benefit from engaging with the Ultralytics community for brainstorming and input. Feel free to join our:

Discord 🎧 for dynamic discussions
Discourse Forum for deeper topic dives
Subreddit for knowledge sharing

Upgrade

Before beginning your implementation, ensure you're using the latest version of the ultralytics package. An updated environment can help you avoid issues that may already have been resolved:

pip install -U ultralytics

Environments

If you're testing your implementation in a specific environment (e.g., notebooks or cloud settings), ensure all dependencies are preinstalled and up-to-date. The following environments are verified for smooth integration:

Notebooks:
Google Cloud: GCP Quickstart Guide
AWS: AWS Quickstart Guide
Docker: Docker Quickstart Guide

Status

This badge reflects the current state of Ultralytics CI testing, ensuring all YOLO Modes and Tasks pass successfully on macOS, Windows, and Ubuntu. If the badge is green, things should be good to go!

This is an automated response, but an Ultralytics engineer will review your discussion soon and provide further guidance 😊. Happy coding and experimenting!


glenn-jocher · 2025-02-22T20:38:25Z

Maintainer

@RyanDoesMath for implementing integrated tiling in YOLO11 training, we recommend extending the DetectionModel class to add custom tiling logic during preprocessing. You'll need to modify the forward pass to process tiles and aggregate results before loss calculation.

The SAHI Tiled Inference guide demonstrates inference-time tiling integration that could be adapted for training. For training modifications, consider adjusting the augment pipeline in the dataloader or creating a custom training loop with mosaic augmentation disabled.

While this approach could theoretically improve small object recognition, it would significantly increase memory consumption and training time. We suggest starting with our existing small-object detection recommendations from the Hyperparameter Tuning guide first, then experimenting with custom tiling implementations.

1 reply

robmarkcole May 12, 2025

I can't find the referenced existing small-object detection recommendations.

rygall · 2025-02-27T16:36:12Z


Would love to see what you come up with on this. I am looking into the same thing right now. I have very large image files that I need to tile, otherwise the gradients during training consume far too much memory.

4 replies

glenn-jocher Feb 28, 2025
Maintainer

@rygall to implement image tiling during training with YOLO11, consider modifying the Dataset class to generate tiles on-the-fly while preserving original annotations. We recommend exploring SAHI's slice_image function for tile generation and integrating it into your data pipeline. For inference, see our SAHI Tiled Inference guide which demonstrates tile processing and NMS merging.

This approach would require custom dataset preprocessing rather than model architecture changes. For memory efficiency, use imgsz=640 (or lower) and batch=4 as you suggested. Let us know if you achieve interesting results!
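The annotation side of on-the-fly tiling can be sketched as follows: for each tile window, keep the boxes that overlap it, clip them to the window, and re-express them in tile-local coordinates. This is a hedged illustration in plain Python rather than SAHI's `slice_image` or an Ultralytics Dataset subclass; `boxes_for_tile`, `to_yolo`, and the `min_visible` threshold (which drops boxes that are mostly cut off by the tile border) are hypothetical.

```python
def boxes_for_tile(boxes, window, min_visible=0.3):
    """Map full-image boxes into one tile.

    boxes:  list of (cls, x0, y0, x1, y1) in full-image pixel coordinates.
    window: (x0, y0, x1, y1) tile window in full-image pixel coordinates.
    Returns (cls, x0, y0, x1, y1) boxes in tile-local pixel coordinates.
    """
    wx0, wy0, wx1, wy1 = window
    out = []
    for cls, x0, y0, x1, y1 in boxes:
        # Clip the box to the tile window.
        cx0, cy0 = max(x0, wx0), max(y0, wy0)
        cx1, cy1 = min(x1, wx1), min(y1, wy1)
        if cx1 <= cx0 or cy1 <= cy0:
            continue  # no overlap with this tile
        visible = (cx1 - cx0) * (cy1 - cy0) / ((x1 - x0) * (y1 - y0))
        if visible < min_visible:
            continue  # assumed policy: skip boxes that are mostly outside the tile
        out.append((cls, cx0 - wx0, cy0 - wy0, cx1 - wx0, cy1 - wy0))
    return out


def to_yolo(box, tile_w, tile_h):
    """Convert a tile-local pixel box to normalized (cls, cx, cy, w, h)."""
    cls, x0, y0, x1, y1 = box
    return (cls,
            (x0 + x1) / 2 / tile_w,
            (y0 + y1) / 2 / tile_h,
            (x1 - x0) / tile_w,
            (y1 - y0) / tile_h)


# Usage with made-up annotations on a 1000x640 image split into two tiles.
annotations = [(0, 600, 100, 700, 200), (1, 10, 10, 50, 50)]
left = boxes_for_tile(annotations, (0, 0, 640, 640))
right = boxes_for_tile(annotations, (360, 0, 1000, 640))
```

A custom dataset would call something like this once per tile inside its item-loading method, so the stored images and labels stay whole and only the tiles handed to the model change.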

rygall Mar 5, 2025

This makes a ton of sense! Thank you.

glenn-jocher Mar 6, 2025
Maintainer

You're welcome! For additional insights on optimizing small object detection through tiling, see our Model Evaluation Insights guide which covers annotation adjustments and performance considerations. We'd love to hear about your results! 🚀

p-prakash Oct 12, 2025

@rygall Were you able to make it work? Do you have any example or sample code?

@glenn-jocher, can you please point me to an example where the Dataset class has been modified (even if it's for some other scenario)?
