Training instability #13608

Open
1 task done
Taoboan1999 opened this issue May 27, 2025 · 7 comments
Labels
detect Object Detection issues, PR's question Further information is requested

Comments

@Taoboan1999

Search before asking

Question

I'm training a YOLOv12 model on my own dataset. On the validation set, mAP50 often follows a pattern like 0.67→0.68→0.69→0.70→0.71→0.68→0.67→0.67→0.68→0.67→0.67: it climbs to about 0.71, then falls back and hovers around 0.67.
I used the maximum batch size (batchsize=8) and the maximum image size, tried reducing the learning rate (lr0=0.001→0.0001→0.00001), and increased the data augmentation parameters (mosaic).

I can't change the data in my training and validation sets. How can I fix this?

Translated with DeepL.com (free version)

Additional

No response

@Taoboan1999 Taoboan1999 added the question Further information is requested label May 27, 2025
@UltralyticsAssistant UltralyticsAssistant added the detect Object Detection issues, PR's label May 27, 2025
@UltralyticsAssistant
Member

👋 Hello @Taoboan1999, thank you for reaching out and for your detailed description! 🚀 This is an automated response to help you get started, and an Ultralytics engineer will assist you soon.

Please visit our ⭐️ Tutorials for guidance, including quickstart guides for Custom Data Training and tips for Best Training Results.

If this is a 🐛 Bug Report, please provide a minimum reproducible example (MRE) to help us debug more effectively.

If your question is about custom training, please include as much information as possible—such as dataset samples, training logs, and the exact commands you are using.

Requirements

Python>=3.8.0 with all requirements.txt installed including PyTorch>=1.8. To get started:

git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install

Environments

YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):

Status

YOLOv5 CI
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.

Thank you for your patience! An Ultralytics engineer will review your issue and provide more specific guidance soon. 😊

@pderrenger
Member

Hi @Taoboan1999! The oscillating validation mAP50 you describe (0.67 to 0.71) is common and can usually be reduced with a few adjustments. Try cosine learning-rate scheduling with --cos-lr, lower the momentum from its default of 0.937 to 0.9, and use --patience for early stopping to limit overfitting. Note that momentum is not a train.py command-line flag in YOLOv5: it is set in the hyperparameter YAML file passed with --hyp. Exponential Moving Average (EMA), which smooths model weights during training, also tends to give more stable convergence.
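
To make the scheduling and early-stopping suggestions concrete, here is a minimal Python sketch, not YOLOv5's actual code. It assumes the cosine schedule scales lr0 from a factor of 1.0 down to a final factor lrf (0.01 is an assumed value), and that early stopping triggers once `patience` epochs pass without a new best score. The function names are hypothetical, the mAP50 values are the ones reported in the question (assumed here to be consecutive epochs), and YOLOv5 itself tracks a combined fitness score rather than mAP50 alone.

```python
import math


def cosine_lr_factor(epoch, epochs, lrf=0.01):
    """Multiplier applied to the initial learning rate lr0 at a given epoch.

    Decays from 1.0 at epoch 0 to lrf at the final epoch along a half cosine.
    """
    return ((1 - math.cos(epoch * math.pi / epochs)) / 2) * (lrf - 1.0) + 1.0


def early_stop_epoch(scores, patience):
    """Return the epoch index at which training would stop, or None.

    Training stops once `patience` epochs have passed without a new best score.
    """
    best_score, best_epoch = float("-inf"), 0
    for epoch, score in enumerate(scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
        if epoch - best_epoch >= patience:
            return epoch
    return None


# mAP50 values reported in the question (assumed: one value per epoch)
map50 = [0.67, 0.68, 0.69, 0.70, 0.71, 0.68, 0.67, 0.67, 0.68, 0.67, 0.67]

best = max(map50)
stop = early_stop_epoch(map50, patience=5)
print(f"best mAP50 = {best} at index {map50.index(best)}, stop at index {stop}")

# Learning-rate multiplier at the start, middle, and end of a 100-epoch run
for epoch in (0, 50, 100):
    print(epoch, round(cosine_lr_factor(epoch, epochs=100), 4))
```

With a short patience the run stops a few epochs after the peak and the best checkpoint is kept. With a patience longer than the run, the function returns None and training continues to the last epoch.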

@Taoboan1999
Author

Thank you very much for your patience. I will try the methods you mentioned one by one, and if any of them prove effective, I will post the results.

@Taoboan1999
Author

How do I turn on EMA in YOLOv12 training?

@pderrenger
Member

Hi @Taoboan1999! EMA is enabled by default in YOLOv5 training. As far as I can tell, train.py does not expose --ema or --ema-decay command-line flags: the EMA model is created in train.py through ModelEMA (defined in utils/torch_utils.py), whose decay argument defaults to 0.9999. To experiment with a different decay rate, you would pass it where ModelEMA is constructed, for example decay=0.999. A smaller decay makes the average follow recent weights more closely, at the cost of less smoothing.
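
As a rough illustration of what the decay rate controls, the sketch below applies an exponential moving average to a single scalar "weight" in plain Python. It is not YOLOv5's implementation. The warm-up ramp `decay * (1 - exp(-updates / tau))` with tau=2000 reflects my understanding of ModelEMA and should be treated as an assumption, and the function names and the noisy data are hypothetical.

```python
import math
import random


def ema_decay(updates, decay=0.9999, tau=2000):
    """Effective decay after `updates` steps; ramps from 0 toward `decay`."""
    return decay * (1 - math.exp(-updates / tau))


def ema_track(values, decay=0.9999, tau=2000):
    """Return the EMA trajectory of a scalar 'weight' over training steps."""
    ema = values[0]
    history = []
    for step, value in enumerate(values, start=1):
        d = ema_decay(step, decay, tau)
        ema = d * ema + (1 - d) * value  # blend old average with the new value
        history.append(ema)
    return history


random.seed(0)
# Hypothetical noisy weight: a constant 1.0 plus Gaussian noise at each step
raw = [1.0 + random.gauss(0, 0.1) for _ in range(40000)]
raw_spread = max(raw[-1000:]) - min(raw[-1000:])
print(f"raw: spread over last 1000 steps = {raw_spread:.4f}")

for decay in (0.9999, 0.999):
    tail = ema_track(raw, decay=decay)[-1000:]
    spread = max(tail) - min(tail)
    print(f"decay={decay}: spread of EMA over last 1000 steps = {spread:.4f}")
```

Because of the warm-up ramp, the two decay settings behave almost identically for the first few thousand updates. The difference only shows up later in training, where the larger decay averages over a longer window of past weights.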

@Taoboan1999
Author

Thank you again for your answer. I will try modifying this parameter when training YOLOv5, but I am still encountering overfitting issues in YOLOv12. I will ask a question in the YOLOv12 issues section. Thank you.

@pderrenger
Member

You're welcome @Taoboan1999! Just to clarify, this is the YOLOv5 repository, so for YOLOv12 (YOLO12) questions you'll want to head over to the main Ultralytics repository where the newer YOLO versions are maintained. Good luck with your training!
