Description
One of the core challenges of machine learning for airborne biodiversity observation is generalizing across sensors and acquisition conditions. Because training datasets are small, data augmentations are crucial for good generalization across resolutions, focal views, and object sizes.
A quick search of albumentations turns up a few existing classes that should be useful:
https://huggingface.co/spaces/qubvel-hf/albumentations-demo?transform=Downscale
https://huggingface.co/spaces/qubvel-hf/albumentations-demo?transform=RandomSizedBBoxSafeCrop
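For readers who have not used the second transform, here is a rough pure-Python sketch of the idea behind a bbox-safe crop: choose a random window that never cuts into any annotated box. This is not albumentations' implementation; the function name `bbox_safe_crop` and the pixel-coordinate box format `(xmin, ymin, xmax, ymax)` are assumptions made for illustration, and the real transform also resizes the crop to a target size.

```python
import random


def bbox_safe_crop(width, height, boxes, rng=None):
    """Pick a random crop window that contains every box (hypothetical helper).

    boxes are (xmin, ymin, xmax, ymax) integer pixel coordinates.
    Returns the crop window and the boxes shifted into crop coordinates.
    """
    rng = rng or random.Random()
    if not boxes:
        # Nothing to protect: keep the full image.
        return (0, 0, width, height), []

    # Union of all boxes: the crop must never cut into this region.
    union_x0 = min(b[0] for b in boxes)
    union_y0 = min(b[1] for b in boxes)
    union_x1 = max(b[2] for b in boxes)
    union_y1 = max(b[3] for b in boxes)

    # Each crop edge is sampled between the image border and the union.
    left = rng.randint(0, union_x0)
    top = rng.randint(0, union_y0)
    right = rng.randint(union_x1, width)
    bottom = rng.randint(union_y1, height)

    # Re-express the boxes relative to the crop's top-left corner.
    shifted = [
        (x0 - left, y0 - top, x1 - left, y1 - top)
        for x0, y0, x1, y1 in boxes
    ]
    return (left, top, right, bottom), shifted
```

Because the window size varies from sample to sample, resizing each crop back to a fixed size changes the apparent object size, which is the property of interest here.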
Checklist
- Implement augmentations in their own module, not within preprocessing. They currently live inline in DeepForest/deepforest/dataset.py (line 35 in 80cb7d8).
- Allow the user to choose the augmentations through the config file. Take care that current default behavior remains unchanged, and set reasonable defaults when existing config files do not specify augmentations.
- Make a doc page showing example augmentations
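One possible shape for the config handling in the checklist above, as a minimal sketch rather than a proposed final design. Everything named here is hypothetical: the `augmentations` config key, the `resolve_augmentations` function, the registry contents, and the parameter values. The assumed default of a single horizontal flip is a placeholder and should be checked against what dataset.py currently does. The function only resolves names and parameters; building the actual albumentations pipeline from the result is left out.

```python
from copy import deepcopy

# Hypothetical registry: names a config may use, mapped to default parameters.
AVAILABLE_AUGMENTATIONS = {
    "HorizontalFlip": {"p": 0.5},
    "Downscale": {"p": 0.5},
    "RandomSizedBBoxSafeCrop": {"height": 400, "width": 400, "p": 0.5},
}

# Placeholder for the current behavior; used when the config has no
# "augmentations" key, so existing config files keep working unchanged.
DEFAULT_AUGMENTATIONS = ["HorizontalFlip"]


def resolve_augmentations(config):
    """Return a list of (name, params) pairs selected by the config."""
    requested = config.get("augmentations")
    if requested is None:
        requested = DEFAULT_AUGMENTATIONS

    resolved = []
    for entry in requested:
        # Accept either "Name" or {"Name": {parameter overrides}}.
        if isinstance(entry, str):
            name, overrides = entry, {}
        else:
            if len(entry) != 1:
                raise ValueError(f"Expected one augmentation per entry, got {entry!r}")
            name, overrides = next(iter(entry.items()))

        if name not in AVAILABLE_AUGMENTATIONS:
            raise ValueError(f"Unknown augmentation: {name!r}")

        # Copy so user overrides never mutate the registry defaults.
        params = deepcopy(AVAILABLE_AUGMENTATIONS[name])
        params.update(overrides or {})
        resolved.append((name, params))
    return resolved
```

With this layout, a missing key falls back to the defaults, while an explicit empty list turns augmentation off, so the two cases stay distinguishable.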
Optional
- Compare training with and without augmentations when predicting across resolutions.