Commit 0f5c457

gitttt-1234claude and claude committed
Update default trainer configuration parameters
This commit updates the default values for the learning rate scheduler and optimizer configurations to improve training performance and stability:

**Configuration Changes:**

- `OptimizerConfig`:
  - Learning rate: `1e-3` → `1e-4`
- `ReduceLROnPlateauConfig`:
  - `threshold_mode`: `"rel"` → `"abs"`
  - `threshold`: `1e-4` → `1e-6`
  - `patience`: `10` → `5`
  - `factor`: `0.1` → `0.5`
  - `cooldown`: `0` → `3`
  - `min_lr`: `0.0` → `1e-8`
- `EarlyStoppingConfig`:
  - `min_delta`: `0.0` → `1e-8`
  - `patience`: `1` → `10`
- `LRSchedulerConfig`:
  - Now defaults to `ReduceLROnPlateauConfig` instead of `None`

**Files Updated:**

- Updated `sleap_nn/config/trainer_config.py` with new defaults and documentation
- Updated all sample config files in `docs/sample_configs/`
- Updated test config files in `tests/assets/model_ckpts/`
- Updated configuration documentation in `docs/config.md`
- Updated test assertions in `tests/config/test_trainer_config.py`

The new defaults provide:

- More conservative learning rate scheduling
- Better threshold sensitivity with absolute mode
- Improved early stopping behavior
- More stable training convergence

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent bdd853f commit 0f5c457

26 files changed: +69 −68 lines
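The headline change is the switch of `threshold_mode` from `"rel"` to `"abs"`. The difference matters at small loss values: in relative mode the required improvement shrinks with the loss itself, while in absolute mode it stays fixed. A minimal pure-Python sketch of the improvement test for a minimized metric (the function name is illustrative; the formulas mirror the min-mode semantics of PyTorch's `ReduceLROnPlateau` as described in `docs/config.md`):

```python
def is_improvement(current, best, threshold, threshold_mode):
    """Improvement test for a minimized metric (e.g. validation loss)."""
    if threshold_mode == "rel":
        # dynamic_threshold = best * (1 - threshold) in min mode
        return current < best * (1.0 - threshold)
    elif threshold_mode == "abs":
        # dynamic_threshold = best - threshold in min mode
        return current < best - threshold
    raise ValueError(f"unknown threshold_mode: {threshold_mode!r}")

# At a loss of ~1e-3, the old relative threshold (1e-4) only demanded a
# ~1e-7 drop, so almost any jitter counted as progress and the scheduler
# rarely detected a plateau. The new absolute threshold (1e-6) requires a
# fixed-size drop. A drop of 5e-7 illustrates the difference:
rel_says = is_improvement(0.0009995, 0.001, threshold=1e-4, threshold_mode="rel")
abs_says = is_improvement(0.0009995, 0.001, threshold=1e-6, threshold_mode="abs")
# rel_says is True; abs_says is False
```

This is why `threshold` also moves from `1e-4` to `1e-6`: a relative `1e-4` and an absolute `1e-6` are comparable in magnitude only near a loss of ~1e-2, and the absolute form behaves consistently below that.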

docs/config.md

Lines changed: 11 additions & 11 deletions
```diff
@@ -123,7 +123,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-06
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
@@ -739,7 +739,7 @@ trainer_config:
 ### Optimizer Configuration
 - `optimizer_name`: (str) Optimizer to be used. One of ["Adam", "AdamW"]. **Default**: `"Adam"`
 - `optimizer`:
-  - `lr`: (float) Learning rate of type float. **Default**: `1e-3`
+  - `lr`: (float) Learning rate of type float. **Default**: `1e-4`
   - `amsgrad`: (bool) Enable AMSGrad with the optimizer. **Default**: `False`
 
 ### Learning Rate Schedulers
@@ -752,12 +752,12 @@ trainer_config:
 
 #### Reduce LR on Plateau
 - `lr_scheduler.reduce_lr_on_plateau`:
-  - `threshold`: (float) Threshold for measuring the new optimum, to only focus on significant changes. **Default**: `1e-4`
-  - `threshold_mode`: (str) One of "rel", "abs". In rel mode, dynamic_threshold = best * ( 1 + threshold ) in max mode or best * ( 1 - threshold ) in min mode. In abs mode, dynamic_threshold = best + threshold in max mode or best - threshold in min mode. **Default**: `"rel"`
-  - `cooldown`: (int) Number of epochs to wait before resuming normal operation after lr has been reduced. **Default**: `0`
-  - `patience`: (int) Number of epochs with no improvement after which learning rate will be reduced. For example, if patience = 2, then we will ignore the first 2 epochs with no improvement, and will only decrease the LR after the third epoch if the loss still hasn't improved then. **Default**: `10`
-  - `factor`: (float) Factor by which the learning rate will be reduced. new_lr = lr * factor. **Default**: `0.1`
-  - `min_lr`: (float or List[float]) A scalar or a list of scalars. A lower bound on the learning rate of all param groups or each group respectively. **Default**: `0.0`
+  - `threshold`: (float) Threshold for measuring the new optimum, to only focus on significant changes. **Default**: `1e-6`
+  - `threshold_mode`: (str) One of "rel", "abs". In rel mode, dynamic_threshold = best * ( 1 + threshold ) in max mode or best * ( 1 - threshold ) in min mode. In abs mode, dynamic_threshold = best + threshold in max mode or best - threshold in min mode. **Default**: `"abs"`
+  - `cooldown`: (int) Number of epochs to wait before resuming normal operation after lr has been reduced. **Default**: `3`
+  - `patience`: (int) Number of epochs with no improvement after which learning rate will be reduced. For example, if patience = 2, then we will ignore the first 2 epochs with no improvement, and will only decrease the LR after the third epoch if the loss still hasn't improved then. **Default**: `5`
+  - `factor`: (float) Factor by which the learning rate will be reduced. new_lr = lr * factor. **Default**: `0.5`
+  - `min_lr`: (float or List[float]) A scalar or a list of scalars. A lower bound on the learning rate of all param groups or each group respectively. **Default**: `1e-8`
 
 **Example Learning Rate Scheduler configurations:**
 
@@ -786,7 +786,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1e-6
-      threshold_mode: "rel"
+      threshold_mode: "abs"
       cooldown: 3
       patience: 5
       factor: 0.5
@@ -796,8 +796,8 @@ trainer_config:
 ### Early Stopping
 - `early_stopping`:
   - `stop_training_on_plateau`: (bool) True if early stopping should be enabled. **Default**: `False`
-  - `min_delta`: (float) Minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than or equal to min_delta, will count as no improvement. **Default**: `0.0`
-  - `patience`: (int) Number of checks with no improvement after which training will be stopped. Under the default configuration, one check happens after every training epoch. **Default**: `1`
+  - `min_delta`: (float) Minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than or equal to min_delta, will count as no improvement. **Default**: `1e-8`
+  - `patience`: (int) Number of checks with no improvement after which training will be stopped. Under the default configuration, one check happens after every training epoch. **Default**: `10`
 
 ### Online Hard Keypoint Mining (OHKM)
 - `online_hard_keypoint_mining`:
```
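The new early-stopping defaults (`min_delta: 1e-8`, `patience: 10`) are far more tolerant than the old ones (`0.0`, `1`), which could halt training after a single stagnant epoch. A hypothetical helper sketching the rule above (one check per training epoch; names are illustrative, not the sleap_nn API):

```python
def stopping_epoch(val_losses, min_delta=1e-8, patience=10):
    """Return the 0-indexed epoch at which early stopping would trigger,
    or None if training runs to completion. A change <= min_delta counts
    as no improvement, per the early_stopping docs above."""
    best = float("inf")
    checks_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if best - loss > min_delta:  # strict improvement beyond min_delta
            best = loss
            checks_without_improvement = 0
        else:
            checks_without_improvement += 1
            if checks_without_improvement >= patience:
                return epoch
    return None

# Three improving epochs, then a flat plateau:
losses = [0.5, 0.4, 0.3] + [0.3] * 20
# New defaults: the 10th stagnant check lands at epoch 12.
# Old defaults (min_delta=0.0, patience=1): stops at epoch 3, the very
# first epoch without improvement.
```

Under the old defaults a single noisy validation epoch was enough to end the run; the new defaults require ten consecutive stagnant checks.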

docs/sample_configs/config_bottomup_convnext.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -122,7 +122,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-06
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
```

docs/sample_configs/config_bottomup_unet_large_rf.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -133,7 +133,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-08
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 8
       factor: 0.5
```

docs/sample_configs/config_bottomup_unet_medium_rf.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -133,7 +133,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-08
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 8
       factor: 0.5
```

docs/sample_configs/config_centroid_swint.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -126,7 +126,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-06
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
```

docs/sample_configs/config_centroid_unet.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -127,7 +127,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-08
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
```

docs/sample_configs/config_multi_class_bottomup_unet.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -122,7 +122,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-06
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
```

docs/sample_configs/config_single_instance_unet_large_rf.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -127,7 +127,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-05
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
```

docs/sample_configs/config_single_instance_unet_medium_rf.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -127,7 +127,7 @@ trainer_config:
     step_lr: null
     reduce_lr_on_plateau:
       threshold: 1.0e-08
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
```

docs/sample_configs/config_topdown_centered_instance_unet_large_rf.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -129,7 +129,7 @@ trainer_config:
     step_lr: null
    reduce_lr_on_plateau:
       threshold: 1.0e-08
-      threshold_mode: rel
+      threshold_mode: abs
       cooldown: 3
       patience: 5
       factor: 0.5
```
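Taken together, the new optimizer and scheduler defaults bound the learning-rate trajectory: training starts at `1e-4`, each detected plateau halves the LR (`factor: 0.5`), and `min_lr: 1e-8` clamps the floor. A quick sketch of how many reductions the schedule permits (ignoring the epoch spacing imposed by `patience` and `cooldown`):

```python
# New defaults: lr 1e-4, factor 0.5, min_lr 1e-8.
lr, min_lr, factor = 1e-4, 1e-8, 0.5
reductions = 0
while lr > min_lr:
    lr = max(lr * factor, min_lr)  # same clamping rule as new_lr = lr * factor, floored at min_lr
    reductions += 1
# 14 halvings fit between 1e-4 and the 1e-8 floor; the 14th is clamped.
```

With `patience: 5` and `cooldown: 3`, consecutive reductions are at least 8 epochs apart, so the floor is only reachable in long runs; under the old defaults (`factor: 0.1`, `min_lr: 0.0`) the LR collapsed by 10x per plateau with no floor at all.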
