# Optimizers Validation

## Overview
- A minimal example that trains on either random data or the MNIST dataset to quickly compare optimizers.
- Assumes an FP32-only build/runtime.
- Merges the functionality of the previously independent Random and MNIST optimizer applications.

## Build
1. From the repository root:
```bash
meson setup build -Dbuildtype=release -Denable-app=true
meson compile -C build -j"$(nproc)"
```

## Run Examples

**With Random Data**

The runs below share the same base configuration; only the optimizer (and, where noted, the weight decay or learning rate) changes:

```bash
cd build/Applications/Optimizers/jni

# Lion (default, with varied weight decay)
./nntrainer_optimizers --dataset=random --opt=lion --wd=0 --bs=16 --db=32 --epochs=5 --lr=0.001
./nntrainer_optimizers --dataset=random --opt=lion --wd=0.01 --bs=16 --db=32 --epochs=5 --lr=0.001

# Adam
./nntrainer_optimizers --dataset=random --opt=adam --bs=16 --db=32 --epochs=5 --lr=0.001

# AdamW
./nntrainer_optimizers --dataset=random --opt=adamw --bs=16 --db=32 --epochs=5 --lr=0.001

# SGD (varied learning rate)
./nntrainer_optimizers --dataset=random --opt=sgd --bs=16 --db=32 --epochs=5 --lr=0.001
./nntrainer_optimizers --dataset=random --opt=sgd --bs=16 --db=32 --epochs=5 --lr=0.0005
```
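
To sweep all four optimizers in one pass, a small shell loop works as a convenience; this is a sketch using only the flags documented in the Options section below, with `--wd` passed only to Lion/AdamW since it is documented for those two:

```bash
# Sketch: run every optimizer with identical settings on random data.
for opt in lion adam adamw sgd; do
  wd_flag=""
  # --wd is documented for Lion/AdamW only, so set it just for those.
  case "$opt" in
    lion|adamw) wd_flag="--wd=0.01" ;;
  esac
  ./nntrainer_optimizers --dataset=random --opt="$opt" $wd_flag \
    --bs=16 --db=32 --epochs=5 --lr=0.001
done
```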

**With MNIST Data**

Compare optimizers under identical settings (requires the MNIST resources):

```bash
cd build/Applications/Optimizers/jni

./nntrainer_optimizers --dataset=mnist --opt=lion --lr=0.001 --wd=0.01 --epochs=100 --bs=32
./nntrainer_optimizers --dataset=mnist --opt=adam --lr=0.001 --epochs=100 --bs=32
./nntrainer_optimizers --dataset=mnist --opt=adamw --lr=0.001 --wd=0.01 --epochs=100 --bs=32
./nntrainer_optimizers --dataset=mnist --opt=sgd --lr=0.001 --epochs=100 --bs=32
```
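
To keep the loss curves for a side-by-side comparison, the same runs can be captured to per-optimizer log files. The loop below is a sketch; the `mnist_<opt>.log` filenames are arbitrary, not produced by the app:

```bash
# Sketch: capture each optimizer's training log for later comparison.
for opt in lion adam adamw sgd; do
  wd_flag=""
  case "$opt" in
    lion|adamw) wd_flag="--wd=0.01" ;;
  esac
  ./nntrainer_optimizers --dataset=mnist --opt="$opt" $wd_flag \
    --lr=0.001 --epochs=100 --bs=32 2>&1 | tee "mnist_${opt}.log"
done
```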

Run from an arbitrary directory with explicit resource paths:

```bash
./build/Applications/Optimizers/jni/nntrainer_optimizers \
  --dataset=mnist \
  --config=Applications/MNIST/res/mnist.ini \
  --data=Applications/MNIST/res/mnist_trainingSet.dat \
  --opt=lion --lr=0.001 --wd=0.01 --epochs=100 --bs=32
```
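
When scripting long runs, it can help to fail fast if the resources are missing. A minimal guard, assuming the repository-root paths shown above:

```bash
# Sketch: verify the MNIST resources exist before launching a long run.
CONFIG=Applications/MNIST/res/mnist.ini
DATA=Applications/MNIST/res/mnist_trainingSet.dat
for f in "$CONFIG" "$DATA"; do
  [ -f "$f" ] || { echo "missing resource: $f" >&2; exit 1; }
done
./build/Applications/Optimizers/jni/nntrainer_optimizers \
  --dataset=mnist --config="$CONFIG" --data="$DATA" \
  --opt=lion --lr=0.001 --wd=0.01 --epochs=100 --bs=32
```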

## Options

**General**
- `--dataset=random|mnist` : dataset type (default: random)
- `--opt=lion|adam|adamw|sgd` : optimizer (default: lion)
- `--wd=<float>` : weight decay (used by Lion/AdamW)
- `--epochs=<int>` : number of epochs
- `--bs=<int>` : batch size
- `--lr=<float>` : learning rate

**Random Dataset Options**
- `--db=<int>` : number of batches (iterations) per epoch

**MNIST Dataset Options**
- `--config=<path>` : INI path (auto-discovered if not provided)
- `--data=<path>` : dataset path (auto-discovered if not provided)
- `--train_size=<uint>` : number of training samples (default: 100)
- `--val_size=<uint>` : number of validation samples (default: 100)

## Output
- Prints the L2 norm of the weights before and after training.
- Prints Delta L2, the magnitude of the weight updates (see the formula sketched below).
- Prints per-epoch training loss logs.
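
Assuming Delta L2 is the Euclidean norm of the overall parameter change (a natural reading of "magnitude of weight updates", not confirmed against the app's source), it would be:

```math
\Delta L_2 = \lVert \theta_{\text{after}} - \theta_{\text{before}} \rVert_2
           = \sqrt{\textstyle\sum_i \bigl(\theta_i^{\text{after}} - \theta_i^{\text{before}}\bigr)^2}
```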

## Notes
- Because the data in Random Mode is random, its loss curves are better suited to comparing update magnitude and consistency than convergence; the update rules sketched below help interpret those magnitudes.
- MNIST Mode exposes optimizer differences in the loss curves (convergence speed and quality) more clearly than random data does.
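
For reference when reading the Delta L2 numbers, the published update rules for Lion (Chen et al., 2023) and AdamW (Loshchilov & Hutter, 2019) are sketched below; nntrainer's implementation may differ in details such as bias correction, so treat these as textbook forms. With learning rate $\eta$, decoupled weight decay $\lambda$, and gradient $g_t$:

```math
\begin{aligned}
\textbf{Lion:}\quad
  & c_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t, \\
  & \theta_t = \theta_{t-1} - \eta\,\bigl(\operatorname{sign}(c_t) + \lambda\,\theta_{t-1}\bigr), \\
  & m_t = \beta_2 m_{t-1} + (1-\beta_2)\,g_t \\
\textbf{AdamW:}\quad
  & \theta_t = \theta_{t-1} - \eta\,\Bigl(\tfrac{\hat m_t}{\sqrt{\hat v_t} + \epsilon} + \lambda\,\theta_{t-1}\Bigr)
\end{aligned}
```

Because Lion updates with the sign of $c_t$, every element moves by $\pm\eta$ (plus the decay term) regardless of gradient scale, which is why its Delta L2 tends to be steadier across runs than Adam-family or SGD updates.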