BrainScript epochSize and Python epoch_size in CNTK
The number of samples (tensors along a dynamic axis) in each epoch. The epoch size in CNTK is the number of samples after which specific additional actions are taken, including
- saving a checkpoint model (training can be restarted from here)
- cross-validation
- learning-rate control
- minibatch-scaling
Note that the number of samples here is counted in the same way as for minibatchSize (minibatch_size). This means that for sequences, a sample is an individual item of a sequence, not the sequence itself; a minibatch containing two sequences of lengths 10 and 5, for example, counts as 15 samples.
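As an illustration of one of the actions listed above, learning-rate control, here is a minimal sketch in the Python API. The schedule values and the epoch size of 50000 are placeholder numbers, not recommendations:

```python
import cntk as C

# Sketch (hypothetical numbers): a learning-rate schedule whose steps are
# tied to epoch_size. The rate is 0.01 for the first 50000 samples, 0.001
# for the next 50000, and 0.0001 thereafter. Samples are counted per
# individual tensor (per sequence item), as described above.
lr_schedule = C.learning_rate_schedule([0.01, 0.001, 0.0001],
                                       C.UnitType.sample,
                                       epoch_size=50000)

# A schedule can be queried by global sample count:
print(lr_schedule[0])      # 0.01  (start of training)
print(lr_schedule[75000])  # 0.001 (within the second epoch)
```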
For small data sets, epochSize is often set equal to the dataset size. In BrainScript you can specify 0 to denote that; in Python you can specify cntk.io.INFINITELY_REPEAT for that. For large data sets, you may want to guide your choice of epochSize by checkpointing. For example, if you want to lose at most 30 minutes of computation in case of a power outage or network glitch, you would want a checkpoint to be created about every 30 minutes (from which training can be resumed). Choose epochSize to be the number of samples that takes about 30 minutes to compute.
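As a back-of-the-envelope sketch of that calculation in Python, assuming a hypothetical measured throughput of 1000 samples per second (measure your own on your hardware) and a placeholder checkpoint filename:

```python
from cntk.train import CheckpointConfig

# Hypothetical throughput figure; measure it on your own setup.
samples_per_second = 1000        # assumed observed training speed
checkpoint_interval = 30 * 60    # lose at most ~30 minutes of work

# epochSize / epoch_size: samples processed in roughly 30 minutes
epoch_size = samples_per_second * checkpoint_interval  # 1800000 samples

# In the Python API the same number can drive checkpointing, e.g. as the
# frequency (in samples) of a CheckpointConfig passed to a training
# session. 'model.ckpt' is a placeholder; the trainer and reader are
# assumed to be configured elsewhere.
checkpoint_config = CheckpointConfig(filename='model.ckpt',
                                     frequency=epoch_size,
                                     restore=True)
```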