Regnet training hparams #1613

Answered by rwightman
pamolchanov asked this question in Q&A

This is what I have on hparams for those RegNets and the other models trained in those sessions (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-tpu-weights). Four major sets of hparams were used there, two of them for the RegNets (ra3 and ch). The 'codes' used to identify them in the weight names are:

  • ra3 - rmsproptf + lots of aug
  • ch, c1 - sgd + grad clipping (either global norm or agc)
  • sw - swin / convnext AdamW based
  • ah, a1 - lamb + lots of aug (i.e. ResNet Strikes Back style)
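
As a rough sketch of how those codes could be used to look up a recipe from a weight name, assuming the code appears as its own token separated by `_`, `-` or `.`. The `find_recipe_code` helper and the example names are hypothetical and not part of timm; the descriptions are copied from the list above.

```python
import re

# Recipe codes and their descriptions, taken from the list above.
# This table is illustrative only; it is not read from timm.
RECIPE_CODES = {
    "ra3": "rmsproptf + lots of aug",
    "ch": "sgd + grad clipping (either global norm or agc)",
    "c1": "sgd + grad clipping (either global norm or agc)",
    "sw": "swin / convnext AdamW based",
    "ah": "lamb + lots of aug (ResNet Strikes Back style)",
    "a1": "lamb + lots of aug (ResNet Strikes Back style)",
}


def find_recipe_code(weight_name):
    """Return the first recipe code found as a whole token, or None.

    Assumes (hypothetically) that tokens are separated by '_', '-' or '.'.
    """
    tokens = re.split(r"[._\-]", weight_name.lower())
    for token in tokens:
        if token in RECIPE_CODES:
            return token
    return None


if __name__ == "__main__":
    # Made-up names, used only to exercise the helper.
    for name in ["examplenet_040_ra3-0000abcd.pth", "examplenet_tiny_sw"]:
        code = find_recipe_code(name)
        print(name, "->", code, "->", RECIPE_CODES.get(code))
```

Matching whole tokens rather than substrings avoids false hits when a short code such as `ch` happens to occur inside a model name.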

https://gist.github.com/rwightman/37252f8d7d850a94e43f1fcb7b3b8322

You'll have to remix for target models, adjust batch sizes (those ran on 8 devices, so the global batch size is 8x batch_size), dial back augreg for smal…
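
The batch size arithmetic can be sketched as below, assuming `batch_size` in the linked hparams is a per-device value. The numbers and function names are hypothetical; the only figure taken from the answer is the 8 devices.

```python
def global_batch_size(per_device_batch, num_devices=8):
    """Global batch when each device processes `per_device_batch` samples."""
    return per_device_batch * num_devices


def per_device_batch_for(target_global_batch, num_devices):
    """Per-device batch needed to keep the same global batch on another setup."""
    if num_devices <= 0:
        raise ValueError("num_devices must be positive")
    if target_global_batch % num_devices != 0:
        raise ValueError("global batch is not divisible by the device count")
    return target_global_batch // num_devices


if __name__ == "__main__":
    # Hypothetical per-device value, not taken from the gist.
    original_global = global_batch_size(128, num_devices=8)
    print("original global batch:", original_global)
    print("per-device batch on 2 devices:", per_device_batch_for(original_global, 2))
```

If the resulting per-device batch does not fit in memory, the global batch (and likely the learning rate) has to change as well, which is part of the remixing mentioned above.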

Answer selected by pamolchanov