Regnet training hparams #1613
-
Hi @rwightman, we have noticed that RegNet models have an amazing latency-accuracy tradeoff. In particular, the regnety_040 model is clearly above the competition: it hits >7800 imgs/sec in TensorRT on an A100 (fp16), and >3000 imgs/sec in PyTorch with AMP and NHWC. Wondering what the training recipe for this model is. I have tried training with the Swin recipe (300ep, AA etc., + drop-path) and am only getting 81.5. If you could share the command, it would significantly help with building future models from RegNet components. Thanks.
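(For context, a minimal sketch of measuring inference throughput with AMP and channels-last (NHWC) in PyTorch via timm — not the exact setup behind the numbers quoted above; batch size and iteration counts are placeholder assumptions:)

```python
import time

import torch
import timm

# regnety_040 is a real timm model name; AMP + channels_last as mentioned above
model = timm.create_model('regnety_040', pretrained=True).cuda().eval()
model = model.to(memory_format=torch.channels_last)

batch_size = 256  # assumption; tune for your GPU
x = torch.randn(batch_size, 3, 224, 224, device='cuda')
x = x.to(memory_format=torch.channels_last)

with torch.no_grad(), torch.cuda.amp.autocast():
    for _ in range(10):  # warmup
        model(x)
    torch.cuda.synchronize()
    start = time.time()
    iters = 50
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    elapsed = time.time() - start

print(f'{batch_size * iters / elapsed:.0f} imgs/sec')
```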
-
This is what I have related to hparams for those RegNets and other models trained in those sessions (https://github.com/rwightman/pytorch-image-models/releases/tag/v0.1-tpu-weights). There are 4 major sets of hparams used there, two relating to the regnets (ra3 and ch); the 'codes' for reference in the weight names are:

- ra3 - rmsproptf + lots of aug
- ch,c1 - sgd + grad clipping (either global norm or agc)
- sw - swin / convnext AdamW based
- ah,a1 - lamb + lots of aug (ie ResNet Strikes Back style)

https://gist.github.com/rwightman/37252f8d7d850a94e43f1fcb7b3b8322

You'll have to remix for target models and adjust batch sizes (those were on 8 devices, so x8 batch_size): dial back augreg for small models, increase for big. I think the ra3 templates were for medium-large models.
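(As a rough illustration of how those optimizer 'codes' map onto timm's Python API — a sketch only; the lr/weight-decay/clip values below are placeholders, not the actual hparams from the gist:)

```python
import timm
from timm.optim import create_optimizer_v2
from timm.utils import dispatch_clip_grad

model = timm.create_model('regnety_040')

# ra3: RMSpropTF; the heavy augmentation lives in the data pipeline, not here
opt_ra3 = create_optimizer_v2(model, opt='rmsproptf', lr=0.08, weight_decay=1e-5)

# ch,c1: SGD with gradient clipping applied each step
opt_ch = create_optimizer_v2(model, opt='sgd', lr=0.5, momentum=0.9)
# after loss.backward(), before opt_ch.step(), e.g. adaptive gradient clipping:
# dispatch_clip_grad(model.parameters(), value=0.01, mode='agc')
# or global-norm clipping:
# dispatch_clip_grad(model.parameters(), value=1.0, mode='norm')

# sw: AdamW, Swin / ConvNeXt style
opt_sw = create_optimizer_v2(model, opt='adamw', lr=1e-3, weight_decay=0.05)

# ah,a1: LAMB, ResNet Strikes Back style
opt_ah = create_optimizer_v2(model, opt='lamb', lr=5e-3, weight_decay=0.02)
```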