
Reproducing Imagenet Knowledge Transfer Top-1 Accuracy #11

Open
amiller195 opened this issue Nov 26, 2020 · 3 comments

Comments

@amiller195

Hi,
Very interesting work!
According to Table 6 in the paper, training for 90 epochs on the 140K generated dataset should reach a top-1 accuracy of 68.0%.
I'm trying to train ResNet50 v1.5 following the protocol at https://github.com/NVIDIA/DeepLearningExamples with the 140K dataset, but I can't get past a top-1 accuracy of 10%.

Could you please elaborate on the training process using the generated 140K images? What protocol or additional work was required to reach the reported accuracy?

Thanks!

@hongxuyin
Contributor

Use KL divergence instead of cross-entropy, and rescale the KL divergence back into the normal loss range. The distillation setup details are in Sec. 4.4.
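For anyone reading along: this description matches the standard temperature-scaled distillation loss, where dividing the logits by a temperature T shrinks gradients by roughly 1/T², so multiplying the KL term by T² restores the usual loss scale. A minimal PyTorch sketch of that idea (function name and defaults are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=3.0):
    """KL divergence between temperature-softened teacher and student
    distributions, rescaled by T^2 into the normal loss range."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # reduction="batchmean" averages the true KL over the batch
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")
    return kl * temperature ** 2
```

With identical student and teacher logits the loss is (numerically) zero, which is a quick sanity check that the two distributions are being compared on the same temperature scale.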

@tronguyen

tronguyen commented Feb 20, 2021

Hi, thank you for the great work!

Sorry, I have the same question as above and wonder whether it has been resolved.

I couldn't reproduce the accuracy on ImageNet with the 140K images provided. Following Sec. 4.4 of the paper, I can only reach just over 30% top-1 accuracy. My training setup: batch size 256, temperature 3, KL loss only (relying only on teacher logits), 250 epochs, learning rate 1.0, and SGD with a learning-rate decay step every 80 epochs.

Many thanks!
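To make the setup above concrete for comparison, here is a sketch of the optimizer/scheduler part in PyTorch, using only the values stated in the comment; the model is a stand-in, the training-step body is omitted, and the decay factor `gamma=0.1` is an assumption (the comment doesn't state it):

```python
import torch

model = torch.nn.Linear(10, 5)  # stand-in for ResNet50 v1.5
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
# "a decay step of every 80 epochs" -> StepLR with step_size=80;
# gamma=0.1 is assumed, not given in the comment
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=80, gamma=0.1)

for epoch in range(250):
    # ... one epoch of training with batch size 256 and the T=3 KL loss ...
    scheduler.step()
```

Under these assumptions the learning rate decays at epochs 80, 160, and 240, ending around 1e-3 after 250 epochs.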

@CHENBIN99

same question
