
Reproducing Imagenet Knowledge Transfer Top-1 Accuracy #11

Open
amiller195 opened this issue Nov 26, 2020 · 3 comments

Comments

@amiller195

Hi,
Very interesting work!
According to Table 6 in the paper, training for 90 epochs on the 140K generated dataset should reach a top-1 accuracy of 68.0%.
I'm trying to train ResNet50 v1.5 following the protocol at https://github.com/NVIDIA/DeepLearningExamples with the 140K dataset, but I can't get past a top-1 accuracy of 10%.

Could you please elaborate on the training process using the generated 140K images? What protocol or additional work was required to reach the reported accuracy?

Thanks!

@hongxuyin
Contributor

Use KL divergence instead of cross-entropy, and rescale the KL divergence back into the normal loss range. The distillation setup details are in Sec. 4.4.
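For anyone reading along: this description matches the standard temperature-scaled distillation loss, where dividing the logits by a temperature T shrinks gradients by roughly 1/T², so multiplying the KL term by T² restores the usual loss scale. A minimal PyTorch sketch of that idea (function name and defaults are illustrative, not taken from the repo):

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=3.0):
    """KL divergence between temperature-softened teacher and student
    distributions, rescaled by T^2 into the normal loss range."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # reduction="batchmean" averages the true KL over the batch
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean")
    return kl * temperature ** 2
```

With identical student and teacher logits the loss is (numerically) zero, which is a quick sanity check that the two distributions are being compared on the same temperature scale.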

@tronguyen

tronguyen commented Feb 20, 2021

Hi, thank you for the great work!

Sorry, I have the same question as above and wonder whether it has been resolved.

I couldn't reproduce the accuracy on ImageNet with the 140K images provided. Following Sec. 4.4 of the paper, I can only reach just over 30% top-1 accuracy. My training setup: batch size 256, temperature 3, KL loss only (relying only on teacher logits), 250 epochs, learning rate 1.0, and SGD with a learning-rate decay step every 80 epochs.

Many thanks!
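To make the setup above concrete for comparison, here is a sketch of the optimizer/scheduler part in PyTorch, using only the values stated in the comment; the model is a stand-in, the training-step body is omitted, and the decay factor `gamma=0.1` is an assumption (the comment doesn't state it):

```python
import torch

model = torch.nn.Linear(10, 5)  # stand-in for ResNet50 v1.5
optimizer = torch.optim.SGD(model.parameters(), lr=1.0)
# "a decay step of every 80 epochs" -> StepLR with step_size=80;
# gamma=0.1 is assumed, not given in the comment
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=80, gamma=0.1)

for epoch in range(250):
    # ... one epoch of training with batch size 256 and the T=3 KL loss ...
    scheduler.step()
```

Under these assumptions the learning rate decays at epochs 80, 160, and 240, ending around 1e-3 after 250 epochs.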

@CHENBIN99

same question
