Number of Epochs #4
I trained for 15000 epochs to guarantee full convergence. As I recall, 5000 epochs can already yield competitive performance.
Thanks! How many GPUs do you recommend? Also, what do you mean by convergence here? Qualitatively, I can see that the reconstruction quality is good even at the very beginning of training. However, the sample realism is not good even after hundreds of iterations. The samples start as smooth images with no structure, then they develop cifar10-like structures, but when you zoom in, they do not look like real objects. Is that expected to improve after thousands of iterations? Final question: would it be possible to release the training configurations (epochs, batch size, etc.) for the other datasets, please? Thanks!
Hi, for the configurations on other datasets, I will try to find them and share them with you. Since this is work from two years ago, when I was at CMU, it might take some time to find them.
Hello, do you recall the train time for ImageNet and how many GPUs were used? The number of epochs would be helpful, too. It seems that each epoch takes around 1 GPU day or so, so 15,000 would be a big ask.
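To put the "big ask" in numbers, here is a rough sketch of the arithmetic. It takes the roughly 1 GPU-day per epoch estimate from the comment above and assumes perfect linear scaling across GPUs, which real multi-GPU training rarely achieves. The GPU counts are hypothetical examples, not the setup used by the authors.

```python
def wall_clock_days(epochs: int, gpu_days_per_epoch: float, num_gpus: int) -> float:
    """Estimate wall-clock days, assuming ideal linear speedup across GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return epochs * gpu_days_per_epoch / num_gpus


# ~1 GPU-day per epoch is the rough figure quoted in this thread.
# The GPU counts below are illustrative assumptions only.
for num_gpus in (1, 8, 64):
    days = wall_clock_days(15000, 1.0, num_gpus)
    print(f"{num_gpus:>2} GPU(s): {days:.0f} days (~{days / 365:.1f} years)")
```

Even under this optimistic scaling assumption, 15,000 ImageNet epochs would take years on a small machine, so the epoch count actually used for ImageNet was presumably much lower.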
Hi,
I am trying to run the cifar10 example in the README file. The command line arguments there specify 15000 as the number of epochs. How important is it to train the model for that many epochs? In other words, what is the minimum number of epochs to train for and still get reasonable results? Based on the speed I am seeing so far, it would take my system (with a single GPU) at least 7 weeks to finish 15000 epochs.
Thanks!
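For a rough sense of scale, the timing above can be turned into a per-epoch cost and rescaled to the 5000-epoch figure mentioned in the maintainer's reply. This is only a back-of-the-envelope sketch: it assumes the time per epoch stays constant and treats the reported 7 weeks as exact rather than as a lower bound.

```python
def rescale_weeks(measured_weeks: float, measured_epochs: int, target_epochs: int) -> float:
    """Linearly rescale total training time to a different epoch count."""
    return measured_weeks * target_epochs / measured_epochs


weeks_full = 7.0      # reported lower bound for 15000 epochs on one GPU
epochs_full = 15000

# Minutes per epoch implied by the reported total.
minutes_per_epoch = weeks_full * 7 * 24 * 60 / epochs_full

# Time for the 5000 epochs said to give competitive performance.
weeks_5000 = rescale_weeks(weeks_full, epochs_full, 5000)

print(f"{minutes_per_epoch:.1f} min/epoch, {weeks_5000:.2f} weeks for 5000 epochs")
```

On these assumptions, stopping at 5000 epochs would cut the run from about 7 weeks to a little over 2 weeks on the same single GPU.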