
The loss equaled 87.3365 during training and didn't change #20

Open
cengzy14 opened this issue Oct 31, 2017 · 5 comments

@cengzy14

I followed the instructions and didn't change the settings in solver.prototxt, but the loss soon converged to 87.3365. Reportedly this happens because the learning rate is too large and the features before the softmax layer become inf. So I'm wondering what settings I should use with this network.
Thanks a lot!
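For reference, the specific value 87.3365 is not arbitrary: Caffe's SoftmaxWithLoss layer clamps the predicted probability at FLT_MIN before taking the log, so once the logits overflow, the loss saturates at -log(FLT_MIN). A minimal Python sketch (numpy only, outside the thread's actual Caffe setup) reproduces the number:

```python
import numpy as np

# Caffe's SoftmaxWithLoss clamps probabilities at FLT_MIN before log();
# numpy's float32 "tiny" is the same constant (~1.17549435e-38).
FLT_MIN = np.finfo(np.float32).tiny

# When inf/nan logits drive the true class's probability to ~0, the
# clamped loss saturates at a fixed value:
print(-np.log(FLT_MIN))  # ~87.3365 -- the "stuck" loss in this issue
```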

@liuzhuang13
Owner

Maybe first try a lower learning rate. If that doesn't help, check whether other network architectures fail in the same way, then decide whether to use this network or look for a bug elsewhere.

@zhaofenqiang

@cengzy14, I have the same problem. Have you found a solution?

@cengzy14
Author

@zhaofenqiang The value 87.3365 shows up when the feature maps before the softmax layer contain inf or nan.
If 87.3365 appears at the very first test, the cause is that the BN layer's variance is initialized to 0 while eps is initialized to 1e-5; dividing by sqrt(eps) makes the values in the feature maps grow larger and larger until they hit inf, so the first test will always produce 87.3365.
If it appears during training, the cause may be that the pooling layer's stride does not evenly divide the input feature map size, which produces nan.
After fixing both problems my accuracy was stuck at 0, and I never resolved that, so in the end I switched to the ImageNet-trained code and models provided at https://github.com/shicai/DenseNet-Caffe
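A minimal numpy sketch of the BN blow-up described above (the layer count and input scale are made up for illustration): with the variance buffer at 0 and eps = 1e-5, every BN layer effectively multiplies its input by 1/sqrt(1e-5) ≈ 316, so a few dozen stacked layers overflow float32:

```python
import numpy as np

eps = np.float32(1e-5)  # Caffe BatchNorm default eps
var = np.float32(0.0)   # variance buffer before any statistics are accumulated
x = np.float32(1.0)     # hypothetical activation magnitude

# Each BN layer divides by sqrt(var + eps) ~= 0.00316, i.e. scales the
# activations up by ~316x. DenseNet stacks dozens of BN layers, so the
# values overflow float32 after roughly 16 of them:
for layer in range(20):
    x = x / np.sqrt(var + eps)
    print(layer, x)  # reaches inf around layer 15, after which the
                     # softmax loss pins at 87.3365
```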

@zhaofenqiang

@cengzy14 Thanks, that's very helpful!

@wjzh1

wjzh1 commented Apr 4, 2018

@cengzy14 Did the network converge quickly when you fine-tuned it?
