Strange output log #2
The uploaded log_rank0.txt is the pretraining log from one of the eight GPUs.
I also encountered the same problem.
@launchauto @michuanhaohao Me too, but I ran it with precision O0. Did you also run with O0 precision?
I ran into the same problem! The loss stays at 16 and never drops.
How can I train without apex mixed precision? When I train with Swin Transformer, the loss drops and converges, and I noticed that the Swin Transformer project does not use apex mixed precision.
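Not the repo's actual training code, just a minimal sketch of how apex mixed precision is usually enabled or skipped: mixed precision only takes effect after `amp.initialize`, so either passing `opt_level="O0"` (pure FP32) or not wrapping the model/optimizer at all amounts to training without it. The tiny model, dummy data, and the `USE_AMP` flag below are placeholders, not this project's API.

```python
import torch
import torch.nn as nn
from apex import amp

model = nn.Linear(128, 10).cuda()                        # stand-in for the real model
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

USE_AMP = False                                          # False -> plain FP32 training, no apex
if USE_AMP:
    # "O1" is the usual mixed-precision level; "O0" would keep everything in FP32
    model, optimizer = amp.initialize(model, optimizer, opt_level="O1")

x = torch.randn(32, 128).cuda()
y = torch.randint(0, 10, (32,)).cuda()

loss = criterion(model(x), y)
optimizer.zero_grad()
if USE_AMP:
    with amp.scale_loss(loss, optimizer) as scaled_loss:  # apex handles loss scaling
        scaled_loss.backward()
else:
    loss.backward()
optimizer.step()
```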
Is it normal for the loss value to be around 16? Has anyone encountered this problem? |
Have you solved this problem?
Me too.
Excuse me, have you solved the problem where the loss drops to 8.9 and then rises back up? Is it caused by apex mixed-precision training?
Not yet /(ㄒoㄒ)/~~
Could it be a problem with the loss function? Are you still maintaining this code? My loss has been 16 from the start and never goes down.
I haven't solved it either.
Hi authors, I have pretrained your moby_swin_tiny model on 8 Tesla V100 GPUs and reproduced your results on the downstream tasks: 74.394% on linear evaluation, 43.1% on COCO object detection, and 39.3% on COCO segmentation. But the loss and grad_norm are really weird during training. Can you show me your log?
Here is my log. The loss drops to 7, then rises back to 16 and never drops again. During pretraining, the average grad norm sometimes becomes infinite.
log_rank0.txt
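For anyone hitting the infinite grad_norm, here is a small diagnostic sketch (not from this repo; the helper name and where it is called are hypothetical) that computes the global gradient norm after `backward()` and flags non-finite values so the offending step can be inspected or skipped:

```python
import torch

def total_grad_norm(parameters):
    """Global L2 norm over all parameter gradients that exist."""
    norms = [p.grad.detach().norm(2) for p in parameters if p.grad is not None]
    if not norms:
        return torch.tensor(0.0)
    return torch.norm(torch.stack(norms), 2)

# Inside the training loop, right after loss.backward():
#   grad_norm = total_grad_norm(model.parameters())
#   if not torch.isfinite(grad_norm):
#       # corresponds to the inf grad_norm reported in the log; common mitigations
#       # are gradient clipping or skipping the optimizer step for this batch
#       optimizer.zero_grad()
```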