Yes, it's true. Also, during the instruction training phase, I hit a CUDA out of memory error at 5258/8056. Were your experiments conducted on A100 GPUs with 40GB or 80GB?
Hi, really impressive work! I'm curious why you use zero3 in pretraining but zero2 in instruction training?