If within our PyTorch model, each component is assigned a GPU (model parallelism), do we need to remove the assignments to use DeepSpeed? #928
Unanswered
Santosh-Gupta asked this question in Q&A
Replies: 0 comments
If, within our PyTorch model, each component is assigned to a GPU (model parallelism), do we need to remove those assignments? Or would DeepSpeed automatically handle reassigning the components?
How about cases where different NVIDIA GPUs are mixed (say, A100s together with GeForce cards)? Would DeepSpeed automatically take the different memory sizes of these GPUs into account and partition the parameters accordingly?
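For context, here is a minimal sketch of the kind of manual model-parallel assignment the question refers to: each component is pinned to its own device via `.to(...)`, and activations are moved between devices in `forward`. The module names, layer sizes, and device ids are illustrative only (the sketch falls back to CPU when fewer than two GPUs are available), and whether these explicit placements need to be removed before wrapping the model with DeepSpeed is exactly the open question above.

```python
import torch
import torch.nn as nn

# Illustrative device assignments; real model-parallel code would pin
# each component to a distinct GPU. We fall back to CPU so the sketch
# runs anywhere.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() > 0 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")

class TwoStageModel(nn.Module):
    """Hypothetical two-component model with hand-pinned devices."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 32).to(dev0)  # component on GPU 0
        self.decoder = nn.Linear(32, 8).to(dev1)   # component on GPU 1

    def forward(self, x):
        h = self.encoder(x.to(dev0))
        return self.decoder(h.to(dev1))  # move activations across devices

model = TwoStageModel()
out = model(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 8])
```

A DeepSpeed user would then typically call `deepspeed.initialize(model=model, ...)` on such a module; the tension the question raises is that DeepSpeed's engine generally manages device placement itself, so it is unclear whether the hand-written `.to(...)` calls should stay.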