If within our PyTorch model, each component is assigned a GPU (model parallelism), do we need to remove the assignments to use DeepSpeed? #928
Unanswered
Santosh-Gupta asked this question in Q&A
Replies: 0 comments
If, within our PyTorch model, each component is assigned to a GPU (model parallelism), do we need to remove those assignments? Or would DeepSpeed automatically handle reassigning the components?
How about cases where different NVIDIA GPUs are mixed (say, A100s together with GeForce cards)? Would DeepSpeed automatically take the different memory sizes of these GPUs into account and partition the parameters accordingly?
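For context, here is a minimal sketch of the kind of manual model-parallel assignment the question refers to: each component is pinned to its own device via `.to(...)`, and activations are moved between devices in `forward`. The module names, layer sizes, and device ids are illustrative only (the sketch falls back to CPU when fewer than two GPUs are available), and whether these explicit placements need to be removed before wrapping the model with DeepSpeed is exactly the open question above.

```python
import torch
import torch.nn as nn

# Illustrative device assignments; real model-parallel code would pin
# each component to a distinct GPU. We fall back to CPU so the sketch
# runs anywhere.
dev0 = torch.device("cuda:0" if torch.cuda.device_count() > 0 else "cpu")
dev1 = torch.device("cuda:1" if torch.cuda.device_count() > 1 else "cpu")

class TwoStageModel(nn.Module):
    """Hypothetical two-component model with hand-pinned devices."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(16, 32).to(dev0)  # component on GPU 0
        self.decoder = nn.Linear(32, 8).to(dev1)   # component on GPU 1

    def forward(self, x):
        h = self.encoder(x.to(dev0))
        return self.decoder(h.to(dev1))  # move activations across devices

model = TwoStageModel()
out = model(torch.randn(4, 16))
print(out.shape)  # torch.Size([4, 8])
```

A DeepSpeed user would then typically call `deepspeed.initialize(model=model, ...)` on such a module; the tension the question raises is that DeepSpeed's engine generally manages device placement itself, so it is unclear whether the hand-written `.to(...)` calls should stay.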