Running PPO on a transformer with DeepSpeed ZeRO-3 #1050

Vbansal21 · 2021-05-06T10:42:02Z

Is it possible to set up a training system similar to the one described in "Learning to summarize from human feedback", where PPO was applied to transformers, and train it with DeepSpeed ZeRO-3 / ZeRO-Infinity?

Replies: 0 comments
