Open
Labels: enhancement (New feature or request)
Description
Just to help make a case for it:

collective_broadcast is important in distributed (SPMD) training for syncing variables and preventing numeric drift across replicas. Accelerators can presumably do this much faster than moving the data through the host.

The fake "multi-device" CPU PJRT backend is very useful for developing and testing distributed models without spending expensive credits (and clearing the extra hurdle) of getting multi-device hardware during development, which in some cases takes more time than the training itself.

Without collective_broadcast implemented on CPU, though, one has to maintain separate versions of the model: one for testing/development and one for the actual training run. This adds maintenance burden, etc.

Many thanks!
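To illustrate the workflow in question, here is a minimal sketch of the development setup, assuming JAX as the frontend: the CPU backend is made to report several fake devices via `--xla_force_host_platform_device_count`, and, in the absence of a native collective_broadcast on CPU, the replica-0 broadcast is emulated with a masked `psum` (the `sync` helper and the device count of 8 are illustrative choices, not anything from the issue):

```python
import os
# Emulate 8 CPU devices; must be set before jax is imported.
os.environ["XLA_FLAGS"] = "--xla_force_host_platform_device_count=8"

import jax
import jax.numpy as jnp

# Simulate per-replica parameters that have drifted apart:
# replica i holds the value i + 1.
params = jnp.arange(1.0, 9.0).reshape(8, 1)

def sync(x):
    # Emulated broadcast from replica 0: zero out every other
    # replica's contribution, then sum across the axis, so all
    # replicas end up with replica 0's value.
    mask = (jax.lax.axis_index("i") == 0).astype(x.dtype)
    return jax.lax.psum(x * mask, axis_name="i")

synced = jax.pmap(sync, axis_name="i")(params)
# After syncing, every replica holds replica 0's value (1.0).
```

A native collective_broadcast on the CPU backend would let the same training code run unchanged here and on real multi-device hardware, instead of requiring this kind of workaround during development.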