1D sharding config is very different from 2d sharding config #69

Open
tengyifei opened this issue Dec 18, 2024 · 0 comments


To enable 1D sharding, we need to add these flags to the CLI:

--fsdp "full_shard" --fsdp_config ~/fsdp_config.json

where the fsdp_config.json file contains:

{
    "fsdp_transformer_layer_cls_to_wrap": [
        "LlamaDecoderLayer"
    ],
    "xla": true,
    "xla_fsdp_v2": true,
    "xla_fsdp_grad_ckpt": true
}

To enable 2D sharding, however, we instead need to pass this flag on the CLI:

--spmd_2d_sharding 2

(Remember to remove the FSDP-related flags.)

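This is very confusing: 1D sharding is configured through FSDP flags plus a JSON file, while 2D sharding uses a single, unrelated SPMD flag.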
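
To make the asymmetry concrete, here is a minimal sketch of a wrapper that hides both setups behind one switch. The function name `sharding_args` and the `mode` values `"1d"` / `"2d"` are hypothetical and not part of the project; the flags and JSON contents are copied from this issue as-is.

```python
import json
from pathlib import Path

# Contents of fsdp_config.json, copied from the 1D sharding setup above.
FSDP_CONFIG = {
    "fsdp_transformer_layer_cls_to_wrap": ["LlamaDecoderLayer"],
    "xla": True,
    "xla_fsdp_v2": True,
    "xla_fsdp_grad_ckpt": True,
}


def sharding_args(mode: str, config_dir: Path) -> list[str]:
    """Return the CLI flags quoted in this issue for the given mode.

    Hypothetical helper: `mode` is "1d" or "2d". For "1d" it also writes
    fsdp_config.json into `config_dir`, since the flag needs a file path.
    """
    if mode == "1d":
        config_path = config_dir / "fsdp_config.json"
        config_path.write_text(json.dumps(FSDP_CONFIG, indent=4))
        return ["--fsdp", "full_shard", "--fsdp_config", str(config_path)]
    if mode == "2d":
        # No FSDP flags here: they must be removed for 2D sharding.
        return ["--spmd_2d_sharding", "2"]
    raise ValueError(f"unknown sharding mode: {mode!r}")
```

The returned list can be appended to the training command's argument list. Something along these lines, with a single sharding option on the CLI itself, would avoid having to remember which flags belong to which mode.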
