[FSDP2] Cast model to uniform dtype before fully_shard to fix mixed-dtype AssertionError #3985
Open
roycho96 wants to merge 2 commits into huggingface:main from
Conversation
What does this PR do?
When `mixed_precision` is enabled, casts model parameters to a uniform dtype before `fully_shard()` to prevent the `_init_mp_dtypes()` AssertionError.

Problem
FSDP2's `_init_mp_dtypes()` requires a uniform `orig_dtype` across all trainable parameters in a param group. With mixed dtypes, the first forward call crashes with an AssertionError. Accelerate's `fsdp2_prepare_model()` currently passes the mixed-dtype model directly to `fully_shard()` without normalizing dtypes.
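A minimal sketch of the failure mode (illustrative, not taken from this PR: the module, dtypes, script name, and launch setup are assumptions). It shows the shape of the crash described above, since `_init_mp_dtypes()` runs lazily on the first forward and expects a single `orig_dtype` per parameter group:

```python
# Hypothetical repro; assumes a single-GPU distributed run, e.g.
#   torchrun --nproc_per_node=1 repro_mixed_dtype.py
# torch >= 2.6 exposes fully_shard/MixedPrecisionPolicy under torch.distributed.fsdp.
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import MixedPrecisionPolicy, fully_shard

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank())

class MixedDtypeModel(nn.Module):
    """Two Linear layers with different parameter dtypes in one param group."""
    def __init__(self):
        super().__init__()
        self.fc32 = nn.Linear(16, 16)                     # float32 weights
        self.fc16 = nn.Linear(16, 16).to(torch.bfloat16)  # bfloat16 weights

    def forward(self, x):
        return self.fc16(self.fc32(x).to(torch.bfloat16))

model = MixedDtypeModel().cuda()
fully_shard(model, mp_policy=MixedPrecisionPolicy(param_dtype=torch.bfloat16))

# The first forward triggers FSDP2's lazy init, where _init_mp_dtypes()
# asserts that all trainable params share one orig_dtype -> AssertionError.
out = model(torch.randn(2, 16, device="cuda"))
```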
Fix
Cast all parameters to the mixed-precision `param_dtype` before `fully_shard()`, after the `model_has_params4bit` detection. Params4bit models are skipped to avoid destroying quantized weights.
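A sketch of the normalization step this PR describes, not the actual diff: the helper name below is hypothetical, while `param_dtype` and `model_has_params4bit` follow the names used above.

```python
import torch
import torch.nn as nn

def _cast_params_to_uniform_dtype(model: nn.Module,
                                  param_dtype: torch.dtype,
                                  model_has_params4bit: bool) -> None:
    """Hypothetical helper sketching the idea described in this PR.

    When mixed precision is enabled, cast every floating-point parameter to
    the policy's `param_dtype` before `fully_shard()`, so FSDP2's
    `_init_mp_dtypes()` sees a single `orig_dtype` per parameter group.
    Models containing bitsandbytes Params4bit are skipped, since casting
    would destroy the quantized weights.
    """
    if model_has_params4bit:
        return
    for param in model.parameters():
        if param.is_floating_point() and param.dtype != param_dtype:
            param.data = param.data.to(param_dtype)
```

In `fsdp2_prepare_model()` this step would run after the `model_has_params4bit` check and before the `fully_shard()` call, and only when `mixed_precision` is configured.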
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@SunMarc