
fix layernorm grad sbp #10561

Merged
ShawnXuan merged 1 commit into layer_norm_grad_npu from fix_ln_grad_sbp on Nov 11, 2024
Conversation

@ShawnXuan
Collaborator

Fix the SBP settings for LayerNormGradOp so that gamma_diff and beta_diff are aggregated correctly across devices.

Changes

  • Updated SBP strategy in LayerNormGradOp: set gamma_diff and beta_diff to PartialSum instead of Split. When the batch dimension is split across devices, each device only accumulates its local contribution to these gradients, so splitting them causes dimension mismatches during distributed training (see the sketch below).
  • Added a consistency check for begin_norm_axis and begin_params_axis: the two must be equal so that the normalization and parameter dimensions stay aligned.
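A minimal NumPy sketch (not OneFlow code; the shapes, names, and the two-device split are assumptions for illustration) of why the parameter gradients are PartialSum rather than Split: with the input split on the batch axis, each device reduces over its local rows, and the full gamma_diff/beta_diff is the elementwise sum of the per-device results.

```python
import numpy as np

np.random.seed(0)

# Assumed shapes: batch of 8, normalized (param) dim of 4,
# with begin_norm_axis == begin_params_axis == 1 as the PR's check requires.
batch, hidden = 8, 4
dy = np.random.randn(batch, hidden)     # upstream gradient
x_hat = np.random.randn(batch, hidden)  # normalized input

# Single-device reference: gamma/beta gradients reduce over the batch axis.
gamma_diff = (dy * x_hat).sum(axis=0)
beta_diff = dy.sum(axis=0)

# Two devices, each holding half of the batch (Split(axis=0) on dy and x_hat).
halves = [(dy[:4], x_hat[:4]), (dy[4:], x_hat[4:])]
gamma_partials = [(d * xh).sum(axis=0) for d, xh in halves]
beta_partials = [d.sum(axis=0) for d, xh in halves]

# The per-device results are partial sums: adding them (PartialSum) recovers
# the full gradient, whereas concatenating them (Split) would have shape
# (2 * hidden,) and the wrong meaning.
assert np.allclose(sum(gamma_partials), gamma_diff)
assert np.allclose(sum(beta_partials), beta_diff)
print("PartialSum over devices matches the single-device gradient")
```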

@ShawnXuan ShawnXuan requested a review from fpzh2011 November 10, 2024 12:29
@ShawnXuan ShawnXuan merged commit dfe78fb into layer_norm_grad_npu Nov 11, 2024
@ShawnXuan ShawnXuan deleted the fix_ln_grad_sbp branch November 11, 2024 02:22