moe_lb_loss should be divided by gradient_accumulation_steps for reporting. #1483

@bzantium

Current code:

  moe_lb_loss = aux["moe_lb_loss"]

moe_lb_loss should be divided by gradient_accumulation_steps for reporting. Proposed change:

  moe_lb_loss = aux["moe_lb_loss"] / config.gradient_accumulation_steps
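
A minimal sketch of why the division matters, assuming the aux value is summed across microbatches during gradient accumulation (an assumption; the accumulation code itself is not shown in this issue). The helper `accumulate_aux` and the sample loss values are hypothetical and exist only for illustration.

```python
def accumulate_aux(per_microbatch_lb_losses):
    """Sum the load-balancing loss over microbatches.

    Assumption: this mirrors how aux values are accumulated over
    gradient accumulation steps; the real code may differ.
    """
    total = 0.0
    for lb_loss in per_microbatch_lb_losses:
        total += lb_loss
    return {"moe_lb_loss": total}


gradient_accumulation_steps = 4
per_microbatch = [0.5, 0.25, 0.75, 0.5]  # hypothetical per-microbatch values

aux = accumulate_aux(per_microbatch)

# Without the division, the reported value scales with the number of steps.
unscaled = aux["moe_lb_loss"]

# With the division, the reported value is the per-step average.
reported = aux["moe_lb_loss"] / gradient_accumulation_steps

print(f"unscaled={unscaled}, reported={reported}")
```

Under that assumption, the unscaled value grows linearly with gradient_accumulation_steps, so runs with different accumulation settings would not be comparable in the logs; the divided value is the mean over microbatches.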
