
gradient accumulation #54


Merged — rousseab merged 2 commits into main from gradient_accumulation on Jun 6, 2024

Conversation

sblackburn86 (Collaborator):

Adding a kwarg to allow for gradient accumulation.
In the normal MACE and the MLP score network there is no batchnorm, so gradient accumulation is exact: summing the gradients of several micro-batches is equivalent to computing the gradient of one larger batch. Here, simply passing an argument to the pytorch-lightning Trainer does the trick.

TBD what happens with DiffusionMACE and the o3.batchnorm, whose batch statistics are computed per micro-batch rather than per accumulated batch...
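A minimal sketch of the pytorch-lightning mechanism this relies on, the Trainer's accumulate_grad_batches kwarg; the toy module, dataset, and the value 4 are illustrative assumptions, not this repository's code:

```python
import pytorch_lightning as pl
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class ToyScoreNetwork(pl.LightningModule):
    """Batchnorm-free toy model, so gradient accumulation is exact."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(8, 16), nn.SiLU(), nn.Linear(16, 1))

    def training_step(self, batch, batch_idx):
        x, y = batch
        return nn.functional.mse_loss(self.net(x), y)

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)

data = TensorDataset(torch.randn(64, 8), torch.randn(64, 1))

# accumulate_grad_batches=4: gradients from 4 consecutive micro-batches
# of size 8 are accumulated before each optimizer step, emulating an
# effective batch size of 32.
trainer = pl.Trainer(max_epochs=1, accumulate_grad_batches=4,
                     logger=False, enable_checkpointing=False)
trainer.fit(ToyScoreNetwork(), DataLoader(data, batch_size=8))
```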

@rousseab (Collaborator) left a comment:


The PR only shows changes in the config files. The new argument is not passed to the Trainer... Maybe changes in train_diffusion.py were not committed?
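For illustration, the wiring being asked about might look like this in train_diffusion.py; the hyper_params dict and its key names are assumptions, not the PR's actual code:

```python
import pytorch_lightning as pl

# Hypothetical: hyper-parameters as parsed from the experiment config file.
hyper_params = {"max_epoch": 10, "accumulate_grad_batches": 4}

trainer = pl.Trainer(
    max_epochs=hyper_params["max_epoch"],
    # Fall back to 1 (i.e. no accumulation) when the config omits the key.
    accumulate_grad_batches=hyper_params.get("accumulate_grad_batches", 1),
)
```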

@rousseab (Collaborator) left a comment:


LGTM!

rousseab merged commit 095167b into main on Jun 6, 2024
1 check passed
rousseab deleted the gradient_accumulation branch on June 6, 2024 at 18:13