
Conversation

@jakob-schloer
Collaborator

@jakob-schloer jakob-schloer commented Dec 10, 2025

Weight averaging is a technique to improve model generalization by averaging model weights during training. The new PyTorch Lightning release (2.6.0) now supports several weight-averaging variants.

Here, I refactor the existing SWA implementation to support general weight-averaging methods through the config. For example, EMA training can be enabled by adding the following to `config.training`:

        weight_averaging:
          _target_: pytorch_lightning.callbacks.EMAWeightAveraging
          decay: 0.999
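For intuition, the `decay: 0.999` parameter controls an exponential moving average of the weights: after each update, the averaged weights are pulled slightly toward the current weights. A minimal pure-Python sketch of that update rule (illustrative only, not Lightning's `EMAWeightAveraging` internals):

```python
def ema_update(averaged: dict, current: dict, decay: float = 0.999) -> dict:
    """One EMA step: averaged <- decay * averaged + (1 - decay) * current."""
    return {
        name: decay * averaged[name] + (1.0 - decay) * current[name]
        for name in averaged
    }

# Toy example with a single scalar "parameter": the averaged value
# drifts slowly toward the current value, at a rate set by `decay`.
avg = {"w": 1.0}
for _ in range(3):
    avg = ema_update(avg, {"w": 0.0}, decay=0.9)
```

With a decay close to 1 (such as 0.999), the average changes very slowly, which smooths out noise in the training trajectory.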

Note:

  • I've deliberately deleted the previous SWA implementation to keep the code cleaner.
  • Old configs should still work, since the default is null when weight_averaging is not set; however, this is still a breaking change if swa was used before.
  • To test the code with lightning==2.6, PR #739 (fix: progress bar with lightning==2.6.0) needs to be merged first.
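The `_target_` key in the config above follows the Hydra-style instantiation pattern: a dotted import path plus keyword arguments. A minimal sketch of how such a config resolves to a callback object (a hypothetical helper for illustration, not Anemoi's actual instantiation code):

```python
import importlib


def instantiate_from_config(cfg: dict):
    """Resolve the dotted "_target_" path to a class and construct it
    with the remaining config keys as keyword arguments."""
    module_path, cls_name = cfg["_target_"].rsplit(".", 1)
    cls = getattr(importlib.import_module(module_path), cls_name)
    kwargs = {k: v for k, v in cfg.items() if k != "_target_"}
    return cls(**kwargs)


# Stdlib example target, since pytorch_lightning may not be installed here:
obj = instantiate_from_config(
    {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 2}
)
```

This is why the refactor can support arbitrary weight-averaging callbacks: swapping the method is just a matter of pointing `_target_` at a different class.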

As a contributor to the Anemoi framework, please ensure that your changes include unit tests, updates to any affected dependencies and documentation, and have been tested in a parallel setting (i.e., with multiple GPUs). As a reviewer, you are also responsible for verifying these aspects and requesting changes if they are not adequately addressed. For guidelines about those please refer to https://anemoi.readthedocs.io/en/latest/

By opening this pull request, I affirm that all authors agree to the Contributor License Agreement.


📚 Documentation preview 📚: https://anemoi-training--743.org.readthedocs.build/en/743/


📚 Documentation preview 📚: https://anemoi-graphs--743.org.readthedocs.build/en/743/


📚 Documentation preview 📚: https://anemoi-models--743.org.readthedocs.build/en/743/

@github-project-automation github-project-automation bot moved this to To be triaged in Anemoi-dev Dec 10, 2025
@jakob-schloer jakob-schloer changed the title Configurable weight initialization for training with exponential moving average (EMA), ... feat: Configurable weight initialization (exponential moving average (EMA), ...) Dec 10, 2025
@jakob-schloer jakob-schloer changed the title feat: Configurable weight initialization (exponential moving average (EMA), ...) feat: Configurable weight averaging (exponential moving average (EMA), ...) Dec 10, 2025
@jakob-schloer jakob-schloer added the ATS Approval Needed Approval needed by ATS label Dec 10, 2025
@jakob-schloer jakob-schloer changed the title feat: Configurable weight averaging (exponential moving average (EMA), ...) feat (training): Configurable weight averaging (exponential moving average (EMA), ...) Dec 10, 2025
@jakob-schloer jakob-schloer force-pushed the feature/configurable_weight_averaging branch from e7d6602 to 89d10ca Compare December 16, 2025 12:59
@HCookie HCookie moved this from To be triaged to Now In Progress in Anemoi-dev Dec 19, 2025
@anaprietonem
Contributor

@jakob-schloer - How are you getting on with this PR? Are you planning to work on it soon?
