Collaborator

@ElliottKasoar ElliottKasoar commented Nov 11, 2025

Pre-review checklist for PR author

PR author must check the checkboxes below when creating the PR.

Summary

Add S30L benchmark, including MAEs for neutral and charged complexes.
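As a rough illustration of the metric named above, the sketch below computes an overall MAE plus separate MAEs for neutral and charged complexes. It is not the code added in this PR: the function names, the `(charge, predicted, reference)` record layout, and the numbers in the usage note are all hypothetical.

```python
from statistics import fmean


def mae(predicted, reference):
    """Mean absolute error between two equal-length sequences of energies."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return fmean(abs(p - r) for p, r in zip(predicted, reference))


def split_maes(records):
    """Return overall, neutral and charged MAEs.

    `records` is an iterable of (charge, predicted, reference) tuples.
    This layout is an assumption made for illustration only.
    """
    records = list(records)
    neutral = [(p, r) for q, p, r in records if q == 0]
    charged = [(p, r) for q, p, r in records if q != 0]
    everything = [(p, r) for _, p, r in records]

    def subset_mae(pairs):
        # An empty subset has no defined MAE, so report None instead.
        if not pairs:
            return None
        preds, refs = zip(*pairs)
        return mae(preds, refs)

    return {
        "overall": subset_mae(everything),
        "neutral": subset_mae(neutral),
        "charged": subset_mae(charged),
    }
```

For example, `split_maes([(0, -1.0, -2.0), (0, -3.0, -3.5), (1, -10.0, -13.0)])` uses made-up values rather than S30L data and reports the three errors separately, so a large charged-complex error is not hidden inside the overall figure.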

Linked issue

Resolves #100

Progress

  • Calculations
  • Analysis
  • Application
  • Documentation

Testing

Tested on MACE models, Orb, MatterSim and UMA models. MACE and Orb results agree with https://arxiv.org/pdf/2510.25380 for overall MAE.

MatterSim differs slightly and UMA differs significantly from that paper; the UMA result is closer to the one reported here: http://mlip-testing.stfc.ac.uk:8050 (which does not include a D3 correction).

New decorators/callbacks

None

@ElliottKasoar ElliottKasoar self-assigned this Nov 11, 2025
@ElliottKasoar ElliottKasoar added the "new benchmark" label (Proposals and suggestions for new benchmarks) Nov 11, 2025
@ElliottKasoar ElliottKasoar changed the title from "Add supra s30 l" to "Add S30L benchmark" Nov 11, 2025
@ElliottKasoar
Collaborator Author

This also includes weights, which hopefully work once #109 is merged, so only the overall MAE is used by default.

@ElliottKasoar ElliottKasoar marked this pull request as ready for review November 12, 2025 17:42
joehart2001 previously approved these changes Nov 12, 2025
Collaborator

@joehart2001 joehart2001 left a comment

all good, just need to update analysis after gmtkn is in main

Collaborator

@joehart2001 joehart2001 left a comment

all working great

@ElliottKasoar ElliottKasoar merged commit f0c7028 into main Nov 14, 2025
14 checks passed
@ElliottKasoar ElliottKasoar deleted the add-supra-S30L branch November 14, 2025 20:49
Development

Successfully merging this pull request may close these issues.

S30L

3 participants