[Feat] Add direct MM vs QM comparisons #190

@lilyminium

At the moment, all comparisons are squashed into a per-molecule ICRMSD, which makes it difficult to tell directly which parameters and which bonds/angles are the problematic ones. It would be really helpful to be able to:

a) compute a MM vs QM comparison, then
b) group these per assigned parameter (which will differ according to FF)

What this would mean for a multi-FF store is being able to choose which FF to label topology values with, and then group all MM vs QM values per parameter. The goal is to end up with diagrams like those below:

Angles

Image

Bonds
Image

And to a lesser extent torsions:

Image

This is similar to the kind of analysis that Lexie did with small ring angles. Specifically, step b), where we group the MM/QM values per parameter, cannot be pre-computed and pre-stored, as the parameter ID assigned will change depending on which FF is used to label the parameter. However, it'd be hugely beneficial to store the computed bond/angle/torsion values so we can re-label and re-plot with additional FFs. So this might be hard to do in yammbs currently (at least quickly), but I think this is a hugely valuable analysis.
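To make the split between stored values and on-demand labels concrete, here is a minimal sketch, not how yammbs does or should implement it. The names `qm_values`, `mm_values`, `labels`, and `group_differences`, the FF names, and all numbers are made up for illustration. In practice the label lookup could presumably come from something like the OpenFF Toolkit's `ForceField.label_molecules`, but that part is not shown here. Only bonds are covered; torsion differences would also need wrapping.

```python
from collections import defaultdict

# Stored once per conformer: geometric values keyed by atom-index tuple.
# All values are invented for illustration (bond lengths in angstrom).
qm_values = {(0, 1): 1.090, (1, 2): 1.530, (2, 3): 1.092}
mm_values = {
    "ff-a": {(0, 1): 1.094, (1, 2): 1.522, (2, 3): 1.095},
    "ff-b": {(0, 1): 1.089, (1, 2): 1.541, (2, 3): 1.090},
}

# Computed on demand: parameter IDs assigned by whichever FF is chosen for
# labelling. The same bond can get a different ID under a different FF.
labels = {
    "ff-a": {(0, 1): "b1", (1, 2): "b2", (2, 3): "b1"},
    "ff-b": {(0, 1): "b1", (1, 2): "b2", (2, 3): "b3"},
}


def group_differences(mm, qm, parameter_ids):
    """Group MM - QM differences by the parameter ID assigned to each bond."""
    grouped = defaultdict(list)
    for atoms, qm_value in qm.items():
        grouped[parameter_ids[atoms]].append(mm[atoms] - qm_value)
    return dict(grouped)


# Same stored values, two different labellings: no geometry is recomputed.
by_ff_a_labels = group_differences(mm_values["ff-a"], qm_values, labels["ff-a"])
by_ff_b_labels = group_differences(mm_values["ff-a"], qm_values, labels["ff-b"])
```

The point of the design is that `qm_values` and `mm_values` are the expensive part and can be stored, while `labels` is cheap to regenerate for any additional FF.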

The workflow for obtaining the images above was as follows:

  • A QM dataset was downloaded and processed into a Parquet format.
  • The topology values (bonds, angles, proper torsions) of the QM dataset were computed with this script (ignore the impropers in that script -- they're done incorrectly. I re-did the impropers here).
  • Each conformer in the QM dataset was optimized with Sage 2.1, Sage 2.2.1, and Sage 2.3rc2 separately.
  • For each optimized geometry, we calculated topology values using the script above.
  • I labelled each topology set (bond/angle/torsion) by its assigned parameter in Sage 2.3.0 using this script.
  • I grouped each MM/QM value, and calculated the difference between the values, here. (We plot a difference box-plot instead of a scatter plot because of the quirks of calculating angle differences and wrapping angle values).
  • I plotted the data here.
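As a rough illustration of the "topology values" and the wrapped differences mentioned in the steps above, here is a self-contained sketch. It is not the linked scripts, which are not reproduced here. The function names are hypothetical, coordinates are plain `(x, y, z)` tuples, and angles are in degrees.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def _norm(a):
    return math.sqrt(_dot(a, a))


def bond_length(p0, p1):
    """Distance between two atoms."""
    return _norm(_sub(p1, p0))


def angle_degrees(p0, p1, p2):
    """Angle p0-p1-p2 in degrees, with p1 as the central atom."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cosine = _dot(u, v) / (_norm(u) * _norm(v))
    # Clamp to guard against round-off pushing the cosine outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))


def dihedral_degrees(p0, p1, p2, p3):
    """Signed proper torsion p0-p1-p2-p3 in degrees, in [-180, 180]."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n0, n1 = _cross(b0, b1), _cross(b1, b2)
    y = _norm(b1) * _dot(b0, n1)
    x = _dot(n0, n1)
    return math.degrees(math.atan2(y, x))


def wrapped_difference(mm, qm):
    """MM - QM for periodic angles, wrapped into [-180, 180) degrees."""
    return (mm - qm + 180.0) % 360.0 - 180.0
```

Wrapping is why a difference plot is easier to read than a scatter plot for torsions: an MM value of 179 degrees and a QM value of -179 degrees are only 2 degrees apart, but would sit at opposite corners of a scatter plot.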
