@jthorton
Contributor

Description

This PR adds the ability to export basic datasets to HDF5 files. Currently we extract the energies and gradients, but more properties could be included in the future.

Layout

Each molecule has its own group, keyed by its fixed-hydrogen-layer InChIKey, with the following datasets:

  • smiles: The mapped, explicit-hydrogen SMILES, which can be used to construct the molecule with the openff-toolkit via Molecule.from_mapped_smiles.
  • atomic_numbers: An array of the atomic numbers, type int16.
  • charge: The total charge on the molecule, calculated by the openff-toolkit as the sum of formal charges, type int16.
  • specification: The method:basis used to compute the results, type h5py string.
  • energies: An array of energies for the molecule in hartree, in the same order as the conformations, so energies[i] corresponds to conformations[i], type float64.
  • conformations: An array of conformations for the molecule in bohr, type float64.
  • gradients: An array of gradients in hartree / bohr, in the same order as the conformations, so gradients[i] corresponds to conformations[i], type float64.
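
To make the layout concrete, here is a minimal sketch of writing one molecule group with h5py. It is not the code in this PR: the helper names `write_molecule` and `read_text`, the group key, the `method:basis` string, the array shapes (n_conformations, n_atoms, 3), and the all-zero values are assumptions chosen only for illustration. Strings are stored as scalar variable-length strings here, which may differ from what the exporter actually does.

```python
import h5py
import numpy as np


def write_molecule(h5file, key, smiles, atomic_numbers, charge,
                   specification, energies, conformations, gradients):
    """Write one molecule group using the dataset names from the layout.

    Hypothetical helper for illustration; not part of this PR.
    """
    energies = np.asarray(energies, dtype=np.float64)
    conformations = np.asarray(conformations, dtype=np.float64)
    gradients = np.asarray(gradients, dtype=np.float64)
    # energies[i] and gradients[i] must line up with conformations[i].
    if not (len(energies) == len(conformations) == len(gradients)):
        raise ValueError("energies, conformations and gradients differ in length")

    group = h5file.create_group(key)
    text = h5py.string_dtype()  # variable-length UTF-8 string type
    group.create_dataset("smiles", data=smiles, dtype=text)
    group.create_dataset("atomic_numbers",
                         data=np.asarray(atomic_numbers, dtype=np.int16))
    group.create_dataset("charge", data=np.int16(charge))
    group.create_dataset("specification", data=specification, dtype=text)
    group.create_dataset("energies", data=energies)            # hartree
    group.create_dataset("conformations", data=conformations)  # bohr
    group.create_dataset("gradients", data=gradients)          # hartree / bohr
    return group


def read_text(dataset):
    """Return a scalar string dataset as str (h5py 3 returns bytes)."""
    value = dataset[()]
    return value.decode("utf-8") if isinstance(value, bytes) else str(value)


# In-memory file so the sketch leaves nothing on disk.
with h5py.File("layout_sketch.h5", "w", driver="core", backing_store=False) as f:
    write_molecule(
        f,
        key="PLACEHOLDER-INCHIKEY",      # assumed key, not a real InChIKey
        smiles="[H:2][O:1][H:3]",        # mapped SMILES for water
        atomic_numbers=[8, 1, 1],
        charge=0,
        specification="method:basis",    # placeholder specification
        energies=np.zeros(2),            # placeholder values, two conformations
        conformations=np.zeros((2, 3, 3)),
        gradients=np.zeros((2, 3, 3)),
    )
    group = f["PLACEHOLDER-INCHIKEY"]
    print(read_text(group["smiles"]), group["energies"].shape)
```

On the reading side, the stored smiles string is what would be passed to Molecule.from_mapped_smiles to rebuild the molecule, with the atom ordering of the arrays following the atom map.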

Todos

Notable points that this PR has either accomplished or will accomplish.

  • add tests

Questions

  • should we extract any other properties?

Status

  • Ready to go

@codecov

codecov bot commented Apr 14, 2022

Codecov Report

Merging #196 (6907573) into main (fff7590) will decrease coverage by 0.86%.
The diff coverage is 3.03%.

@mattwthompson
Member

I know this is old, but I bet this is still doable with a quick rebase/update. Still interested?

3 participants