
[WIP] LAMMPS Flows #1185


Open · wants to merge 207 commits into main

Conversation


@vir-k01 vir-k01 commented Apr 22, 2025

Summary

This is an effort that picks up from #173 to incorporate workflows to run LAMMPS in atomate2. Quite a bit of the initial code was taken from the atomate2-lammps add-on written by @ml-evs and @gbrunin. The input set generator and templates have been moved to pymatgen.io.lammps, and a concurrent PR has been opened to integrate them into pymatgen. Also tagging @esoteric-ephemera, who helped structure some of the code here, and @davidwaroquiers for their interest in this PR.

  • Function to call LAMMPS based on settings provided in the atomate2.yaml config file, including running in parallel with MPI.
  • Base Maker that generates inputs, runs LAMMPS, and parses the outputs into a LAMMPS TaskDoc.
  • Implemented sets and makers for common MD simulations (NVE/NVT/NPT), and a job to perform geometry minimization under an applied pressure.
  • Implemented a flow to melt, then quench, then thermalize a structure, suitable for creating liquids/glasses/general phase transformations.
  • Wrote a converter to parse LAMMPS dump files into ASE/pymatgen trajectories (this may need tweaking to match how other workflows handle this step).
  • Implemented a CustomLammpsMaker that takes a user-written input file (for jobs that aren't a simple combination of NVE/NVT/NPT steps, or for more complicated LAMMPS simulations) and user-specified settings. This maker is a port of the LAMMPS implementation in atomate, and I expect it to be the most used by existing LAMMPS users; see the usage sketch after this list.
  • Added mock_lammps and basic tests for the sets, jobs, and schemas.
  • Added a notebook under tutorials showing how to set up and use the makers.
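
A minimal usage sketch of the makers described above. The import paths, maker names, and keyword arguments here are assumptions for illustration only, not necessarily the final API:

# Hypothetical usage sketch: module paths and maker/keyword names are placeholders.
from jobflow import run_locally
from pymatgen.core import Structure

from atomate2.lammps.flows.core import MeltQuenchThermalizeMaker  # assumed path/name
from atomate2.lammps.jobs.core import CustomLammpsMaker           # assumed path/name

structure = Structure.from_file("POSCAR")

# Pre-built flow: melt, then quench, then thermalize the structure
flow = MeltQuenchThermalizeMaker().make(structure)
run_locally(flow, create_folders=True)

# Or pipe a hand-written LAMMPS input file into a custom job
custom_job = CustomLammpsMaker(input_file="in.lammps").make(structure)  # assumed kwarg
run_locally(custom_job, create_folders=True)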

TODO

  • These flows are primarily designed around solids (interfaced through the pymatgen Structure) with forcefields that rely on pair_styles (including MLIPs), and as such all design decisions, units, default values, and validation checks are tuned for solids. I'm open to suggestions on how the current implementation can be extended to molecules. (Any changes in this regard will also require changes to the pymatgen PR.)
  • Dump files are presently stored in their entirety as strings in the job store, and are additionally parsed and stored as an ASE/pymatgen trajectory if the user requests it (see the sketch after this list). This avoids parsing and storing exceedingly large dump files as the heavier trajectory objects; however, storing the raw files as strings could itself become prohibitively expensive for large classical MD simulations. Any suggestions on how to deal with such problems are appreciated!
  • I haven't tested these flows with Kokkos/GPU or the other LAMMPS add-ons yet.
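
A sketch of the dump-to-trajectory conversion mentioned above (illustrative only, not the PR's converter); it assumes a text-format dump file named dump.lammpstrj:

from ase.io import read
from pymatgen.core.trajectory import Trajectory
from pymatgen.io.ase import AseAtomsAdaptor

# Read every frame of a LAMMPS text dump as ASE Atoms objects
frames = read("dump.lammpstrj", format="lammps-dump-text", index=":")

# Convert to pymatgen Structures and wrap them as a Trajectory
# (NB: mapping LAMMPS atom types back to elements may need extra handling)
structures = [AseAtomsAdaptor.get_structure(atoms) for atoms in frames]
traj = Trajectory.from_structures(structures, constant_lattice=False)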

vir-k01 and others added 30 commits October 24, 2024 16:54
…cture, copied over all the work done on atomate2-lammps (by @ml-evs and @gbrunin at Matgenix). The basic functions have been implemented; the remaining tasks are to generate the right input sets for the wide range of LAMMPS calculations that can be done, and to write a task doc that can handle the outputs of these calculations. Will update run.py once I get it to work with a compiled version of LAMMPS for a simple test case.
… are probably better expressed in the LAMMPS_CMD, which is specified through the environment's atomate2.yaml file.
…rs from pmg, better handling of inputs to the makers. TODO: make init more readable, and allow for better management of how upstream generators call base set generator
… based on atomate2.ase.utils.TrajectoryObserver. Also accounted for reading in molecules and saving as a trajectory.
…s to json files for easier access, added utility funcs to process settings dicts
…ings to allow restart keyword to be provided in template
…and langevin/berendsen for now. Added nph as a thermostat too.
…for langevin, need for nve integrator for nvt/npt with non-nose-hoover
… take in TaskState and StoreTrajectoryOption objects from emmet for consistency
Member

JaGeo commented Apr 22, 2025

@vir-k01 Thank you!

Before I check out the code in more detail, a naive question: I am not an expert user of LAMMPS, but as far as I know, a Python interface to LAMMPS can be compiled. Are there drawbacks to using this interface rather than input files?

Author

vir-k01 commented Apr 23, 2025

@JaGeo Yes, that's a very valid question. I personally haven't used the Python interface to LAMMPS much, but from what I know of it, it's essentially equivalent to writing templates, since it does not provide actual objects to work with (other than the command-line runner). I'm sure the LAMMPS Python interface has its uses, but in the context of this PR I think templated input files offer a lot more flexibility: anything that isn't a simple NVT/NPT MD run is difficult to express as a structured input set. This way, the user can prepare their input file with whatever approach they're comfortable with (the pymatgen interface to LAMMPS, the native Python LAMMPS interface, the ASE interface, or just a text editor) and pipe that input file into the flows here.
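
To illustrate the "prepare the input however you like and pipe it in" idea, here is a sketch (not this PR's implementation) that fills a user-written template with plain Python before handing it to a CustomLammpsMaker-style job:

from string import Template

# in.template is a user-written LAMMPS input with $-placeholders, e.g.:
#   velocity all create $temperature 42
#   fix 1 all nvt temp $temperature $temperature 0.1
#   run $nsteps
with open("in.template") as f:
    template = Template(f.read())

# Fill the placeholders; safe_substitute leaves unknown keys untouched
with open("in.lammps", "w") as f:
    f.write(template.safe_substitute(temperature=300.0, nsteps=100000))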

Member

JaGeo commented Apr 23, 2025

@vir-k01 Thank you very much for the answer! That sounds very good!

Contributor

@gpetretto gpetretto left a comment


Hi @vir-k01,
thanks a lot for all this work.
Since we are planning to use this, I have made a first review of the code and left some comments.

I have already mentioned it in the comments, but my main concern is how to handle the connection between different jobs. At the moment it seems that the only automation allowed would be passing the output Structure of one Job to the next one. However, for this kind of MD simulation it seems that all the additional information (like velocities and thermostat state) would be necessary for a meaningful connection.
One potential solution would be to use the "restart" feature in LAMMPS. From a quick test, the size of the generated restart file (or of the "data" file from write_data) is relatively small (~50 KB for a system with ~400 atoms). It may be an idea to always write the restart at the end, so that the jobs are always composable, or alternatively at least to ease the addition of write_restart/write_data to the input file in case one wants to chain jobs; a sketch of this idea follows below.
Is there any other way to better ensure the transfer of the required information between two different executions of LAMMPS?
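
A sketch of the composability idea (not part of the PR; the helper name and file names are only illustrative). write_restart, write_data, and read_restart are standard LAMMPS commands:

RESTART_FOOTER = """
write_restart final.restart
write_data    final.data
"""

def append_restart_footer(input_file: str = "in.lammps") -> None:
    # append the footer so every job leaves behind a restartable state
    with open(input_file, "a") as f:
        f.write(RESTART_FOOTER)

# A follow-up job's input could then begin with
#   read_restart final.restart
# instead of reading a fresh data file, carrying over velocities and
# (for supported fixes) thermostat state.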

import numpy as np


class LammpsNVESet(BaseLammpsSetGenerator):
Contributor


The generators should probably be made as dataclasses. Otherwise all the attributes below will be seen as class attributes, instead of instance attributes.

Author


Oh shoot, I did not realize that. I wrote out the __init__ functions this way for the core set generators to allow the user to provide keywords they normally would (such as temperature, nsteps, etc. for NVT) without having to create a LammpsSettings object beforehand. I can make this change, but won't that still require writing out an __init__ function, since this logic can't be moved into __post_init__? (See the sketch below for one possibility.)
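
A minimal sketch of how the keyword convenience could survive the move to dataclasses (illustrative only; the class and field names are placeholders, and the real base class would differ):

from dataclasses import dataclass, field


@dataclass
class BaseLammpsSetGenerator:  # stand-in for the actual base class
    settings: dict = field(default_factory=dict)


@dataclass
class LammpsNVTSet(BaseLammpsSetGenerator):
    temperature: float = 300.0
    nsteps: int = 10000

    def __post_init__(self) -> None:
        # fold the convenience keywords into the settings dict,
        # which is what the hand-written __init__ currently does
        self.settings.setdefault("temperature", self.temperature)
        self.settings.setdefault("nsteps", self.nsteps)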

Lammps input set for NVE MD simulations.
"""
ensemble : MDEnsemble = MDEnsemble.nve
settings : dict = {}
Contributor


Only a dictionary? Why not also allow LammpsSettings, as in the base class?

"""
Lammps input set for NVE MD simulations.
"""
ensemble : MDEnsemble = MDEnsemble.nve
Contributor


Is there a point to this argument, or at least to having it be an atomate2.ase.md.MDEnsemble? I could understand if there were a base class with common code shared among LammpsNVESet, LammpsNVTSet, etc., with each subclass just setting the value of ensemble. But since the class is called LammpsNVESet, is there any other meaningful value to set for ensemble here? The same goes for the other generators.

ensemble : MDEnsemble = MDEnsemble.nve
settings : dict = {}

def __init__(self, **kwargs):
Contributor


Probably making these generators dataclasses and replacing this with a __post_init__ would remove the need for a good part of the code in the __init__?

'''
def __init__(self, dumpfile, store_md_outputs : StoreTrajectoryOption = StoreTrajectoryOption.NO, read_index: str | int = ':') -> None:
self.store_md_outputs = store_md_outputs
self.traj = read(dumpfile, index=read_index) if isinstance(read_index, str) else [read(dumpfile, index=read_index)]
Contributor


This is not entirely correct. In fact, if read_index contains an integer index in the form of a string (for example "1"), read still returns a single Atoms object. I understand that this is a particular case, but it still makes the simple isinstance(read_index, str) check potentially incorrect. A sketch of a more robust normalization follows below.
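
A sketch of a more robust normalization (illustrative only): branch on the return value of read rather than on the type of read_index:

from ase import Atoms
from ase.io import read


def read_frames(dumpfile: str, read_index: str | int = ":") -> list[Atoms]:
    # ase.io.read returns a single Atoms for integer-like indices (including "1")
    # and a list of Atoms for slice strings such as ":", so check the result instead
    result = read(dumpfile, index=read_index)
    return result if isinstance(result, list) else [result]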
