QUIP/LAMMPS different energies for different number of cores

Hello,

I am trying to fit a GAP potential for silicon and carbon. I have built the QUIP library with the architecture linux_x86_64_gfortran. To train the GAP I use the following command:

gap_fit energy_parameter_name=energy force_parameter_name=forces do_copy_at_file=F sparse_separate_file=T gp_file=GAP.xml at_file=train.xyz  gap={distance_Nb order=2 cutoff=4.5 n_sparse=15 covariance_type=ard_se delta=5 theta_uniform=2.0 sparse_method=uniform compact_clusters=T : distance_Nb order=3 cutoff=3.5 n_sparse=200 covariance_type=ard_se delta=0.3 theta_uniform=2.0 sparse_method=uniform compact_clusters=T : soap atom_sigma=0.5 l_max=8 n_max=8 cutoff=4.5 cutoff_transition_width=1.0 delta=0.1 covariance_type=dot_product n_sparse=2000 zeta=4} default_sigma={0.005 0.08 0.0 0.0} config_type_sigma={dimer:0.0008:0.02:0.0:0.0:single:0.0001:0.005:0.0:0.0:perfect_crystall:0.00001:0.01:0.0:0.0:interstitials:0.0002:0.05:0.0:0.0} sparse_jitter=1e-8 gp_file=GAP.xml

I now want to run larger simulations in LAMMPS. I tried several LAMMPS versions (lammps_stable_sep_2021, stable_5Jun2019, stable_29Aug2024) and installed the QUIP pair_style both by building with CMake directly and by linking against the libquip.a library.

When I run LAMMPS with MPI on multiple cores and on a single core, the resulting energies differ considerably. For a cell of 385 atoms, the energy difference is around 3 eV (roughly 8 meV per atom). The pair_coeff line for the QUIP pair_style looks as follows:

pair_coeff * * GAP/29_08/without_stress/GAP.xml "Potential xml_label=GAP_2024_8_30_120_8_42_16_201" 14 6


If I change all silicon atoms to carbon, or all carbon atoms to silicon, the MPI and single-core runs give the same results. I see the problem for both larger and smaller cells, with and without defects. Not all GAP potentials I have trained so far show this dependence on the number of cores; the one above is just an example where it occurs.

Does anyone have an idea where the discrepancy comes from, and whether it will affect the physical results of longer and more realistic runs?

[log_serial.txt](https://github.com/user-attachments/files/16855989/log_serial.txt)
[log_parallel.txt](https://github.com/user-attachments/files/16855990/log_parallel.txt)
[training.txt](https://github.com/user-attachments/files/16855991/training.txt)
[h_110_1.txt](https://github.com/user-attachments/files/16855992/h_110_1.txt)
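
For anyone who wants to quantify the serial/parallel discrepancy from the two logs, here is a minimal Python sketch. It assumes the standard LAMMPS thermo layout (a header line starting with `Step`, numeric rows, then a `Loop time` line). The helper names `read_thermo` and `energy_gap`, and the default column name `PotEng`, are hypothetical choices for illustration; the column has to match whatever the `thermo_style` of the run actually prints (for example `E_pair` or `TotEng`).

```python
def read_thermo(log_text):
    """Return a list of {column: value} dicts parsed from LAMMPS thermo output."""
    rows, header = [], None
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "Step":
            header = fields          # start of a thermo block
        elif fields[0] == "Loop":
            header = None            # "Loop time of ..." ends the block
        elif header is not None and len(fields) == len(header):
            try:
                rows.append(dict(zip(header, map(float, fields))))
            except ValueError:
                pass                 # skip warnings or other non-numeric lines
    return rows


def energy_gap(serial_log, parallel_log, column="PotEng", n_atoms=385):
    """Difference (parallel - serial) of the first thermo row, total and per atom.

    `column` is an assumption and must match the thermo output of the run;
    `n_atoms` defaults to the 385-atom cell mentioned above.
    """
    e_serial = read_thermo(serial_log)[0][column]
    e_parallel = read_thermo(parallel_log)[0][column]
    diff = e_parallel - e_serial
    return diff, diff / n_atoms
```

The functions take the log contents as strings, so the attached files would be passed in as `energy_gap(open("log_serial.txt").read(), open("log_parallel.txt").read())`. Comparing only the first thermo row (step 0) isolates the potential evaluation itself from any divergence that builds up during a trajectory.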
