The preprocessing command accepts `.xyz`, `.lmdb`/`.aselmdb`, and `.h5` inputs; LMDB datasets are automatically converted to the native HDF5 format before statistics are computed. XYZ files are parsed through ASE so that lattice vectors, species labels, and per-configuration metadata are retained. The generated HDF5 archive is a lightweight collection of numbered groups where each entry stores positions, atomic numbers, energy, optional forces and stress, the cell matrix, and periodic boundary conditions. Precomputed statistics (means, standard deviations, cutoff radius, atomic energies) are stored alongside and reused by the training entry points.

Under the hood, each processed file is organised as:

- `/structures`: per-configuration metadata (cell, energy, stress, weights, etc.) and pointers into the per-atom arrays.
- `/positions`, `/forces`, `/atomic_numbers`: flat, chunked arrays sized by the total number of atoms across the dataset. Random reads only touch the slices required for a batch.

This layout keeps the HDF5 file compact even for tens of millions of structures: chunked per-atom arrays avoid the pointer-chasing overhead of variable-length fields, enabling efficient multi-worker dataloaders that issue many small reads concurrently.
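
For orientation, here is a minimal `h5py` sketch of the random-read pattern this layout enables. The pointer field names (`ptr`, `natoms`) are illustrative assumptions, not the exact Equitrain schema:

```python
import h5py

# Random access into the processed file: read one configuration's atoms.
# The /structures field names below ("ptr", "natoms") are hypothetical;
# only the group layout described above is documented.
with h5py.File("train.h5", "r") as f:
    i = 42                                   # configuration index
    start = int(f["structures/ptr"][i])      # hypothetical offset into the per-atom arrays
    count = int(f["structures/natoms"][i])   # hypothetical atom count for this entry
    positions = f["positions"][start:start + count]      # (count, 3) slice; only this chunk is read
    numbers = f["atomic_numbers"][start:start + count]   # (count,) atomic numbers
    forces = f["forces"][start:start + count]            # present when the source data had forces
```
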
<!-- TODO: change this following a notebook style -->
Utilities and examples for working with pre-trained [MACE](https://github.com/ACEsuit/mace) foundation models in the JAX backend via the companion [mace-jax](https://github.com/ACEsuit/mace-jax) project.
## Contents
- `convert_foundation_to_jax.py` – downloads a Torch MACE foundation model (e.g. the `mp` “small” checkpoint), converts it to MACE-JAX parameters using `mace_jax.cli.mace_torch2jax`, and writes a ready-to-use bundle (`config.json` + `params.msgpack`).
## Usage
Activate an environment that has both `mace` and `mace-jax` installed (including the optional `cuequivariance` extras when available), then run:
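
For example, a representative invocation using the flags described below (any additional options are assumptions and may differ):

```bash
python convert_foundation_to_jax.py --source mp --model small
```
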
This produces a directory containing the serialized parameters and a JSON configuration that can be passed directly to Equitrain’s JAX backend (`--model path/to/bundle`) or loaded with the utilities in `mace_jax.tools`.
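
As a rough sketch, the bundle can also be inspected by hand, assuming the msgpack payload uses Flax's serialization framing (an assumption; `mace_jax.tools` provides the supported loaders):

```python
import json
from pathlib import Path

from flax import serialization

bundle = Path("path/to/bundle")
# Model hyperparameters stored next to the weights.
config = json.loads((bundle / "config.json").read_text())
# Restore the raw parameter pytree (assumes Flax msgpack framing).
params = serialization.msgpack_restore((bundle / "params.msgpack").read_bytes())
print(sorted(config), type(params))
```
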
Use `--source` to pick a different foundation family (`mp`, `off`, `anicc`, `omol`) and `--model` to select a specific variant when multiple sizes exist.
## Dependencies
The script relies on the optional `mace` and `mace-jax` stacks, including their CUDA-enabled cuequivariance extensions. Install them via:
```bash
pip install equitrain[mace,jax] # or the corresponding mace/mace-jax wheels
```
If the cuequivariance libraries are unavailable, the script will exit after downloading the Torch model; the export step itself requires the accelerated kernels to be importable. Run `python -c "import mace_jax, cuequivariance_ops_torch"` to check whether your environment is configured correctly.