@bhcao commented Jan 26, 2026

These are two relatively fast models. LiTEN performs as well as ViSNet on SPICE2 (force MAE = 14 meV), while So3krates performs worse (force MAE = 29 meV). However, So3krates has a small parameter count and is fast (600k parameters, roughly a 2-3x speedup), and it performs well on some small datasets (such as MD17, as reported in its paper).

You are free to decide whether to accept these models. If not, perhaps an archive branch could store them so that others can compare them or use them on specific small datasets?

Here are the hyperparameters I used (possibly not the best):

```yaml
# LiTEN
num_epochs: 100
init_learning_rate: 2e-5
peak_learning_rate: 1e-4
final_learning_rate: 2e-5
loss: HuberLoss
# other parameters are default

# So3krates
num_epochs: 60
init_learning_rate: 2e-4
peak_learning_rate: 1e-3
final_learning_rate: 2e-4
loss: MSELoss
# Model hyperparams
chi_irreps: 1e+2e+3e+4e
# other parameters are default
```
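The init/peak/final learning rates above suggest a warmup-then-decay schedule. A minimal sketch of that shape (my own assumption of linear warmup followed by cosine decay; `lr_schedule`, `warmup_steps`, and `total_steps` are hypothetical names, not necessarily what the library uses):

```python
import math

def lr_schedule(step, *, init_lr, peak_lr, final_lr, warmup_steps, total_steps):
    """Linear warmup from init_lr to peak_lr, then cosine decay to final_lr.

    NOTE: a sketch of the assumed schedule shape, not the repo's actual code.
    """
    if step < warmup_steps:
        # Linear ramp: init_lr at step 0, peak_lr at step == warmup_steps.
        frac = step / warmup_steps
        return init_lr + frac * (peak_lr - init_lr)
    # Cosine decay: peak_lr right after warmup, final_lr at total_steps.
    frac = (step - warmup_steps) / (total_steps - warmup_steps)
    return final_lr + 0.5 * (peak_lr - final_lr) * (1.0 + math.cos(math.pi * frac))

# E.g., the LiTEN settings: init 2e-5, peak 1e-4, final 2e-5.
```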

src/mlip/models/radial_basis.py and src/mlip/models/cutoff.py collect all the radial bases and cutoff functions that may be used.
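For illustration, here is the kind of pairing such collections typically contain (a sketch only; the actual names and signatures in those files may differ): a Bessel-style radial basis and a cosine cutoff.

```python
import numpy as np

def bessel_basis(r, cutoff, num_basis=8):
    """sin(n*pi*r/cutoff)/r radial basis, as popularized by DimeNet-style models.

    NOTE: an illustrative sketch, not the repo's actual implementation.
    """
    n = np.arange(1, num_basis + 1)
    r = np.asarray(r, dtype=float)[..., None]  # broadcast r against the n basis indices
    return np.sqrt(2.0 / cutoff) * np.sin(n * np.pi * r / cutoff) / r

def cosine_cutoff(r, cutoff):
    """Smoothly decays from 1 at r=0 to 0 at r=cutoff, and is 0 beyond."""
    r = np.asarray(r, dtype=float)
    return np.where(r < cutoff, 0.5 * (1.0 + np.cos(np.pi * r / cutoff)), 0.0)
```

A cutoff like this is what makes the radial features (and hence the predicted energy) go smoothly to zero at the edge of the neighbor sphere.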

@chrbrunk (Collaborator) commented

Thank you very much for the contribution.

Thanks to your implementation, we'll now look into these models in a testing phase, and we'll do a review phase of the code if that looks promising. We'll try to proceed with this as soon as possible; however, note that it might still take a little while. I'll let you know when I have any updates to share.

@bhcao changed the title from "Implement liten and so3krates" to "Implement Liten, So3krates and EquiformerV2" on Jan 30, 2026

@bhcao commented Jan 30, 2026

I have submitted the code for EquiformerV2. Please note:

  1. Due to the large size of the model and the many available options, I only tested a limited number of options and ran only one epoch of training to check whether the loss could go down. Therefore, there may still be some bugs.
  2. The features I modified include: direct force prediction, calculation of the average number of atoms in the dataset info, and passing rng and n_node as inputs to support dropout and random rotation matrix construction. However, I have not yet implemented discarding the direct force head after stage one.
  3. My implementation is semantically equivalent to EquiformerV2, but not exactly the same. Some features that were planned for eSEN, such as changing the order of the representations and building rotation matrices through gamma instead of the original, more complex method, are also used in my implementation.
  4. Regarding the SO(2) singularity, my solution differs from eSEN's: instead of truncating the backward gradient, I add an epsilon and use a Chebyshev polynomial in place of arctan. The original EquiformerV2 does not support autograd forces, so it never fixed this issue.
  5. I still have doubts about EquiformerV2's performance on molecular datasets. It has shown excellent performance on OC20, but it may not perform well on molecular data. I found that it does not use any radial envelope at all, which may hinder its performance.
  6. I have not implemented truncating edges that exceed max_neighbors_per_atom, which is often used in stage one to reduce computational complexity and can be restored in stage two. This can be added after v0.2.0.
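The stability trick in point 4 can be illustrated roughly as follows (a sketch under my own assumptions, not this PR's actual code; the function names are hypothetical). The idea is to compute the sine/cosine of the edge's azimuthal angle directly from its components, with an epsilon keeping the gradient finite when the edge is (nearly) aligned with the singular axis, instead of differentiating through arctan:

```python
import numpy as np

def azimuth_sin_cos_stable(edge_vec, eps=1e-8):
    """(sin, cos) of the azimuthal angle of each edge vector, computed from the
    components themselves. The eps in the norm keeps the backward gradient
    finite when the in-plane projection of the edge vanishes.

    NOTE: an illustrative sketch; names and conventions are assumptions.
    """
    x, z = edge_vec[..., 0], edge_vec[..., 2]
    norm = np.sqrt(x * x + z * z) + eps
    return z / norm, x / norm

def azimuth_sin_cos_arctan(edge_vec):
    """Reference (non-stabilized) path via arctan2, for comparison."""
    angle = np.arctan2(edge_vec[..., 2], edge_vec[..., 0])
    return np.sin(angle), np.cos(angle)
```

Away from the singular direction the two paths agree to within eps; the difference only matters for the gradients near the degenerate configuration.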
