Replies: 2 comments 4 replies
Thank you for your interest in our code, Pezhman. Development currently happens in a private repository, with updates pushed to the public one soon and an eventual transition to doing development in the public repository as well. We are currently working on improving the infrastructure for launching training, both directly (as is done currently) and through AiiDA. The idea is that the "direct" option would be suitable for smaller runs that fit on a single machine, while AiiDA will offer more generality and better use of HPC resources. I can't give you a specific timeline, but it is something we are working on at the moment. As for using …
It depends a bit on how … In that case, you only really need to modify Line 123 in 9fc12fd to something like … If you're dealing with a more complicated situation, where you need to work out the pinning/mapping yourself, I can also share more details from a modification done for …
Hi,
Thanks for making the code public. It seems very promising and efficient.
I'm wondering if you have a short-term plan to make it scheduler-friendly. More specifically, I would be interested in training the potential on HPC systems with a Slurm/SGE scheduler, where we launch calculations through wrappers such as `srun`.

I gave it a try by directly calling `mpirun`, but the jobs are crashing. I haven't investigated the failure in depth yet, but it seems that training is being carried out while the `scaling.dat` file is not yet available, so the code crashes. Any hint or suggestion would be highly appreciated.
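As a stopgap while debugging, one generic workaround for this kind of race (a dependent step starting before `scaling.dat` has been written) is to poll for the file before launching whatever reads it. This is only a sketch of that general technique, not code from this project; `wait_for_file` and the timeout values are hypothetical names I'm introducing for illustration.

```python
import os
import time


def wait_for_file(path, timeout=600.0, poll=1.0):
    """Block until `path` exists and is non-empty, or raise TimeoutError.

    Hypothetical helper: poll the filesystem every `poll` seconds for up
    to `timeout` seconds before giving up.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path) and os.path.getsize(path) > 0:
            return True
        time.sleep(poll)
    raise TimeoutError(f"{path} did not appear within {timeout} s")


if __name__ == "__main__":
    # E.g., wait for the scaling data before starting the step that needs it.
    wait_for_file("scaling.dat", timeout=300.0)
```

In a Slurm job script, a call like this could sit between the training launch and the step that crashes, so the second step only starts once the file is actually on disk (NFS/Lustre visibility delays can also make a freshly written file appear late on other nodes).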
Best regards,
Pezhman