LUMI setup
Recommended reads:
- https://docs.lumi-supercomputer.eu/storage/
- https://docs.lumi-supercomputer.eu/runjobs/
- https://docs.lumi-supercomputer.eu/software/installing/container-wrapper/
- https://github.com/paracrawl/cirrus-scripts#readme
- https://docs.google.com/document/d/1YyjdWofZ65ib9qTnGiJ8n0Rvgm4PKRhwvnFYfXrSMRg/edit?usp=sharing
- https://docs.google.com/presentation/d/1zRPEm2QM3MSrmE6894U-E4TAhFH1cEz_pbpH3895dqA/edit?usp=sharing
Please ignore the "Compiling Software" section in the README and follow these steps instead. The conda container that will be used contains most of the software needed.
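If you ever need to build your own containerized conda environment instead of using the existing one, a minimal sketch using LUMI's container wrapper (see the docs linked above) might look like the following; the environment file name and the install prefix are hypothetical examples:

```bash
# Load the LUMI software stack and the container wrapper tool
module load LUMI
module load lumi-container-wrapper

# Build a containerized conda environment from an environment file
# (env.yml and the --prefix path are placeholders -- adjust to your setup)
conda-containerize new --prefix /projappl/project_462000252/$USER/bitextor-env env.yml
```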
Clone the repo (do not clone recursively), switch to the `lumi` branch, and initialize only the needed submodule:
```bash
git clone https://github.com/paracrawl/cirrus-scripts
cd cirrus-scripts
git checkout lumi
git submodule update --init env/src/preprocess
```
Edit `env/init.d/lumi.sh` and set the `PATH` variable to the `bin` directory of the conda container. It is currently set to `project_462000252/zaragoza/bitextor-8.1/bin`, which is a working environment you can use, so there is no need to change it.
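For reference, the relevant line in `env/init.d/lumi.sh` might look roughly like this; this is a sketch, and the leading `/projappl` prefix is an assumption (the original only gives the project-relative path):

```bash
# Prepend the conda container's bin directory to PATH
# (assuming the environment lives under /projappl)
export PATH="/projappl/project_462000252/zaragoza/bitextor-8.1/bin:$PATH"
```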
Edit `config.d/10.lumi.sh` to set up the working directories for processed data (a hedged example follows the list):

- Change `PROJ_DIR` and `SCRATCH_DIR` to your directories in the projappl and scratch partitions of the project (e.g. `/projappl/project_462000252/user`). The project partition will be used to store the code and models, scratch to store the data.
- Set up collection names and directories. For the test runs there is no need to make additional changes, only to copy the data (explained afterwards).
- Other relevant variables that may not need modifications for the test runs:
  - `SBATCH_ACCOUNT` specifies the project that will be billed for the computing hours.
  - `SBATCH_PARTITION`: we will be using `small` for the test but will probably change to `standard`.
  - `SBATCH_MEM_PER_CPU`: only needed for the `small` partition. Comment this line out for the `standard` partition.
  - `SLURM_LOGS`: directory to store the logs of all the jobs. THIS DIRECTORY NEEDS TO BE CREATED before running jobs, otherwise they will fail. Also note that this directory grows significantly in number of files, so make sure to clean it from time to time.
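A minimal sketch of what these settings in `config.d/10.lumi.sh` could look like; the variable names come from the list above, but every value here is a hypothetical example (your project number, username, memory setting, and directory layout will differ):

```bash
# Example values only -- adjust to your own project and username
PROJ_DIR=/projappl/project_462000252/$USER
SCRATCH_DIR=/scratch/project_462000252/$USER

# Project billed for the computing hours
SBATCH_ACCOUNT=project_462000252

# "small" for the test runs; switch to "standard" later
SBATCH_PARTITION=small

# Only needed for the small partition; comment out for standard
SBATCH_MEM_PER_CPU=1750

# Job logs go here; the directory must exist before submitting jobs
SLURM_LOGS=$SCRATCH_DIR/slurm-logs
```

Remember to create the logs directory before submitting anything, e.g. `mkdir -p "$SCRATCH_DIR/slurm-logs"` with the path you chose for `SLURM_LOGS`.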