This repository contains a Docker Compose setup that generates an environment to locally test jobflow, jobflow-remote and atomate2.
This Docker Compose setup provides three integrated containers:
- JupyterLab with atomate2 and jobflow-remote preconfigured and ready to use.
- SLURM with SSH access to simulate an HPC cluster.
- MongoDB database for workflow data storage.
Repository layout:

```
.
├── docker-compose.yml
├── .env
├── slurm/
│   ├── Dockerfile
│   └── slurm_startup.sh
├── jupyter/
│   └── Dockerfile
├── config/
│   └── jfremote_template.yaml
├── notebooks/
│   └── (jupyter notebooks for the hands-on sessions)
└── shared/
    └── (folder mounted inside the jupyter container)
```
Before building, the project name can be configured. To do so, open the .env file with a text editor and set the PROJECTNAME value. This is not mandatory, as a default value (test_project) is already set.
In addition, some of the usernames and ports can be configured in the .env file, in case the default values clash with some of your local services.
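For reference, a minimal .env could look like the sketch below. The variable names match the ones used in the rest of this guide, but check the file shipped with the repository for the actual entries and defaults.

```bash
# Sketch of a possible .env - the exact variables and defaults may differ
# from the file shipped with the repository.
PROJECTNAME=test_project   # project name used by the setup
JUPYTER_PORT=8888          # JupyterLab port exposed on the host
SSH_PORT=2222              # SSH port of the SLURM container on the host
MONGODB_PORT=27018         # MongoDB port exposed on the host
WEB_APP_PORT=5001          # jobflow-remote GUI port on the host
SLURM_USERNAME=atomate     # user created in the SLURM container
```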
To launch the environment, clone the repository, navigate to the project directory and run:
```bash
docker-compose up -d
```

Once the containers are running, verify their status with:

```bash
docker ps
```

If everything is set up correctly, you should see the containers listed (e.g., jobflow_tutorial-jupyter-1, jobflow_tutorial-slurm-1, jobflow_tutorial-mongodb-1).
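If one of the containers is missing, its startup logs usually show why. The service names are assumed to match the container names above:

```bash
# Follow the startup logs of a single service (Ctrl+C to stop following)
docker-compose logs -f jupyter

# Or show the last lines of the logs of all services at once
docker-compose logs --tail=50
```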
Based on a JupyterLab container, this is the main entry point for the execution of workflows.
The base python environment includes atomate2, jobflow, jobflow-remote and all related
packages required to execute the workflows.
Jobflow-remote is already fully configured.
Connect to http://localhost:8888 (or your custom JUPYTER_PORT) and use atomate as the password.
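Once logged in, you can open a terminal inside JupyterLab and check that the jobflow-remote configuration is picked up. A quick check, assuming the standard jobflow-remote CLI:

```bash
# Verify the jobflow-remote project configuration and the connections
# to the worker and the queue/output stores
jf project check
```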
The container can also be accessed through a shell. Run

```bash
docker container list
```

to get the name of the container (similar to jobflow_tutorial-jupyter-1) and launch a bash session inside the container with

```bash
docker exec -it <container-name> /bin/bash
```

The SLURM container provides a local worker with SLURM to mimic the execution on an HPC cluster. It includes the same python packages as the jupyter container and can be accessed through ssh with

```bash
ssh -p 2222 atomate@localhost
```

(or your custom SSH_PORT and SLURM_USERNAME) from the local machine, or with

```bash
ssh atomate@slurm
```

from the jupyter container. The password is the same as the username.
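As a quick smoke test of the SLURM worker, you can log in and submit a trivial job. This is only a sketch using standard SLURM commands, not a required step:

```bash
# Inside the SLURM container (e.g. after `ssh -p 2222 atomate@localhost`):
sinfo                      # list the available partitions and nodes
sbatch --wrap "hostname"   # submit a minimal test job
squeue -u atomate          # check the job queue
```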
The MongoDB database is accessible on localhost:27018 (or your custom MONGODB_PORT) from the local machine and on mongodb:27017 from the jupyter container. There is no password protection on the database.
It may be instructive to explore the content of the database with a GUI like
MongoDB Compass.
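If you prefer the command line, the database can also be inspected with mongosh from the local machine. This is a sketch that assumes mongosh is installed locally; the database name used by jobflow-remote depends on the configuration, so the one below is only an example:

```bash
# List the databases on the tutorial MongoDB instance
mongosh "mongodb://localhost:27018" --eval "db.adminCommand({ listDatabases: 1 })"

# List the collections of one database (replace the name with one
# returned by the previous command)
mongosh "mongodb://localhost:27018/test_project" --eval "db.getCollectionNames()"
```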
To ensure data persistence across container restarts, the following volumes are mounted:
- JupyterLab (jupyter_data): stores data in /home/jovyan/work. Useful files are copied into this folder at container startup.
- SLURM (slurm_data): holds job execution data in /home/${SLURM_USERNAME:-atomate}/jobs.
- MongoDB (mongodb_data): persists database records in /data/db.
These volumes allow workflows and job results to be retained even if the containers are stopped or rebuilt.
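The volumes can be listed and inspected from the host. Docker Compose prefixes the volume names with the project name, so the exact names on your machine may differ from the example below:

```bash
# List the volumes created by this setup
docker volume ls

# Show the details of one volume (the name assumes the compose project
# is called jobflow_tutorial, as in the container names above)
docker volume inspect jobflow_tutorial_jupyter_data
```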
The jobflow-remote GUI can be started in the jupyter container by running

```bash
jf gui
```

and can be accessed from the local machine by connecting to http://localhost:5001 (or your custom WEB_APP_PORT).
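As an alternative to the GUI, the state of jobs and flows can be inspected from a terminal in the jupyter container. A sketch, assuming the standard jobflow-remote CLI subcommands:

```bash
# List the jobs and flows known to jobflow-remote
jf job list
jf flow list

# Show detailed information on a single job (replace <db_id> with an id
# taken from the listing above)
jf job info <db_id>
```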
The local notebooks folder is copied into the ~/work folder of the jupyter container.
Warning: switching the containers on and off will not overwrite the content of the ~/work/notebooks folder, but deleting the associated jupyter_data volume will delete the files there.
The ~/work/develop folder in the jupyter container is added to the PYTHONPATH in the jobflow-remote configuration, so it can be used to store newly developed workflows that will be recognised by the local_shell worker.
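For example, a small package dropped into that folder becomes importable by the worker without reinstalling anything. The package and function names below are made up for illustration:

```bash
# Create a hypothetical package "my_workflows" in the develop folder
mkdir -p ~/work/develop/my_workflows
cat > ~/work/develop/my_workflows/__init__.py <<'EOF'
from jobflow import job

@job
def add(a, b):
    """A trivial job used to check that the develop folder is importable."""
    return a + b
EOF
```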
To stop the containers, run:

```bash
docker-compose down
```

To remove all data volumes:

```bash
docker-compose down -v
```

Caution: this will delete all the notebooks, job files and DB content in the containers.