If Snakemake is having trouble detecting job failures, you can provide a custom
script to --cluster-generic-status-cmd. You can either do this at the command line
or add it to your config.v8+.yml.
Important: To use any of these, you must add the flag --parsable to the call to sbatch in the field cluster-generic-submit-cmd in config.v8+.yml.
Important: These scripts must be executable: chmod +x <script>
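The reason for the --parsable requirement can be illustrated with a small sketch. With --parsable, sbatch prints only the job ID (optionally followed by a semicolon and the cluster name); without it, sbatch prints a sentence such as "Submitted batch job 12345", and the status script would receive a word instead of a job ID. The helper name extract_job_id below is hypothetical and is not taken from any of the scripts listed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper for illustration only: pull a numeric job ID out of
# whatever sbatch printed at submission time.

extract_job_id() {
  local submit_output="$1"
  # With --parsable, sbatch prints "<jobid>" or "<jobid>;<cluster>".
  # Strip everything from the first semicolon onward.
  local job_id="${submit_output%%;*}"
  if [[ "$job_id" =~ ^[0-9]+$ ]]; then
    echo "$job_id"
  else
    # Without --parsable the output is a sentence, not a bare job ID.
    echo "Invalid job ID: $submit_output (is --parsable set?)" >&2
    return 1
  fi
}
```

For example, `extract_job_id "12345;cluster1"` prints `12345`, while `extract_job_id "Submitted batch job 12345"` fails with a message on stderr.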
Sources:
- Snakemake documentation for --cluster-generic-status-cmd
- slurm-status.py from the official Slurm profile
Scripts:
- status-sacct.sh - Bash script that uses sacct to determine job status (recommended)
- status-sacct.py - Python script that uses sacct to determine job status
- status-scontrol.sh - Bash script that uses scontrol to determine job status
- status-sacct-multi.sh - Bash script that uses sacct to determine job status in a multi-cluster setup
- status-sacct-robust.sh - Bash script that re-runs sacct multiple times if it fails to return a valid status
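To give a sense of what a sacct-based status script does, here is a minimal sketch. It is not the contents of status-sacct.sh. Snakemake passes the job ID as the first argument and expects the script to print one of success, running, or failed. The function name slurm_state_to_status and the grouping of Slurm states into "running" are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch of a status script; the state grouping is an assumption.

# Map a Slurm job state to one of the three words Snakemake expects.
slurm_state_to_status() {
  case "$1" in
    COMPLETED*)
      echo success ;;
    RUNNING*|PENDING*|COMPLETING*|CONFIGURING*|SUSPENDED*)
      echo running ;;
    *)
      # Everything else (FAILED, CANCELLED+, TIMEOUT, empty output, ...)
      # is reported as a failure.
      echo failed ;;
  esac
}

# Only query Slurm when a job ID was passed as the first argument.
if [[ $# -ge 1 ]]; then
  jobid="$1"
  # Take the state of the first record that sacct reports for this job.
  state="$(sacct -j "$jobid" --format=State --noheader | head -n 1 | awk '{print $1}')"
  slurm_state_to_status "$state"
fi
```

Note that in this sketch an empty reply from sacct is reported as failed, which is the situation that re-running sacct, as status-sacct-robust.sh does, is meant to guard against.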
I prefer to keep scripts as simple as possible. However, if none of the above simple scripts are sufficient for your use case, I recommend you try one of the following available status scripts:
- slurm-status.py from the official profile