ood-proteinfold is an Open OnDemand interactive app for computational protein structure prediction using Proteinfold (AlphaFold2, Boltz, ColabFold, ESMFold, and RoseTTAFold-All-Atom) originally built for UNSW's Katana HPC.
- Predict protein structures from amino acid sequences, FASTA files, or samplesheets.
- Supports multiple prediction methods found within nf-core/proteinfold.
- Customisable run parameters via web form.
- Results accessible via web interface.
- Access: Launch the app via the Open OnDemand portal.
- Input: Provide a sequence, FASTA directory, or samplesheet.
- Configure: Select prediction method, mode, and database options.
- Submit: Start the job; monitor progress and download results when complete.
form.yml.erb: Defines the web form and input parameters.template/script.sh.erb: Main job script for running predictions.submit.yml.erb: Job submission configuration.info.html.erb: Displays result links after job completion..github/workflows/: CI/CD deployment workflows.
To deploy ProteinFold on your own Open OnDemand instance:
-
Clone this repository into your OnDemand apps directory:
git clone https://github.com/Australian-Structural-Biology-Computing/ood-proteinfold.git /var/www/ood/apps/sys/kod-proteinfold
-
Edit files to match institutional settings:
- Update
submit.yml.erbwith your facility-specific settings. - Change the cluster in
form.yml.erbto match your HPC. - Create
kod_proteinfold-${RUN_ENVIRONMENT}.configfiles for proteinfold-specific nextflow configurations. Or remove this line fromscript.sh.erbif not needed. - Create a
template/.envfile to change any of these default parameters:
- Update
# Example .env for ood-proteinfold
# Base output directory for all runs (job-specific OUT_DIR set at runtime)
BASE_OUT_DIR=/srv/scratch/${USER}/proteinfold_output
# Environment name (e.g., prod, dev)
RUN_ENVIRONMENT=prod
# Project root: location of Nextflow configs and samplesheet-utils environment
PROJECT_ROOT=/srv/scratch/sbf-pipelines/proteinfold
# Database path for structure prediction modules
DB_PATH=/srv/scratch/sbf-pipelines/proteinfold/dbs
# Git branch to use for proteinfold
BRANCH=master
# Repository path for using alternative versions of proteinfold
REPOSITORY=Australian-Structural-Biology-Computing/proteinfold
# Debug group for permissions on debug files
DEBUGGROUP=sbf-pipelines
# Base debug directory (job-specific DEBUGDIR set at runtime)
BASE_DEBUGDIR=/srv/scratch/sbf/debug
# Nextflow work directory base (job-specific NXF_WORK set at runtime)
BASE_NXF_WORK=/srv/scratch/${USER}/.proteinfold/work
# Apptainer/Singularity blob cache directories (set here or via your institutional Nextflow config, e.g. https://github.com/Australian-Structural-Biology-Computing/unsw_katana)
# APPTAINER_CACHEDIR=/srv/scratch/${USER}/.images
# SINGULARITY_CACHEDIR=/srv/scratch/${USER}/.images
# Nextflow Apptainer/Singularity image cache/library directories (set here or via institutional Nextflow config, e.g. https://github.com/nf-core/configs/blob/master/conf/pipeline/proteinfold/unsw_katana.config)
# NXF_APPTAINER_CACHEDIR=/srv/scratch/${USER}/.images
# NXF_SINGULARITY_CACHEDIR=/srv/scratch/${USER}/.images
# NXF_APPTAINER_LIBRARYDIR=/srv/scratch/sbf-pipelines/proteinfold/singularity
# NXF_SINGULARITY_LIBRARYDIR=/srv/scratch/sbf-pipelines/proteinfold/singularity
samplesheet-utils is required for generating and validating input samplesheets.
You must create a Python virtual environment and install the correct version:
python3 -m venv ${PROJECT_ROOT}/${RUN_ENVIRONMENT}/venv
source ${PROJECT_ROOT}/${RUN_ENVIRONMENT}/venv/bin/activate
pip install samplesheet-utils==1.1.2Update the paths in .env if your environment is elsewhere.
Please cite doi.org/10.26190/4KQF-M552 and acknowledge the Structural Biology Facility at UNSW in publications.
For support, please open an issue.