
GDAN HCMI Project: Driver event concordance pipeline

Overview

This project is part of the Genomic Data Analysis Network (GDAN) initiative within the Human Cancer Models Initiative (HCMI). The primary goal is to assess the concordance of driver mutations between paired tumor and tumor-model whole genome sequencing (WGS) samples.

Usage

Configuration

  • Before running the pipeline, download the copy number data, SNV/indel data, and TCGA data into directories of your choice.
  • Next, edit pipeline/config.yaml so that it points to these large input data files.
  • If you want to start from the point where the copy number and mutation data are already compiled, the h5ad files used for the concordance calculation are available on Synapse (https://www.synapse.org/Synapse:syn71463818); one way to fetch them is sketched below.
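
A minimal sketch of downloading the precompiled h5ad files, assuming the Synapse Python client is installed and you are authenticated; the target directory name is only an example:

# Download the precompiled h5ad inputs from Synapse (entity syn71463818)
# into a local directory of your choice (here: data/compiled/)
pip install synapseclient
synapse login   # or configure an auth token in ~/.synapseConfig
synapse get -r syn71463818 --downloadLocation data/compiled/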

Install python dependencies

The pipeline also requires snakemake, so make sure it is installed correctly (through either mamba or pip); an example is sketched below.
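
For example (a sketch assuming Snakemake >= 8, where the cluster-generic executor used below ships as a separate plugin):

# Option 1: install via mamba/conda
mamba install -c conda-forge -c bioconda snakemake snakemake-executor-plugin-cluster-generic

# Option 2: install via pip
pip install snakemake snakemake-executor-plugin-cluster-generic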

# Install the project's Python dependencies and the local pysrc package
pip install -r requirements.txt
cd pysrc/
pip install .
cd ../pipeline/  # run the pipeline from within the pipeline/ directory

Pipeline structure

This snakemake pipeline is structured as shown in the pipeline DAG figure in the repository (Pipeline Overview).
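
If the figure is not available, the DAG can be regenerated with snakemake itself (a sketch; requires Graphviz for dot):

# Render the pipeline DAG to an image (run within pipeline/)
snakemake --configfile config.yaml --dag | dot -Tpng > pipeline_dag.png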

Pipeline run command

Make sure snakemake is installed in your environment before proceeding.
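
For example, a quick sanity check:

# Verify snakemake is on your PATH
snakemake --version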

1. Running in a cluster

The following script assumes you are using SLURM as your job scheduler.

# SLURM partition to submit jobs to; adjust for your cluster
SLURM_PARTITION=componc_cpu
# sbatch template; snakemake fills in {threads}, {resources.mem_mb}, {rule}, and {wildcards} per job
CLUSTER_FMT="sbatch --partition=$SLURM_PARTITION --cpus-per-task={threads} --mem={resources.mem_mb} --job-name={rule}.{wildcards} --error=logs/{rule}/{rule}.{wildcards}.%j.err --output=logs/{rule}/{rule}.{wildcards}.%j.out --time=24:00:00"
cmd="snakemake --executor cluster-generic"
cmd="$cmd --cluster-generic-submit-cmd \"$CLUSTER_FMT\""
cmd="$cmd --profile profile/"
cmd="$cmd --singularity-args \"--cleanenv --bind /path/to/data\""  # bind your data directory into the container
cmd="$cmd -p"                         # print the shell command of each job
cmd="$cmd --jobs 600"                 # submit up to 600 jobs concurrently
cmd="$cmd --configfile config.yaml"
cmd="$cmd --use-singularity"          # run rules inside their singularity containers
eval "$cmd"
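
Before submitting, it can help to preview the planned jobs with a dry run, for example:

# Dry run: list the jobs and shell commands without executing anything
snakemake -n -p --configfile config.yaml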

2. Running locally

You can also run the pipeline on a single node.

cmd="snakemake "
cmd="$cmd --singularity-args \"--cleanenv --bind /path/to/data\""
cmd="$cmd -p"
cmd="$cmd --jobs 50"
cmd="$cmd --configfile config.yaml"
cmd="$cmd --use-singularity"
eval $cmd
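
If a run is interrupted, the working directory may be left locked and some outputs incomplete; standard snakemake flags can be used to resume (a sketch):

# Release a stale lock left by an interrupted run, then resume,
# re-running any jobs whose outputs are incomplete
snakemake --unlock --configfile config.yaml
snakemake --rerun-incomplete -p --jobs 50 --configfile config.yaml --use-singularity \
    --singularity-args "--cleanenv --bind /path/to/data"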

Contact

For questions, please feel free to reach out to Seongmin Choi ([email protected]).
