DITTO

!!! For research purposes only !!!

NOTE: DITTO was previously hosted on UAB GitLab. It was migrated to GitHub in April 2023, and the GitLab version has been archived.

DITTO is an explainable neural network that supports accurate and rapid interpretation of the pathogenicity of small genetic variants from a patient's genotype (VCF) information.

Using DITTO

DITTO scores for variants can be obtained in three ways. The webapp and API are for single-variant analysis; the local setup is for batch (bulk) variant predictions.

Webapp

DITTO is available for public use at this website.

API

DITTO is not hosted as a public API, but it can be served locally to query DITTO scores. Please follow the instructions in this GitHub repo.

Setting up to use locally

NOTE: This setup lets you annotate a VCF sample and make DITTO predictions. It is currently tested only on Cheaha (UAB HPC) because of the resources needed to download datasets from OpenCRAVAT. Docker versions may need to be explored later to make it usable on macOS and Windows.

System Requirements

Tools:

Anaconda3
OpenCravat-2.4.1
Git

Resources:

CPU: > 2
Storage: ~1TB
RAM: ~25GB for a WGS VCF sample

Installation

Requirements:

DITTO repo from GitHub
OpenCravat with databases to annotate
Nextflow >=22.10.7
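Before going further, it can help to confirm that the required command-line tools are on your PATH. The following is a minimal sketch and not part of the DITTO repository; the function name missing_tools is hypothetical, and the tool names in the usage note assume that OpenCRAVAT is invoked as oc on your system.

```shell
# Hypothetical helper: report every command in the argument list that is not on PATH.
# Prints "missing: <name>" for each one and returns non-zero if any is missing.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `missing_tools git conda nextflow oc` prints nothing when all four commands are available. Note that it checks only for presence, not for the versions listed above.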

To fetch the DITTO source code, change into a directory of your choice and run:

git clone https://github.com/uab-cgds-worthey/DITTO.git

Run DITTO pipeline on UAB Cheaha

To run on UAB Cheaha, update model.job (outdir and samplesheet) and .test_data/file_list.txt (input VCFs) with complete file paths, then submit a Slurm job using the command below:

sbatch model.job

Run DITTO pipeline outside of UAB Cheaha

Set up OpenCRAVAT (one-time installation)

Please follow the steps mentioned in install_openCravat.md.

NOTE: The version of OpenCRAVAT that we currently use doesn't support "spanning or overlapping deletion" variants, i.e. variants with * in the ALT allele column (more on these variants here). These variants are ignored when running the pipeline.
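Because records with * in the ALT column are skipped, it can be useful to know how many a sample contains before running the pipeline. The following is a minimal sketch and not part of DITTO; the function name count_star_alt is hypothetical, and it assumes an uncompressed, tab-delimited VCF with ALT in the fifth column.

```shell
# Hypothetical helper: count VCF records whose ALT column contains the '*' allele.
# Header lines (starting with '#') are skipped; ALT may hold several
# comma-separated alleles, so '*' is matched as a whole allele.
count_star_alt() {
  awk -F'\t' '!/^#/ && $5 ~ /(^|,)\*(,|$)/ { n++ } END { print n + 0 }' "$1"
}
```

Calling `count_star_alt sample.vcf` prints a single number. A compressed VCF would need to be decompressed first.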

Setup Nextflow

Create an environment with conda. The example below installs Nextflow.

Anaconda virtual environment

# Create the environment (needed only the first time). Use the link above if you're not on a Mac.
conda create --name ditto-env

conda activate ditto-env

# Install nextflow
conda install bioconda::nextflow

Create a samplesheet, .test_data/file_list.txt, listing the VCF files with their paths. Edit the directory paths as needed and run the pipeline as shown below.

nextflow run pipeline.nf \
  --outdir ./data/ \
  -work-dir ./wor_dir \
  --build hg38 -with-report \
  --oc_modules /data/opencravat/modules \
  --sample_sheet .test_data/file_list.txt
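
The samplesheet passed to the pipeline can be generated rather than written by hand. The following is a minimal sketch and not part of DITTO; the function name make_samplesheet is hypothetical. It assumes that the samplesheet is a plain list with one VCF path per line and that input files end in .vcf or .vcf.gz, neither of which is confirmed above.

```shell
# Hypothetical helper: write the absolute path of every VCF found under a
# directory to a samplesheet file, one path per line.
make_samplesheet() {
  vcf_dir="$1"
  out="$2"
  # Resolve the directory to an absolute path so entries stay valid
  # regardless of where the pipeline is launched from.
  abs_dir=$(cd "$vcf_dir" && pwd) || return 1
  find "$abs_dir" -type f \( -name '*.vcf' -o -name '*.vcf.gz' \) | sort > "$out"
}
```

For example, `make_samplesheet /path/to/vcfs .test_data/file_list.txt` writes the list to the location used in the command above.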

Reproducing the DITTO model

Detailed instructions for reproducing the model are in build_DITTO.md.

Download DITTO DB (Precomputed scores)

Precomputed scores for all possible SNVs and known indels from gnomAD v3.0 on the main chromosomes of the hg38 reference genome are available for download here: https://s3.lts.rc.uab.edu/cgds-public/dittodb/dittodb.html

How to cite?

Mamidi, T.K.K.; Wilk, B.M.; Gajapathy, M.; Worthey, E.A. DITTO: An Explainable Machine-Learning Model for Transcript-Specific Variant Pathogenicity Prediction. Preprints 2024, 2024040837. https://doi.org/10.20944/preprints202404.0837.v1

Contact information

For queries, please open a GitHub issue.

For urgent queries, send an email with a clear description to:

Name	Email
Tarun Mamidi	tmamidi@uab.edu
Liz Worthey	lworthey@uab.edu
