Recipes using BDG projects. Apache 2 licensed.
This repository is a home for "recipes" that use a Big Data Genomics (BDG) project to accomplish some task. By default, these recipes use EC2 to create a Spark cluster, on which we run ADAM and other BDG tools. These recipes serve three purposes:
- As a quickstart for people new to the BDG project, who would like to figure out how to use BDG software to replace their current workflows.
- As a benchmarking/regression testing environment for the various BDG tools.
- As a sandbox where we can set up head-to-head tests against other tools (e.g., for experiments for papers).
Our recipe book contains the following recipes:
- Single node recipes:
** bqsr-head-to-head: Runs a head-to-head single node speed test of the ADAM and GATK base quality score recalibration (BQSR) engines.
** flagstat-head-to-head: Runs a head-to-head performance test of Flagstat using ADAM, samtools, and sambamba.
** indel-realignment-head-to-head: Runs a head-to-head single node speed test of the ADAM and GATK INDEL realignment engines.
** markdup-head-to-head: Runs a head-to-head performance test of duplicate marking using ADAM, samtools, sambamba, and Picard.
** sort-head-to-head: Runs a head-to-head performance test of sorting using ADAM, samtools, sambamba, and Picard.
- Multiple node recipes:
** adam-transforms: Runs scale-out performance testing on ADAM's BQSR, Flagstat, INDEL realignment, duplicate marking, and sort implementations.
To run a single node recipe, run:
fab _configure_master_aptitude
fab bake:<recipe>
To run a multi node recipe with n nodes, run:
fab provision:<n>
fab _configure_master_yum
fab bake:<recipe>
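As a concrete instance of the multi node steps above, the following sketch prints the three commands for a chosen node count and recipe. The function name `multi_node_commands` is hypothetical and not part of this repository. Nothing is executed: the script only echoes the commands so they can be checked before being run by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the multi node commands, does not run them.
set -euo pipefail

# Args: number of nodes, recipe name.
multi_node_commands() {
  local nodes="$1" recipe="$2"
  echo "fab provision:${nodes}"
  echo "fab _configure_master_yum"
  echo "fab bake:${recipe}"
}

# Example: four nodes running the adam-transforms recipe.
multi_node_commands 4 adam-transforms
```

The node count of four is only an illustration; pick whatever size suits the benchmark.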
If you are adding a new recipe, add a directory for it. In that directory, create a Bash script named run.sh that runs the recipe. If the recipe needs setup, add the necessary details to the fabfile configuration target.
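A run.sh for a head-to-head recipe might look roughly like the sketch below. It is a skeleton under assumptions, not the layout the existing recipes use: the `time_step` helper and the `tool-a` / `tool-b` labels are hypothetical, and the `true` commands are placeholders for real tool invocations.

```shell
#!/usr/bin/env bash
# run.sh: hypothetical skeleton for a head-to-head recipe.
set -euo pipefail

# Run one labelled step and report its wall-clock time in whole seconds.
time_step() {
  local label="$1"
  shift
  local start end
  start=$(date +%s)
  "$@"
  end=$(date +%s)
  echo "${label}: $((end - start))s"
}

# Placeholder steps; replace `true` with the commands being compared.
time_step "tool-a" true
time_step "tool-b" true
```

Because of `set -e`, the script stops at the first failing step, so a broken tool invocation does not silently produce a timing.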
The ADAM mailing list is a good way to sync up with other people who use BDG projects, including the core developers. You can subscribe by sending an email to [email protected] or just post using the web forum page.
Many of the developers hang out in the #adamdev channel on freenode.net. Come join us and ask questions.
bdg-recipes is released under an Apache 2.0 license.