ML-guided qPCR primer design with off-target minimization.
# Create environment with external tools
conda env create -f environment.yml
conda activate qprimer-designer
# Install the package
pip install .

Alternatively, pull and run the Docker image:
docker pull ghcr.io/broadinstitute/qprimer_designer:latest
docker run --rm ghcr.io/broadinstitute/qprimer_designer qprimer --help

Verify installation:
qprimer --help
qprimer generate --help

The pipeline expects input sequences in the following structure:
.
└── target_seqs/
    └── original/
        ├── target1.fa
        ├── target2.fa
        └── offtarget.fa
FASTA files in ./target_seqs/original should be multi-sequence, unaligned FASTAs with .fa extension.
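The input requirements above (multiple sequences per file, unaligned, .fa extension) can be checked before running the pipeline. The following is a minimal sketch, not part of qprimer itself: the function names are hypothetical, and treating any `-` character in a sequence line as a sign of alignment is an assumption made here for illustration.

```python
from pathlib import Path


def count_fasta_records(path):
    """Return (number of records, True if any sequence line contains '-')."""
    n_records = 0
    has_gaps = False
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                n_records += 1
            elif "-" in line:
                # Assumption: gap characters indicate an aligned FASTA.
                has_gaps = True
    return n_records, has_gaps


def check_input_dir(directory="target_seqs/original"):
    """Collect human-readable problems for each file in the input directory."""
    problems = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue
        if path.suffix != ".fa":
            problems.append(f"{path.name}: expected a .fa extension")
            continue
        n_records, has_gaps = count_fasta_records(path)
        if n_records < 2:
            problems.append(
                f"{path.name}: expected multiple sequences, found {n_records}"
            )
        if has_gaps:
            problems.append(
                f"{path.name}: contains '-' characters, so it may be aligned"
            )
    return problems


if __name__ == "__main__":
    # Only run the check when the expected directory is present.
    if Path("target_seqs/original").is_dir():
        for problem in check_input_dir():
            print(problem)
```

An empty result means no problems were detected by these simple checks; it does not guarantee that the pipeline will accept the files.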
Configure workflows/Snakefile.template with your targets:
TARGETS = ['target1']
CROSS = []
HOST = ['offtarget']

Run the pipeline:
cd workflows
snakemake -s Snakefile.example --cores all

Output will be in the final/ directory as a CSV file.
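To get a quick look at what the run produced, the CSV files in final/ can be summarized with the standard library. This is only a sketch: the README does not document the output columns, so the code reports whatever header it finds rather than assuming any column names. The function names are hypothetical.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column names, number of data rows) for one CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        n_rows = sum(1 for _ in reader)
    return header, n_rows


def summarize_outputs(directory="final"):
    """Map each CSV file name in the directory to its summary."""
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    return {p.name: summarize_csv(p) for p in sorted(directory.glob("*.csv"))}


if __name__ == "__main__":
    # Run from the workflows/ directory, where final/ is expected to appear.
    for name, (header, n_rows) in summarize_outputs().items():
        print(f"{name}: {n_rows} rows, columns = {header}")
```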
Configure the panel targets:
PANEL = ['target1', 'target2']
HOST = ['offtarget']

Run with multiplex enabled:
snakemake -s Snakefile.example --config multiplex=1 --cores all

Output will be written to final/multiplex_output.csv, which contains the top candidates for each target.
The qprimer CLI provides the following subcommands:
| Command | Description |
|---|---|
| qprimer generate | Generate primer candidates from target sequences |
| qprimer pick-representatives | Select representative sequences from MSA |
| qprimer prepare-input | Prepare input data for ML evaluation |
| qprimer evaluate | Run ML model to score primer candidates |
| qprimer filter | Filter primers based on evaluation scores |
| qprimer build-output | Build final output CSV with scores |
| qprimer select-multiplex | Select best multiplex primer set |
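A quick way to confirm that each subcommand in the table is available is to ask it for its help text. The sketch below assumes every subcommand accepts `--help`; this README only shows that for `qprimer` and `qprimer generate`, so treat the rest as an assumption. The helper name `help_command` is hypothetical.

```python
import shutil
import subprocess

# Subcommand names copied from the table above.
SUBCOMMANDS = [
    "generate",
    "pick-representatives",
    "prepare-input",
    "evaluate",
    "filter",
    "build-output",
    "select-multiplex",
]


def help_command(subcommand):
    """Build the argument list for `qprimer <subcommand> --help`."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return ["qprimer", subcommand, "--help"]


if __name__ == "__main__":
    if shutil.which("qprimer") is None:
        print("qprimer is not on PATH; activate the environment first")
    else:
        for name in SUBCOMMANDS:
            result = subprocess.run(
                help_command(name), capture_output=True, text=True
            )
            status = "ok" if result.returncode == 0 else f"exit {result.returncode}"
            print(f"{name}: {status}")
```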
If GPU is available, add the resource flag:
snakemake -s Snakefile.example --cores all --resources gpu=1

CPU performance is acceptable for most use cases.
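The snakemake invocations shown in this README differ only in their optional flags, so they can be assembled programmatically. The following sketch is illustrative: the function `snakemake_command` is hypothetical, detecting a GPU by looking for `nvidia-smi` on PATH is a heuristic assumed here, and combining `--config multiplex=1` with `--resources gpu=1` in one call is not shown in the README.

```python
import shutil


def snakemake_command(snakefile="Snakefile.example", multiplex=False, use_gpu=None):
    """Build the snakemake argument list from the flags used in this README."""
    if use_gpu is None:
        # Heuristic (assumption): an NVIDIA GPU is present if nvidia-smi exists.
        use_gpu = shutil.which("nvidia-smi") is not None
    cmd = ["snakemake", "-s", snakefile]
    if multiplex:
        cmd += ["--config", "multiplex=1"]
    cmd += ["--cores", "all"]
    if use_gpu:
        cmd += ["--resources", "gpu=1"]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(" ".join(snakemake_command()))
```

Passing `use_gpu=False` explicitly forces the CPU-only command regardless of what is installed.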
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
pytest tests/ -v

See CLAUDE.md for development guidelines.
Pre-trained models are bundled with the package in src/qprimer_designer/data/. Training scripts are available in the training/ directory for reference (raw dataset available upon request).
MIT License - see LICENSE for details.