This tool provides in silico predictions of microarray hybridization results, calculating the expected binding interactions between query DNA sequences (primers and probes) and a genome database. Based on the BLAST hits and their mismatch numbers, the functionality of the primers and probes can be estimated. The tool also checks the strand specificity of the primers and probes.
The tool has been evaluated and published in Applied Biosciences (MDPI): https://doi.org/10.3390/applbiosci4020018
- BLAST Database Creation: Automatically generates a custom BLAST database from user-provided genome files.
- BLAST Search Execution: Runs a BLAST search to evaluate how primers or probes interact with both DNA strands.
- Mismatch Analysis: Identifies and filters matches with acceptable levels of mismatches, simulating how primers perform in the presence of mutations or single nucleotide polymorphisms (SNPs).
- Off-Target Detection: Helps identify off-target binding sites to ensure primer/probe specificity.
- Support for Multiple Genomes: Capable of performing searches across multiple genome sequences, ideal for comparative genomics or multi-strain analyses.
- Results in Multiple Formats: Outputs detailed results in TSV and text formats for easy review and further analysis.
- Python 3.11+
- rnajena-sugar
- NCBI BLAST+
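
Before installing, it can help to confirm that the requirements listed above are present. The following is a minimal sketch, not part of AssayBLAST. It assumes that `makeblastdb` and `blastn` are the BLAST+ executables that matter for this tool, which the list above does not state; the helper names are hypothetical.

```python
import shutil
import sys

# Assumption: these BLAST+ executables are the relevant ones; adjust as needed.
BLAST_TOOLS = ("makeblastdb", "blastn")


def missing_executables(names=BLAST_TOOLS, which=shutil.which):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if which(name) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 11)):
    """Check the interpreter version against the documented minimum (3.11)."""
    return tuple(version_info[:2]) >= minimum


print("Python 3.11+:", python_is_supported())
print("Missing BLAST+ executables:", missing_executables() or "none")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.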
Install AssayBLAST alongside a BLAST installation:
pip install assay_blast
One way to install BLAST and AssayBLAST with conda:
conda create -c bioconda -n assay_blast_env python=="3.13" blast
conda activate assay_blast_env
pip install assay_blast
To install the development version, use this repository, e.g.:
pip install https://github.com/rnajena/AssayBLAST/archive/refs/heads/main.zip
Version 2.0 comes with two separate scripts: one for running BLAST and another for analyzing the results.
assay_blast <genome_files_glob_pattern> -q <query_file.fasta> -o <BLAST output file> [options]
assay_analyze <BLAST output file>
genomes: Glob pattern for the genome FASTA or GenBank files.
-q, --queries: Path to the query FASTA file containing primers or probes.
For a description of optional arguments please run assay_blast -h.
The --num-threads parameter can be used to specify the number of threads used by the BLAST search.
fname: BLAST output file from assay_blast
For a description of optional arguments please run assay_analyze -h.
assay_test -d . # Download the two example files
assay_blast example_database.fasta -q example_queries.fasta --mismatch 2
assay_analyze blast_results.tsv --mismatch 2
These commands:
- Use the FASTA file example_database.fasta to build the BLAST database.
- Run the BLAST search using the primers/probes in example_queries.fasta.
- Allow a maximum of 2 mismatches in alignments.
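
The same two-step workflow can be driven from Python, for example inside a larger pipeline. The sketch below only assembles and runs the command lines shown above, using the documented `-q`, `-o`, `--mismatch` and `--num-threads` options. The function names are hypothetical, and the genome pattern is passed literally (no shell expansion), which assumes that assay_blast expands the glob itself, as its argument description suggests.

```python
import subprocess


def build_blast_cmd(genomes, queries, output, mismatch=None, num_threads=None):
    """Assemble the assay_blast command line as an argument list."""
    cmd = ["assay_blast", genomes, "-q", queries, "-o", output]
    if mismatch is not None:
        cmd += ["--mismatch", str(mismatch)]
    if num_threads is not None:
        cmd += ["--num-threads", str(num_threads)]
    return cmd


def build_analyze_cmd(blast_output, mismatch=None):
    """Assemble the assay_analyze command line as an argument list."""
    cmd = ["assay_analyze", blast_output]
    if mismatch is not None:
        cmd += ["--mismatch", str(mismatch)]
    return cmd


def run_pipeline(genomes, queries, output, mismatch=None, num_threads=None):
    """Run the BLAST step, then the analysis step; raise if either fails."""
    subprocess.run(
        build_blast_cmd(genomes, queries, output, mismatch, num_threads),
        check=True,
    )
    subprocess.run(build_analyze_cmd(output, mismatch), check=True)
```

With AssayBLAST installed and the example files downloaded, `run_pipeline("example_database.fasta", "example_queries.fasta", "blast_results.tsv", mismatch=2)` corresponds to the example commands above.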
- BLAST TSV Results: Detailed output of BLAST hits, including scores, E-values, and mismatches.
- BLAST Mismatching Alignments: Overview of mismatching alignments.
- TSV Overview: Summary table containing mismatch counts and growth analysis for each probe/primer.
- TSV Details: Detailed table with all found probes/primers, mismatch counts, distances, and positions.
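
The TSV outputs listed above can be loaded for further analysis with the Python standard library. This is a minimal sketch that makes no assumption about column names or about whether the first row is a header, since neither is documented here. Skipping lines that start with `#` is also an assumption, and `read_tsv_rows` is a hypothetical helper name.

```python
import csv


def read_tsv_rows(path):
    """Return non-empty, non-comment rows of a tab-separated file as lists of strings."""
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quote characters literal instead of treating them as field quoting.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        # Assumption: lines starting with '#' are comments and can be skipped.
        return [row for row in reader if row and not row[0].startswith("#")]
```

Inspecting the first returned row of an actual output file shows whether it holds column names or data.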
- BLAST Database Creation: The tool generates a BLAST database from the user-provided genome files.
- Forward and Reverse Complement BLAST: Runs two separate BLAST searches, one to detect the sequences and a second to output alignments of the matches.
- Mismatch Filtering and Analysis: Alignments are filtered based on user-defined mismatch thresholds. Results include mismatch counts and binding positions.
- Result Generation: Outputs the results in various formats, including TSV (for easy parsing and analysis) and text (for alignments).
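
The idea behind strand-aware mismatch filtering can be illustrated in a few lines of Python. This is a conceptual sketch only, not AssayBLAST's implementation: the tool relies on BLAST alignments, whereas the code below compares a query with an equally long target site directly, ignores gaps and ambiguous bases, and uses hypothetical function names.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(query, target):
    """Count positions at which two equally long sequences differ (case-insensitive)."""
    if len(query) != len(target):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(query.upper(), target.upper()))


def best_strand(query, target_site, max_mismatches):
    """Return (strand, mismatches) for the better-matching strand, or None if above the threshold."""
    candidates = [
        ("+", count_mismatches(query, target_site)),
        ("-", count_mismatches(reverse_complement(query), target_site)),
    ]
    strand, mismatches = min(candidates, key=lambda item: item[1])
    return (strand, mismatches) if mismatches <= max_mismatches else None
```

For instance, `best_strand("AAAC", "GTTT", 0)` reports a perfect match on the minus strand, because the reverse complement of `AAAC` is `GTTT`.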
Run the tests with assay_test.
Feel free to contribute to this project by submitting pull requests, reporting issues, or suggesting improvements.
This project is licensed under the MIT License. See the LICENSE file for more details.
For any questions, issues, or suggestions, please reach out via GitHub Issues.
