Releases: kevlar-dev/kevlar
Kevlar version 0.7
Added
- A new Snakemake workflow for preprocessing BAM inputs for analysis with kevlar (see #305, #355).
- A new Snakemake workflow for kevlar's standard processing procedure (see #306, #355).
- New `unband` module to merge augmented Fastq files produced with a k-mer banding strategy (see #316).
- New `varfilter` module to filter out preliminary variant calls overlapping with problematic/unwanted loci or features (see #318, #342, #354).
- New dependency: `intervaltree` package (see #318).
- A new `sandbox` directory with convenience scripts for development and analysis (see #335).
- A new `--min-like-score` filter for the `simlike` module (see #343).
- A new `--drop-outliers` filter for the `simlike` module (see #350).
Changed
- Added a new flag to print to the terminal (stderr) and a logfile simultaneously (see #308).
- The functionality of the previous `filter` module is now split between the new `unband` module and a reimplementation of the `filter` module (see #316).
- Added a "fast mode" to the `simlike` module, prematurely halting computations for calls already marked for filtering (see #328).
- Added a filter for problematic short indels adjacent to homopolymers (see #336, #338, #339).
- Implemented new filters in the `simlike` module based on thresholds and k-mer abundances: the `ControlAbundance` filter for predictions with too many high-abundance parent/control k-mers spanning the variant, and the `CaseAbundance` filter for predictions with too many consecutive proband/child k-mers spanning the variant (see #327, #339).
Fixed
- Corrected a bug that reported the reference target sequence instead of the assembled contig sequence in the `CONTIG` attribute of indel calls in the VCF (see #304).
- Corrected a bug that called adjacent substitutions as independent SNVs rather than an aggregate MNV (see #332).
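To illustrate the SNV/MNV distinction in the fix above, here is a minimal sketch of collapsing substitutions at consecutive positions into a single aggregate call. The tuple representation and the function name `merge_adjacent_substitutions` are assumptions made for illustration; they are not taken from kevlar's code.

```python
from typing import List, Tuple

# (position, ref allele, alt allele): a hypothetical representation,
# not kevlar's internal variant call object.
Substitution = Tuple[int, str, str]


def merge_adjacent_substitutions(subs: List[Substitution]) -> List[Substitution]:
    """Collapse substitutions at consecutive positions into single calls."""
    merged: List[Substitution] = []
    for pos, ref, alt in sorted(subs):
        if merged:
            prev_pos, prev_ref, prev_alt = merged[-1]
            # The previous call ends exactly where this one starts: extend it.
            if prev_pos + len(prev_ref) == pos:
                merged[-1] = (prev_pos, prev_ref + ref, prev_alt + alt)
                continue
        merged.append((pos, ref, alt))
    return merged


calls = merge_adjacent_substitutions(
    [(100, "A", "G"), (101, "C", "T"), (250, "G", "A")]
)
print(calls)  # [(100, 'AC', 'GT'), (250, 'G', 'A')]
```

The two substitutions at positions 100 and 101 are reported as one two-base MNV, while the isolated substitution at 250 remains an SNV.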
Removed
Kevlar version 0.6.1
Fixed
- Updated `setup.py` so that the README markdown is included in the long description attribute for rendering on PyPI (see commit 9f51024).
- Removed direct calls to fixtures that are no longer supported by pytest (see commit dab6418).
- Updated the Makefile so that `kevlar/tests/__init__.py` is not included when running the test suite. Now compatible with pytest>=4.0.0 (see commit 965bd0d).
Kevlar version 0.6
Added
- The `kevlar count` operation now supports masks and 8-, 4-, or 1-bit counters (see #277 and #291).
- A Jupyter notebook and supporting code and data for evaluating kevlar's performance on a simulated data set (see #271).
- New flags for filtering gDNA cutouts or calls from specified sequences (see #285).
- New filter that discards any contig/gDNA alignment with more than 4 mismatches (see #288).
- A new feature that generates a Nodetable containing only variant-spanning k-mers to support re-counting k-mers and computing likelihood scores in low memory (see #289, #292, #302).
- A new `ProgressIndicator` class that provides gradually less frequent updates over time (see #299).
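As a rough illustration of "gradually less frequent updates", the sketch below widens its reporting interval as the running count grows. The class name `ProgressSketch` and its parameters are hypothetical; this is not kevlar's `ProgressIndicator` implementation, only one way the idea could work.

```python
import sys


class ProgressSketch:
    """Report progress at intervals that grow as the count increases.

    Hypothetical stand-in for illustration; not kevlar's ProgressIndicator.
    """

    def __init__(self, interval=10, factor=10, out=sys.stderr):
        self.interval = interval  # current number of items between messages
        self.factor = factor      # how much the interval grows each time
        self.out = out
        self.count = 0
        self.reported = []        # counts at which a message was emitted

    def update(self):
        self.count += 1
        if self.count % self.interval == 0:
            self.reported.append(self.count)
            print(f"processed {self.count} items", file=self.out)
            # After `factor` messages at this interval, report less often.
            if self.count >= self.interval * self.factor:
                self.interval *= self.factor


progress = ProgressSketch()
for _ in range(250):
    progress.update()
```

With the defaults, messages appear every 10 items up to 100 and every 100 items after that, so a long-running job does not flood the log.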
Changed
- Ported augfastx handling from `kevlar.seqio` module to a new Cython module (see #279).
- Dynamic error model for likelihood calculations is now a configurable option (see #286).
- Cleaned up overlap-related code with a new `ReadPair` class (see #283).
- Updated `kevlar assemble`, `kevlar localize`, and `kevlar call` to accept streams of partitioned reads; previously, only reads for a single partition were permitted (see #294).
- Overhauled the `kevlar localize` command to compute seed locations for all seeds in all partitions with a single BWA call, massively improving efficiency (see #294 and #301).
- Updated the variant calling procedure to discard alignment blocks less than `ksize` in length (see #303).
Fixed
- Minor bug with .gml output due to a change in the networkx package (see #278).
Removed
Kevlar version 0.5
Fixed
Added
- Multithreading is now supported natively in `kevlar alac` (see #249 and unmerged `feed-thread` branch).
- A limited-scope VCF reader (see #256).
- Script for computing likelihood scores is now a first-class kevlar citizen as `kevlar simlike` (see #259).
- New `kevlar dist` subcommand for computing average and standard deviation of k-mer abundances for likelihood calculations (see #264).
- Paired-end awareness for `kevlar dump` (see #265).
- New `LikelihoodFail` filter for variant calls with a negative likelihood score (see #266).
Kevlar version 0.4.2
Kevlar version 0.4.1
Kevlar version 0.4
Added
- New `kevlar gentrio` command for a more realistic simulation of trios for testing and evaluation (#171).
- New filter for `kevlar alac` for discarding partitions with a small number of interesting k-mers (#189).
- New `kevlar split` subcommand for splitting a partitioned augfastq file into N chunks (see #206).
- New `-p/--part-id` flag in `kevlar alac` for processing a single partition in a partitioned augfastq file (see #206).
- New reader/parser for partitioned augfastx files (see #206).
- New strategy for discriminating between variants and off-target calls using pairing information (see #210).
- New optional "fallback" assembly strategy: if fermi-lite fails, try our homegrown greedy assembly algorithm (see #214 and #219).
- New parameter for excluding SNV calls too near to the end of a contig (see #222).
Changed
- Replaced `pep8` with `pycodestyle` for enforcing code style in development (see #167).
- The `--refr` argument of the `kevlar dump` command is now optional, and when no reference is explicitly specified `kevlar dump` acts primarily as a BAM to Fastq converter (see #170).
- Split the functionality of the `count` subcommand: simple single-sample k-mer counting was kept in `count` with a much simplified interface, while the memory efficient multi-sample "masked counting" strategy was split out to a new subcommand `effcount` (see #185).
- Replaced `kevlar reaugment` with a more generalizable `kevlar augment` subcommand (see #188).
- Replaced `--ksize` with `--seed-size` in `kevlar localize` so that `kevlar alac` can now support different values for k-mers and localizing seeds/anchors (see #198).
- Improved variant sorting, scoring, and reporting strategy (see #199).
- The augmented Fastx format now permits annotation of 1 or more mate sequences (see #210).
- Split `vcf.py` and `varmap.py` modules off from the `call.py` module (see #229).
Fixed
- Incorrect file names in the quick start documentation page (see 9f6bec0).
- The `kevlar alac` procedure now accepts a stream of read partitions (instead of a stream of reads) at the Python API level, and correctly handles a single partition-labeled sequence file at the CLI level (see #165).
- CIGARs that begin with I blocks (alternate allele contig is longer than reference locus) are now handled properly (see #191).
- Bug with how `kevlar alac` handles "no reference match" scenarios resolved (see #192).
- Bug with `kevlar count` when reading from multiple input files (see #202).
- Can now call SNVs near INDELs (see #229).
Removed
- The JCA assembly mode is no longer supported (see #231).
Kevlar version 0.3.0
This release includes many new features, some refactoring of the core codebase, and the first end-to-end analysis workflow implemented in a single command.
Details are included below.
Fixed
- Abundances reported by `kevlar filter` now correctly show re-computed proband k-mer abundances, not pre-filtering abundances (see #111).
- The `kevlar localize` and `kevlar call` procedures now handle multiple assembled contigs, calling variants from the best reference match for each contig (see #124, #126, and #147).
Added
- New abundance screen now a part of `kevlar novel`. If any k-mer in a read is below some abundance threshold, the entire read is discarded (see #106).
- Better error reporting and handling of various issues with assembly, localization, and alignment (see #113, #114).
- Support for VCF output (see #130 and #144), including "windows" with all k-mers containing the reference allele (RW) and alternate allele (VW) to facilitate distinguishing inherited mutations from novel mutations (see #144 and #152).
- New subcommands
  - `alac`: assembles, localizes, aligns, and calls variants on a single partition basis
  - `simplex`: invokes the entire simplex analysis workflow
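The abundance screen described in the list above (discard a read if any of its k-mers falls below a threshold) can be sketched as follows. This is an illustration only: the function names are hypothetical, and an exact in-memory `Counter` stands in for whatever k-mer counting structure `kevlar novel` actually uses.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer (substring of length k) in a sequence."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def abundance_screen(reads, k, threshold):
    """Yield only reads whose k-mers all have abundance >= threshold.

    Hypothetical sketch; counts are exact and computed from the reads
    themselves. Reads shorter than k contain no k-mers and pass through.
    """
    counts = Counter(kmer for read in reads for kmer in kmers(read, k))
    for read in reads:
        if all(counts[kmer] >= threshold for kmer in kmers(read, k)):
            yield read


reads = ["ACGTAC", "ACGTAC", "ACGTTT"]
kept = list(abundance_screen(reads, k=4, threshold=2))
print(kept)  # ['ACGTAC', 'ACGTAC']
```

The third read is dropped because two of its k-mers (`CGTT` and `GTTT`) occur only once, even though its first k-mer is abundant.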
Changed
- The `kevlar filter` procedure now handles both contamination and reference matches under a single "mask" interface (see #103).
- Explicitly dropped support for Python 2.7. Now supports only Python >=3.5 (see #125).
- Main methods for each core subcommand are now implemented as minimal wrappers around generator functions, to facilitate composing different steps of the workflow or invoking them from third-party Python code (see #95, #126, #133, #148, #149, #150, #159, #161).
- The home-grown greedy assembly implementation has been replaced by calls to the `fermi-lite` library, which is now bundled with kevlar (see #156).
- The default behavior of `kevlar partition` is now to output a single stream of reads. Writing each partition to a distinct file is still supported with the `--split` option.
Removed
- The `kevlar collect` command and associated tests. Its functionality has now been fully distributed to other subcommands.
  - Read filtering to `kevlar filter`
  - Junction count contig assembly to `kevlar filter` as an optional mode
Kevlar version 0.2.0
Kevlar release v0.2 adds new subcommands for read partitioning and variant calling, fixes a major bug with contig assembly, and introduces many minor fixes, improvements, and code refactoring.
Added
- New subcommands
  - `partition`: group reads by shared interesting k-mers
  - `localize`: determine an assembled contig's location in the reference genome
  - `call`: align assembled contigs to reference and call variants
- Documentation suite in `docs/`, hosted at https://kevlar.readthedocs.io
- New third-party dependency `ksw2` for computing alignments. Wrapped with Cython, which is a new development-time dependency (but not install or run time).
- The `pandas` package is now a dependency, and `pysam` and `networkx` are now hard dependencies (rather than conditional).
Fixed
- Bug with assembly when the order of a read pair was swapped and they had the opposite orientation (see #85).
Kevlar version 0.2.0 (release candidate 2)
Fixing some issues with packaging, and updating the installation docs.