Releases: TimD1/vcfdist
v2.6.1
v2.6.0
Add `--max-supercluster-size` option
- Previously, supercluster size could blow up in regions with a high density of variants, since there was no upper limit on size.
- This caused an explosion in required RAM and runtime, causing vcfdist to crash. There was no way to evaluate these regions.
- This release limits the maximum supercluster size by providing a `--max-supercluster-size` parameter.
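The effect of a size cap can be illustrated with a minimal Python sketch. It is not vcfdist's implementation: the function name `build_superclusters`, the `merge_gap` parameter, and the greedy merge rule are all assumptions made for illustration. Only the idea of an upper bound on supercluster size comes from the notes above.

```python
def build_superclusters(clusters, merge_gap, max_size):
    """Greedily merge nearby (start, end) clusters into superclusters.

    Hypothetical policy: two neighbours are merged when the gap between
    them is at most merge_gap, unless the merged span would exceed max_size.
    """
    superclusters = []
    for start, end in sorted(clusters):
        if superclusters:
            cur_start, cur_end = superclusters[-1]
            close_enough = start - cur_end <= merge_gap
            merged_end = max(cur_end, end)
            # The cap: refuse a merge that would grow past max_size.
            if close_enough and merged_end - cur_start <= max_size:
                superclusters[-1] = (cur_start, merged_end)
                continue
        superclusters.append((start, end))
    return superclusters


clusters = [(0, 10), (12, 20), (22, 40), (100, 110)]
capped = build_superclusters(clusters, merge_gap=5, max_size=25)
uncapped = build_superclusters(clusters, merge_gap=5, max_size=10**9)
```

With the cap, the dense run at the start is split into two bounded superclusters instead of one that keeps growing, which is what keeps memory and runtime bounded in this toy model.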
v2.5.3
v2.5.2
v2.5.1
v2.5.0
Major Changes
- New definition of "sync groups" (complex variants) when attributing credit to variants. The new definition breaks dependencies if the selected (rather than all possible) backtracking path(s) pass(es) through the reference diagonal. As a result, there should be more, smaller sync groups and fewer partial-credit calls.
- Precision-recall backtracking algorithm now maximizes TP calls
- Removed the `-s, --smallest-variant` option. It offers no runtime benefits and will negatively impact performance (since small variants are prematurely filtered, they cannot be found equivalent to remaining variants). Instead, stratify variants after benchmarking or adjust the `--sv-threshold` and `-l, --largest-variant` parameters to evaluate the desired variants.
Minor bugfixes
- Fixed an erroneous `return` instead of `break` statement that caused segfaults in v2.4.0 when using `--cluster gap` or `--cluster size`.
- Fixed a logical error that caused `left_reach` and `right_reach` to not be calculated for the first and last clusters on a contig, resulting in incorrect superclustering.
v2.4.0
Major changes
- changed handling of BED regions (see wiki) to exclude variants on region borders; this is necessary for consistency with Truvari and with how ground truth BEDs were generated
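As a rough illustration of border exclusion, the Python sketch below keeps a variant only when it lies entirely inside a BED interval. It assumes that "on the border" means spanning a region boundary and that both intervals are half-open and 0-based, as in the BED format. The function name `variant_in_region` is hypothetical, and the wiki remains the authoritative description of vcfdist's behaviour.

```python
def variant_in_region(var_start, var_end, bed_start, bed_end):
    """Return True only if [var_start, var_end) is fully contained in
    the BED interval [bed_start, bed_end).

    Assumption: a variant that straddles either boundary counts as
    "on the border" and is excluded from evaluation.
    """
    return bed_start <= var_start and var_end <= bed_end


inside = variant_in_region(10, 15, 0, 100)          # fully inside
straddles_end = variant_in_region(95, 105, 0, 100)  # crosses right border
straddles_start = variant_in_region(-3, 4, 0, 100)  # crosses left border
```

Under this rule a variant that only partially overlaps a region is dropped rather than clipped, so the truth and query sets are filtered identically.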
Minor updates
- added `-lm` and `-lstdc++` during linking, which should allow `clang++` compilation (working towards bioconda release)
- removed `libstdc++fs` dependency (further increasing compatibility)
v2.3.4
v2.3.3
Minor updates
- started the vcfdist wiki, which is currently a work-in-progress
- added `THRESHOLD` column to `precision-recall-summary.tsv`, containing either `NONE` or `BEST`
- added `make install` command
- added new size-based clustering heuristic, explained in wiki
- added evaluating `ALL` variants to `stdout` and `precision-recall.tsv`
- added `RD` and `QD` tags to `summary.vcf`, listing reference and query distances from truth sequence
- added `REF_DIST` and `QUERY_DIST` columns to `query.tsv` and `truth.tsv` containing the same info
Minor bugfixes
- fixed `precision-recall-summary.tsv` extra tab
- added `Makefile` comment that `libstdc++fs` inclusion depends on GCC version
- fixed off-by-one error that miscounted `TRUTH_TP` and `TRUTH_FN` (at `g.max_qual` only)
- fixed segfault when no variants are present
- `FORMAT/BC` tag in `summary.vcf` is now `Float` (not `String`)
- fixed `credit` being set to `0.0` for all FP query variants below `--credit-threshold`
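The intent of the `credit` fix can be sketched in Python as follows. This is an illustrative model, not vcfdist's code: the function name `classify_query_variant` and the default threshold of 0.7 are assumptions. The point is that a variant below the threshold is labelled FP but its partial credit is kept rather than overwritten with `0.0`.

```python
def classify_query_variant(credit, credit_threshold=0.7):
    """Label a query variant by partial credit, preserving the credit value.

    Assumption: credit at or above credit_threshold counts as a true
    positive; anything lower is a false positive. The credit itself is
    returned unchanged in both cases.
    """
    label = "TP" if credit >= credit_threshold else "FP"
    return label, credit


below = classify_query_variant(0.5)  # FP, but the 0.5 credit is retained
above = classify_query_variant(0.9)
```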
v2.3.2
Major updates to analysis-v2 scripts
- these scripts accompany the upcoming vcfdist-v2 paper
Minor updates
- added `--sv-threshold`, which adds a third precision/recall stratification
Minor bugfixes
- fixed divide-by-zero if several variants are equivalent to no variants
- fixed off-by-one error in phasing analysis logs