Releases: fulcrumgenomics/fgbio
Release 1.0.0
Major feature release with the following changes:
Major Changes
- Cross-building support moved from [2.11, 2.12] -> [2.12, 2.13]
- Support added for the high-performance Intel Inflator and Deflator for working with gzipped data
- Significant performance improvements to CallDuplexConsensusReads and the addition of multi-threaded calling
- A new 100% Scala API for reading, writing and working with VCF files
Minor Changes
- Broken pipes while writing to stdout/stderr will print a concise error instead of a long stack trace
- Common option to fgbio.jar to set validation stringency when reading/writing SAM/BAM
- Minor fixes to HapCutToVcf
- UmiConsensusCaller and related tools now merge platform values in read groups case-insensitively
Release 0.8.1
Minor point release with a single new tool to sort FASTQ files by read name and number.
Release 0.8.0
Major release with the following changes:
- Major improvements to the pairwise Aligner class:
  - Significant performance improvements for pairwise alignments
  - When aligning DNA sequences, the aligner now reports a match in the CIGAR when compatible IUPAC codes are paired (e.g. R with A or G)
  - New method to produce all alignments above a score threshold from a pair of sequences
  - New interface to allow for custom gap scoring
- Added Sequences.revcomp() function that correctly reverse complements all IUPAC DNA/RNA codes
- Added method to Metric class to return an Iterator over a metrics file instead of reading the whole file into memory
- Io object now automatically supports bgzipped files with .bgz or .bgzip extensions
- Fixed bug in SamReader that would occasionally cause exceptions with overlapping query regions
- Updated to latest Scala point version to create classes/JARs compatible with JDK 9 and 10 at runtime
- Added method to ExtractBasecallingParamsForPicard to enable easy access to unmatched BAM file path
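
To illustrate what reverse complementing IUPAC codes involves (see the Sequences.revcomp() entry above), here is a minimal standalone sketch. It is not fgbio's implementation: the object name IupacRevcomp and its complement table are assumptions made for illustration, it covers DNA codes only, and any character it does not recognize is passed through unchanged.

```scala
object IupacRevcomp {
  // Complement pairs for upper-case IUPAC DNA codes.
  // S (C/G), W (A/T) and N are their own complements.
  private val upper: Map[Char, Char] = Map(
    'A' -> 'T', 'T' -> 'A', 'C' -> 'G', 'G' -> 'C',
    'R' -> 'Y', 'Y' -> 'R', 'K' -> 'M', 'M' -> 'K',
    'B' -> 'V', 'V' -> 'B', 'D' -> 'H', 'H' -> 'D',
    'S' -> 'S', 'W' -> 'W', 'N' -> 'N'
  )

  // Add lower-case entries so the case of the input is preserved.
  private val complement: Map[Char, Char] =
    upper ++ upper.map { case (k, v) => k.toLower -> v.toLower }

  /** Reverses the sequence and complements each base; unknown characters are kept as-is. */
  def revcomp(bases: String): String =
    bases.reverse.map(b => complement.getOrElse(b, b))
}
```

With this table, IupacRevcomp.revcomp("ARYN") yields "NRYT": the purine code R (A or G) complements to the pyrimidine code Y (C or T) and vice versa.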
Release 0.7.0
Release 0.7.0 introduces the following changes to existing tools:
- GroupReadsByUmi
- DemuxFastqs: enable --quality-encoding to be used on the command line (#417)
- HapCutToVcf
In addition, the following new tools were added:
- FindSwitchbackReads: Tool to detect templates with strand-switch events in them (#438)
The following API changes were also introduced:
- FastqSource can handle read numbers > 2 (#408)
- Fixed writing and parsing of Double.NaN, Double.PositiveInfinity and Double.NegativeInfinity in Metric classes (#411)
- SamBuilder should accept missing bases and quals with a cigar (#424)
- Add message to require() call in Sample (#425)
- ReadStructure to allow and strip out whitespace within the read structure during parsing (#425)
- ProgressLogger.record should return whether logging was triggered, plus a method to log the last record (#421)
- Bug fix: Metric.write was not closing its writer (#421)
- Adding a few useful methods to Sequences (#421)
- Metric now extends Commons Writer so we can use AsyncWriter on it (#437)
- Improve the error message when validating a sample sheet. (#412)
Release 0.6.1
Bug fix release which resolves a problem introduced in a dependency that caused fgbio to be unable to read BAM files from stdin or named pipes. All users of 0.6.0 should upgrade to 0.6.1.
Release 0.6.0
Release 0.6.0 introduces the following changes to existing tools:
- ReviewConsensusVariants: output PASS when there are no filters on the variant; fix format of bases output
- MaskPrimers: improved usage documentation to make primer file format clearer
The following API changes were also introduced:
- Added constants to SamRecord for SAM/BAM related constant values
- NeedlemanWunchAligner renamed to Aligner (old name deprecated but still works)
- Implemented Glocal (or semi-global) alignment mode
- Implemented Local alignment mode
- Fixed affine gap implementation
- Fixed Alignment.subByQuery/subByTarget to correctly handle adjacent deletions
- In metrics files, ensure 0.0 always formats as 0 and not 0E0
- Updated how Rscript finds resources in the classpath to support local paths and absolute paths with and without leading slashes
Release 0.5.1
Release 0.5.1 is a minor bug-fix release and introduces the following changes:
- ExtractUmisFromBam
  - Improved error messaging
  - Fixed bug that prevented it from working when only one read per pair contained a UMI
- GroupReadsByUmi now adds the sub-sort SS tag to the header of BAMs produced
- CallMolecularConsensusReads and CallDuplexConsensusReads attempt to detect the sort order of input data and will fail if the sort order is incompatible
- DemuxFastqs changed some output metrics from 32-bit Int to 64-bit Long to avoid overflows on NovaSeq data
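
The Int to Long change above matters because a 32-bit Int tops out at 2,147,483,647 and wraps silently when it overflows. A small sketch of the difference, using hypothetical helper names that are not part of fgbio:

```scala
object CounterOverflow {
  // A 32-bit counter wraps around silently once it passes Int.MaxValue.
  def addInt(count: Int, n: Int): Int = count + n

  // A 64-bit counter keeps counting correctly well past that limit.
  def addLong(count: Long, n: Long): Long = count + n
}
```

Here CounterOverflow.addInt(Int.MaxValue, 1) wraps to Int.MinValue, while CounterOverflow.addLong(Int.MaxValue.toLong, 1L) gives 2147483648.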
Release 0.5.0
Release 0.5.0 introduces the following changes to existing tools:
- CallDuplexConsensusReads: Fixed a rare bug where the consensus base quality could be zero or one if the two strands' base qualities differ by two or less.
- FilterConsensusReads: Fix for bug where duplex reads formed from raw reads from a single strand only could be incorrectly filtered.
- CorrectUmis: Now stores the original UMI sequences in the OX tag upon correction.
- DemuxFastqs: Bug fix to correct quality scores in output BAM files
- ClipOverlappingReads: Removed previously deprecated tool. Use ClipBam instead.
- ClipBam:
  - Now optionally outputs metrics about clipping present in reads before and after execution.
  - New option to "upgrade" clipping, e.g. replace existing soft-clipping with hard-clipping
Changes to APIs were as follows:
- Various deprecated methods were removed in this release.
- Metric formatting now prints smaller Doubles in scientific notation, and the formatting is generally more efficient.
- NeedlemanWunchAligner gained a Glocal alignment mode for aligning all of a query sequence to a sub-region of a target sequence
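
Glocal alignment, as described in the last entry above, aligns the whole query against a sub-region of the target. The following is a minimal sketch of how such a score can be computed with dynamic programming. It is an illustration only, not fgbio's aligner: it assumes linear gap penalties and simple match/mismatch scoring, and the names GlocalSketch and glocalScore and the default scores are hypothetical.

```scala
object GlocalSketch {
  /** Best score for aligning all of `query` to any sub-region of `target`. */
  def glocalScore(query: String,
                  target: String,
                  matchScore: Int = 1,
                  mismatchScore: Int = -1,
                  gapScore: Int = -2): Int = {
    val m = query.length
    val n = target.length
    val h = Array.ofDim[Int](m + 1, n + 1)

    // Row 0 stays at zero: the alignment may start anywhere in the target for free.
    // Column 0 is penalized: every query base must be aligned or paid for as a gap.
    for (i <- 1 to m) h(i)(0) = i * gapScore

    for (i <- 1 to m; j <- 1 to n) {
      val sub  = if (query(i - 1) == target(j - 1)) matchScore else mismatchScore
      val diag = h(i - 1)(j - 1) + sub      // query base aligned to target base
      val up   = h(i - 1)(j) + gapScore     // query base aligned to a gap
      val left = h(i)(j - 1) + gapScore     // target base aligned to a gap
      h(i)(j) = math.max(diag, math.max(up, left))
    }

    // The alignment may end anywhere in the target, so take the best cell in the last row.
    h(m).max
  }
}
```

The two "free end" rules (a zero first row and a maximum over the last row) are what distinguish this from global alignment, where both sequences must be consumed end to end.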
Release 0.4.0
Release 0.4.0 introduces the following changes to existing tools:
- CallDuplexConsensusReads
  - The single strand consensus bases and quals for each duplex consensus read are output into tags on the duplex consensus read
  - Added option to output consensus reads that are formed from only a single strand
- FilterConsensusReads
  - New option to filter out reads with low mean base quality
  - New option to filter out reads whose minimum depth is too low
  - New option to filter duplex consensus reads where the single strand consensuses disagree
  - New optional tags will store the single-strand consensus bases and qualities for duplex consensus reads.
- DemuxFastqs
  - will no longer output /1 and /2 on read names when running in Illumina standards mode
  - fixed a bug causing an exception when the sample barcode is found in multiple reads (ex. i5 and i7)
- ErrorRateByReadPosition - fixed bug that resulted in C>G errors being counted as A>G errors
- GroupReadsByUmi
  - Reads with UMIs with Ns in them are now rejected
  - Log messages added with counts of reads filtered out by reason
  - Memory usage improvements when grouping reads at very, very high depth.
  - Supports enforcing a minimum UMI length and partial UMIs except for the paired strategy (duplex sequencing).
Finally, changes to various APIs were as follows:
- Method in Bams to sort records by tag, or by a function applied to a tag
- Improve speed of Metric.read for loading large numbers of rows from metrics files
- Changed SamSource to extend IterableView instead of Iterable so that map(), filter(), etc. return lazy views
- Fixed a bug where the specified temporary directory was not being used for sorting.
- Added a BinomialDistribution class implemented using unlimited precision decimal math which is slower, but allows computation of cumulative probabilities where other implementations overflow or underflow
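
To show why decimal math helps with the BinomialDistribution entry above, here is a minimal sketch of binomial probabilities computed with BigInt and BigDecimal. It is not fgbio's class: the object BinomialSketch and its method names are hypothetical, and Scala's BigDecimal rounds to 34 significant digits by default, so this sketch avoids underflow through its wide exponent range and does not offer truly unlimited precision.

```scala
object BinomialSketch {
  /** Exact binomial coefficient C(n, k) using arbitrary-precision integers. */
  def coefficient(n: Int, k: Int): BigInt = {
    require(k >= 0 && k <= n, s"k must be in [0, $n] but was $k")
    // After step i the accumulator equals C(n - k + i, i), so each division is exact.
    (1 to k).foldLeft(BigInt(1)) { (acc, i) => acc * (n - k + i) / i }
  }

  /** P(X = k) for X ~ Binomial(n, p). */
  def probability(k: Int, n: Int, p: BigDecimal): BigDecimal =
    BigDecimal(coefficient(n, k)) * p.pow(k) * (BigDecimal(1) - p).pow(n - k)

  /** P(X <= k) for X ~ Binomial(n, p), summed term by term. */
  def cumulative(k: Int, n: Int, p: BigDecimal): BigDecimal =
    (0 to k).map(i => probability(i, n, p)).foldLeft(BigDecimal(0))(_ + _)
}
```

For example, BinomialSketch.probability(0, 5000, BigDecimal("0.5")) is a tiny but non-zero value, whereas the same quantity computed with doubles as math.pow(0.5, 5000) underflows to 0.0. The trade-off, as the entry notes, is speed.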
Release 0.3.0
Release 0.3.0 introduces the following changes to existing tools:
- ClipBam - The --overlapping-reads option was not being used internally and is deprecated in favor of --clip-overlapping-reads. This caused overlapping reads to always be clipped.
- CollectDuplexSeqMetrics - Added the optional output of duplex-umi frequencies with DuplexUmiMetrics.
- DemuxFastqs - The default output sort order is changed from Unsorted to Queryname. Add an option --illumina-standards to output file names using Illumina naming conventions. Tuned the amount of memory used, especially for a large # of samples (>96).
- CallDuplexConsensusReads - Do not throw an exception when potential collisions are found in duplex molecules; instead, do not generate a consensus read.
- FilterBam - adding a few more filters.
- Added a global parameter for log-level.
In addition, the following new tools were added:
- CollectErccMetrics - This will collect metrics for analyzing ERCC spike-ins in RNA-Seq experiments for dose response but not fold-change response.
Finally, changes to various APIs were as follows:
- ReferenceSetBuilder - Moved to the testing packages for use in projects that extend fgbio.
- Alignment - Added subByQuery() and subByTarget() methods to Alignment.