Releases: fulcrumgenomics/fgbio

Release 1.0.0

06 Aug 19:56

Major feature release with the following changes:

Major Changes

  • Cross-building support moved from [2.11, 2.12] -> [2.12, 2.13]
  • Support added for the high-performance Intel Inflator and Deflator for working with gzipped data
  • Significant performance improvements to CallDuplexConsensusReads and the addition of multi-threaded calling
  • A new 100% Scala API for reading, writing, and working with VCF files

Minor Changes

  • Broken pipes while writing to stdout/stderr will print a concise error instead of a long stack trace
  • Common option to fgbio.jar to set validation stringency when reading/writing SAM/BAM
  • Minor fixes to HapCutToVcf
  • UmiConsensusCaller and related tools now merge platform values in read groups case-insensitively

Release 0.8.1

29 Mar 22:28

Minor point release with a single new tool to sort FASTQ files by read name and number.

Release 0.8.0

14 Feb 00:14

Major release with the following changes:

  • Major improvements to the pairwise Aligner class:
    • Significant performance improvements in the Aligner class for pairwise alignments
    • When aligning DNA sequences, the aligner now reports matches in the CIGAR for compatible IUPAC codes (e.g. R paired with A or G)
    • New method to produce all alignments above a score threshold from a pair of sequences
    • New interface to allow for custom gap scoring
  • Added Sequences.revcomp() function that correctly reverse complements all IUPAC DNA/RNA codes
  • Added method to Metric class to return an Iterator over a metrics file instead of reading the whole file into memory
  • Io object now automatically supports bgzipped files with .bgz or .bgzip extensions
  • Fixed bug in SamReader that would occasionally cause exceptions with overlapping query regions
  • Updated to the latest Scala point version to create classes/JARs compatible with JDK 9 and 10 at runtime
  • Added method to ExtractBasecallingParamsForPicard to enable easy access to unmatched BAM file path
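The IUPAC handling mentioned above (compatible codes in the aligner, and reverse complementing ambiguity codes) can be illustrated with a small standalone sketch. This is not fgbio's implementation: the object name `IupacSketch`, its method names, and the rule that two codes are "compatible" when they share at least one base are assumptions made for illustration. It covers DNA codes only and throws on characters outside the table.

```scala
object IupacSketch {
  // Each IUPAC DNA code mapped to the set of bases it can represent.
  private val codes: Map[Char, Set[Char]] = Map(
    'A' -> "A", 'C' -> "C", 'G' -> "G", 'T' -> "T",
    'R' -> "AG", 'Y' -> "CT", 'S' -> "CG", 'W' -> "AT",
    'K' -> "GT", 'M' -> "AC", 'B' -> "CGT", 'D' -> "AGT",
    'H' -> "ACT", 'V' -> "ACG", 'N' -> "ACGT"
  ).map { case (code, bases) => code -> bases.toSet }

  // Reverse lookup: set of bases back to its IUPAC code.
  private val bySet: Map[Set[Char], Char] = codes.map(_.swap)

  private val baseComplement: Map[Char, Char] =
    Map('A' -> 'T', 'C' -> 'G', 'G' -> 'C', 'T' -> 'A')

  /** Assumed rule: two codes are compatible if they share at least one base. */
  def compatible(a: Char, b: Char): Boolean =
    (codes(a.toUpper) intersect codes(b.toUpper)).nonEmpty

  /** Complements a code by complementing every base it represents; preserves case. */
  def complement(code: Char): Char = {
    val comp = bySet(codes(code.toUpper).map(baseComplement))
    if (code.isLower) comp.toLower else comp
  }

  /** Reverse complements a DNA sequence that may contain IUPAC ambiguity codes. */
  def revcomp(seq: String): String = seq.reverse.map(c => complement(c))
}
```

For example, `IupacSketch.revcomp("ARG")` yields `"CYT"`, because R (A or G) complements to Y (T or C).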

Release 0.7.0

06 Nov 20:43

Release 0.7.0 introduces the following changes to existing tools:

  • GroupReadsByUmi
    • check that the raw UMI tag is found for each read (#406)
    • Fix log message in GroupReadsByUmi to be more accurate / less misleading (#436)
  • DemuxFastqs: enable --quality-encoding to be used on the command line (#417)
  • HapCutToVcf
    • fix ambiguous (IUPAC) reference bases on the fly (#418)
    • add an option to skip indexing the output file (e.g. when the input does not have CONTIG lines) (#418)

In addition, the following new tools were added:

  • FindSwitchbackReads: Tool to detect templates with strand-switch events in them (#438)

The following API changes were also introduced:

  • FastqSource can handle read numbers > 2 (#408)
  • Fixed writing and parsing of Double.NaN, Double.PositiveInfinity and Double.NegativeInfinity in Metric classes (#411)
  • SamBuilder should accept missing bases and quals with a cigar (#424)
  • Add message to require() call in Sample (#425)
  • ReadStructure to allow and strip out whitespace within the read structure during parsing (#425)
  • ProgressLogger.record now returns whether logging was triggered; added a method to log the last record (#421)
  • Bug fix: Metric.write was not closing its writer (#421)
  • Adding a few useful methods to Sequences (#421)
  • Metric now extends Commons Writer so we can use AsyncWriter on it (#437)
  • Improve the error message when validating a sample sheet (#412)
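The fix for special Double values in the list above concerns round-tripping NaN and the two infinities through text. A minimal sketch of such a round trip follows. The object `SpecialDoubles` and the text tokens it uses ("NaN", "Infinity", "-Infinity") are assumptions for illustration; the release notes do not state which tokens the Metric classes write.

```scala
object SpecialDoubles {
  /** Formats a Double, spelling out the special values explicitly. */
  def format(d: Double): String =
    if (d.isNaN) "NaN"
    else if (d.isPosInfinity) "Infinity"
    else if (d.isNegInfinity) "-Infinity"
    else d.toString

  /** Parses text written by `format` back into a Double. */
  def parse(s: String): Double = s match {
    case "NaN"       => Double.NaN
    case "Infinity"  => Double.PositiveInfinity
    case "-Infinity" => Double.NegativeInfinity
    case other       => other.toDouble
  }
}
```

Note that NaN never compares equal to itself, so a round-trip check has to use `isNaN` rather than `==`.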

Release 0.6.1

18 May 16:43

Bug fix release which resolves a problem introduced in a dependency that caused fgbio to be unable to read BAM files from stdin or named pipes. All users of 0.6.0 should upgrade to 0.6.1.

Release 0.6.0

05 Apr 18:56

Release 0.6.0 introduces the following changes to existing tools:

  • ReviewConsensusVariants: output PASS when there are no filters on the variant; fix format of bases output
  • MaskPrimers: improved usage documentation to make primer file format clearer

The following API changes were also introduced:

  • Added constants to SamRecord for SAM/BAM related constant values
  • NeedlemanWunchAligner renamed to Aligner (old name deprecated but still works)
    • Implemented Glocal (or semi-global) alignment mode
    • Implemented Local alignment mode
    • Fixed affine gap implementation
    • Fixed Alignment.subByQuery/subByTarget to correctly handle adjacent deletions
  • In metrics files, ensure 0.0 always formats as 0 and not 0E0
  • Updated how Rscript finds resources in the classpath to support local paths and absolute paths with and without leading slashes
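Glocal alignment, named in the list above, aligns all of a query sequence to a sub-region of a target sequence. The sketch below computes only the best glocal score using dynamic programming. It is not the Aligner class: the object `GlocalSketch`, its parameter names, the default scores, and the use of a linear (not affine) gap penalty are simplifying assumptions.

```scala
object GlocalSketch {
  /** Best score for aligning all of `query` to any sub-region of `target`.
    * Uses a linear gap penalty, a simplification compared to affine gaps.
    */
  def score(query: String, target: String,
            matchScore: Int = 1, mismatch: Int = -1, gap: Int = -2): Int = {
    // Row 0 is all zeros: target bases before the aligned region cost nothing.
    var prev = Array.fill(target.length + 1)(0)
    for (i <- 1 to query.length) {
      val curr = new Array[Int](target.length + 1)
      // With no target bases consumed, every query base must sit against a gap.
      curr(0) = i * gap
      for (j <- 1 to target.length) {
        val sub  = if (query(i - 1) == target(j - 1)) matchScore else mismatch
        val diag = prev(j - 1) + sub   // query base aligned to target base
        val up   = prev(j) + gap       // query base aligned to a gap
        val left = curr(j - 1) + gap   // target base aligned to a gap
        curr(j) = math.max(diag, math.max(up, left))
      }
      prev = curr
    }
    // Target bases after the aligned region are also free, so the alignment
    // may end at any column of the last row.
    prev.max
  }
}
```

With the default scores, `GlocalSketch.score("ACGT", "TTACGTTT")` is 4: the query matches the target exactly at an internal position and the flanking target bases are not penalized.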

Release 0.5.1

27 Feb 22:11

Release 0.5.1 is a minor bug-fix release and introduces the following changes:

  • ExtractUmisFromBam
    • Improved error messaging
    • Fixed bug that prevented it from working when only one read per pair contained a UMI
  • GroupReadsByUmi now adds the sub-sort SS tag to the header of BAMs produced
  • CallMolecularConsensusReads and CallDuplexConsensusReads attempt to detect the sort order of input data and will fail if the sort order is incompatible
  • DemuxFastqs changed some output metrics from 32-bit Int to 64-bit Long to avoid overflows on NovaSeq data
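The Int-to-Long change above guards against silent wrap-around in 32-bit counters. The sketch below shows the failure mode in isolation; the object name `CounterOverflowSketch` and the read count of three billion are illustrative values, not figures from fgbio.

```scala
object CounterOverflowSketch {
  /** Increments a 32-bit counter; wraps silently past Int.MaxValue. */
  def incrementInt(count: Int): Int = count + 1

  /** Increments a 64-bit counter; far more headroom before any wrap. */
  def incrementLong(count: Long): Long = count + 1L

  // An illustrative read count that does not fit in a 32-bit Int.
  val readCount: Long = 3000000000L

  // Narrowing to Int keeps only the low 32 bits, giving a negative number.
  val truncated: Int = readCount.toInt
}
```

Here `incrementInt(Int.MaxValue)` returns `Int.MinValue`, and `truncated` is -1294967296, while the Long counter holds the true value.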

Release 0.5.0

11 Feb 15:31

Release 0.5.0 introduces the following changes to existing tools:

  • CallDuplexConsensusReads: Fixed a rare bug where the consensus base quality could be zero or one if the two strands' base qualities differ by two or less.
  • FilterConsensusReads: Fix for bug where duplex reads formed from raw reads from a single strand only could be incorrectly filtered.
  • CorrectUmis: Now stores the original UMI sequences in the OX tag upon correction.
  • DemuxFastqs: Bug fix to correct quality scores in output BAM files
  • ClipOverlappingReads: Removed previously deprecated tool. Use ClipBam instead.
  • ClipBam:
    • Now optionally outputs metrics about clipping present in reads before and after execution.
    • New option to "upgrade" clipping, e.g. replace existing soft-clipping with hard-clipping

Changes to APIs were as follows:

  • Various deprecated methods were removed this release.
  • Metric formatting now prints smaller Doubles in scientific notation, and the formatting is generally more efficient.
  • NeedlemanWunchAligner gained a Glocal alignment mode for aligning all of a query sequence to a sub-region of a target sequence

Release 0.4.0

15 Nov 17:41
a9445b4

Release 0.4.0 introduces the following changes to existing tools:

  • CallDuplexConsensusReads
    • The single strand consensus bases and quals for each duplex consensus read are output into tags on the duplex consensus read
    • Added option to output consensus reads that are formed from only a single strand
  • FilterConsensusReads
    • New option to filter out reads with low mean base quality
    • New option to filter out reads whose minimum depth is too low
    • New option to filter duplex consensus reads where the single strand consensuses disagree
    • New optional tags will store the single-strand consensus bases and qualities for duplex consensus reads.
  • DemuxFastqs
    • will no longer output /1 and /2 on read names when running in Illumina standards mode
    • fixed a bug causing an exception when the sample barcode is found in multiple reads (ex. i5 and i7)
  • ErrorRateByReadPosition - fixed bug that resulted in C>G errors being counted as A>G errors
  • GroupReadsByUmi
    • Reads with UMIs with Ns in them are now rejected
    • Log messages added with counts of reads filtered out by reason
    • Memory usage improvements when grouping reads at very, very high depth.
    • Supports enforcing a minimum UMI length and partial UMIs except for the paired strategy (duplex sequencing).

Finally, changes to various APIs were as follows:

  • Method in Bams to sort records by tag, or by a function applied to a tag
  • Improve speed of Metric.read for loading large numbers of rows from metrics files
  • Changed SamSource to extend IterableView instead of Iterable so that map(), filter(), etc. return lazy views
  • Fixed a bug where the specified temporary directory was not being used for sorting.
  • Added a BinomialDistribution class implemented using unlimited precision decimal math which is slower, but allows computation of cumulative probabilities where other implementations overflow or underflow
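The trade-off described for the BinomialDistribution class above (slower, but free of overflow and underflow) can be sketched with exact decimal arithmetic. This is not the fgbio class: the object `BinomialSketch` and its method names are hypothetical, and it relies on `java.math.BigDecimal` operations without a `MathContext`, which do not round.

```scala
import java.math.{BigDecimal => JBigDecimal, BigInteger}

object BinomialSketch {
  /** n choose k, computed exactly; each division in the fold is exact. */
  def choose(n: Int, k: Int): BigInteger =
    (1 to k).foldLeft(BigInteger.ONE) { (acc, i) =>
      acc.multiply(BigInteger.valueOf((n - k + i).toLong))
        .divide(BigInteger.valueOf(i.toLong))
    }

  /** P(X = k) for X ~ Binomial(n, p), with no rounding at any step. */
  def probability(k: Int, n: Int, p: JBigDecimal): JBigDecimal = {
    val q = JBigDecimal.ONE.subtract(p)
    new JBigDecimal(choose(n, k)).multiply(p.pow(k)).multiply(q.pow(n - k))
  }

  /** P(X <= k), summed exactly. */
  def cumulative(k: Int, n: Int, p: JBigDecimal): JBigDecimal =
    (0 to k).foldLeft(JBigDecimal.ZERO)((sum, i) => sum.add(probability(i, n, p)))
}
```

As a usage note, `math.pow(0.5, 5000)` underflows to 0.0 in double arithmetic, whereas `BinomialSketch.probability(0, 5000, new java.math.BigDecimal("0.5"))` remains a positive value. Compare results with `compareTo` rather than `equals`, since `java.math.BigDecimal.equals` also compares scale.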

Release 0.3.0

05 Oct 19:22

Release 0.3.0 introduces the following changes to existing tools:

  • ClipBam - The --overlapping-reads option was not being used internally, which caused overlapping reads to always be clipped. It is now deprecated in favor of --clip-overlapping-reads.
  • CollectDuplexSeqMetrics - Added the optional output of duplex-umi frequencies with DuplexUmiMetrics.
  • DemuxFastqs - The default output sort order is changed from Unsorted to Queryname. Added an option --illumina-standards to output file names using Illumina naming conventions. Tuned the amount of memory used, especially for a large number of samples (>96).
  • CallDuplexConsensusReads - No longer throws an exception when potential collisions are found in duplex molecules; instead, no consensus read is generated.
  • FilterBam - adding a few more filters.
  • Added a global parameter for log-level.

In addition, the following new tools were added:

  • CollectErccMetrics - This will collect metrics for analyzing ERCC spike-ins in RNA-Seq experiments for dose response but not fold-change response.

Finally, changes to various APIs were as follows:

  • ReferenceSetBuilder - Moved to the testing packages for use in projects that extend fgbio.
  • Alignment - Added subByQuery() and subByTarget() methods to Alignment.