Releases: nextstrain/augur
32.0.0
These release notes are automatically extracted from the full changelog.
Major Changes
- ancestral, translate: These will now error when the length of any reference gene is indivisible by 3, instead of silently padding with N to translate to 'X'. #1895 (@victorlin)
- augur.utils.load_features is deprecated and will be removed in a future major version. Users should use augur.io.load_features instead. #1912 (@victorlin)
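The divisibility rule for ancestral and translate can be illustrated with a minimal sketch. This is not Augur's code: the function name `check_gene_length` and the 0-based, end-exclusive coordinates are assumptions made for the example.

```python
def check_gene_length(name: str, start: int, end: int) -> int:
    """Return the number of codons in a gene, or raise if its length is not a multiple of 3.

    Hypothetical helper: `start` and `end` are assumed to be 0-based,
    end-exclusive coordinates.
    """
    length = end - start
    if length % 3 != 0:
        # Fail loudly instead of padding with N and translating the tail to 'X'.
        raise ValueError(
            f"Gene {name!r} has length {length}, which is not divisible by 3."
        )
    return length // 3
```

Under these assumptions, a gene spanning 9 bases yields 3 codons, while one spanning 10 bases raises an error rather than producing a trailing 'X'.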
Features
- augur curate apply-record-annotations will now warn if an annotation was unnecessary, often indicative of the upstream data being updated. #1893 (@jameshadfield)
31.5.0
- A new command, augur subsample, supports complex subsampling using file-based configuration. See the updated Filtering and Subsampling guide for a comparison with augur filter. #635 (@victorlin)
31.4.0
Features
- schema: Allow parentheses (()) in gene names. #1819 (@kimandrews)
- geolocation rules: Add rules to define region per country to ensure that regions are labelled for all countries. This is especially useful for data sources that do not include region in the metadata. #1844 (@joverlee521)
- support numpy v2 in addition to v1. #1855 (@corneliusroemer)
- support for Python 3.13. #1857 (@corneliusroemer)
- tree: Prefer iqtree3 binary over iqtree2 and iqtree when available. #1875 (@joverlee521)
- export v2: URLs encoded in metadata (both TSV and node-data JSONs) will be associated with the value in the exported JSON. Given a column/key <X>, a valid URL in a column/key named <X>__url will be used automatically. This allows values to be a clickable link when viewed in Auspice. #1852 (@jameshadfield)
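The `<X>__url` naming convention from the export v2 entry can be sketched as follows. The function `attach_urls`, the `{"value": ..., "url": ...}` output shape, and the simple scheme check for what counts as a valid URL are all assumptions for illustration, not Augur's implementation.

```python
URL_SUFFIX = "__url"


def attach_urls(record: dict) -> dict:
    """Pair each value with the URL found in its `<X>__url` companion key, if any."""
    result = {}
    for key, value in record.items():
        if key.endswith(URL_SUFFIX):
            continue  # companion keys are folded into their base key
        entry = {"value": value}
        url = record.get(key + URL_SUFFIX)
        # Assumed validity check: only accept http(s) URLs.
        if url and url.startswith(("http://", "https://")):
            entry["url"] = url
        result[key] = entry
    return result
```

For example, a record with `accession` and `accession__url` keys would produce a single `accession` entry carrying both the value and its link, while keys without a companion keep only their value.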
Bug fixes
- filter: Improved speed of using --group-by month on large datasets. #1845 (@victorlin)
- merge: Added validation to require at least two sequence inputs for merging, consistent with metadata merging behavior. #1865 (@victorlin)
- validate: Send all log messages to stderr. #1869 (@victorlin)
- validate: Only print the entire merged Auspice config to stderr when there's a validation error. #1878 (@joverlee521)
31.3.0
Features
- traits: Added new options --branch-labels and --branch-confidence to export branch labels for nodes which have a corresponding state change. These are useful for creating streamtrees which convey geographic jumps. #1814 (@jameshadfield)
- filter, merge: Added a new option --nthreads to configure parallelism. Right now, it is only passed to SeqKit, but it may be used for other internal optimizations in the future. #1833 (@victorlin)
- filter: Added a new option --skip-checks to bypass checks for duplicates in sequences and whether ids in metadata have a sequence entry. Mainly useful when working with larger files. #1833 (@victorlin)
- Added a new AUGUR_PROFILE environment variable. If set, Augur will run with Python's cProfile profiler and save results to the value, which should be a file path. This may result in slightly slower run times, and should only be used for debugging purposes. #1835 (@victorlin)
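How an environment variable can switch on cProfile is sketched below. The wrapper `run_with_optional_profile` is hypothetical and only illustrates the general pattern using the standard library; it does not reproduce how Augur wires up AUGUR_PROFILE internally.

```python
import cProfile
import os


def run_with_optional_profile(main, env_var="AUGUR_PROFILE"):
    """Run `main()`; if `env_var` is set, profile the call and dump stats to that path."""
    output_path = os.environ.get(env_var)
    if not output_path:
        return main()  # no profiling overhead when the variable is unset
    profiler = cProfile.Profile()
    try:
        return profiler.runcall(main)
    finally:
        # Write the stats even if `main` raised, so failures can be inspected too.
        profiler.dump_stats(output_path)
```

A stats file written this way can be read back with the standard `pstats` module, for instance `pstats.Stats(path).sort_stats("cumulative").print_stats(10)`.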
Bug fixes
- filter, merge: Improved run time of sequence I/O operations, especially in the common use case of having a workflow manager run multiple invocations simultaneously. #1833 (@victorlin)
- filter, merge: Previously, SeqKit was hardcoded to use its default of 4 threads per command, which could have resulted in oversubscription of resources in the common use case of having a workflow manager run multiple invocations simultaneously. The default behavior has been updated to use 1 thread per command to discourage oversubscription of resources. It is configurable with the new --nthreads option described above. #1833 (@victorlin)
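The oversubscription concern comes down to simple arithmetic, shown here as a rough sketch. The helper names and the example figures (8 concurrent jobs on 8 cores) are illustrative assumptions, not measurements from Augur.

```python
def total_threads(concurrent_jobs: int, threads_per_job: int) -> int:
    """Threads requested when a workflow manager runs several commands at once."""
    return concurrent_jobs * threads_per_job


def is_oversubscribed(concurrent_jobs: int, threads_per_job: int, available_cores: int) -> bool:
    """True when the commands together ask for more threads than there are cores."""
    return total_threads(concurrent_jobs, threads_per_job) > available_cores


# Hypothetical scenario: 8 jobs scheduled on an 8-core machine.
old_default = is_oversubscribed(8, 4, 8)  # 32 threads requested on 8 cores
new_default = is_oversubscribed(8, 1, 8)  # 8 threads requested on 8 cores
```

With 4 threads per command the hypothetical machine is asked for four times as many threads as it has cores; with 1 thread per command the request matches the core count.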
31.2.1
Bug fixes
- curate format-dates: Removed redundant warning messages that were previously displayed when using --failure-reporting "warn". #1816 (@victorlin)
- filter: Improved performance of --output-sequences by using SeqKit internally. #1794 (@victorlin)
- filter: Improved performance when using --sequences without --sequence-index by skipping indexing of --sequences when no sequence-based filters are used. #1827 (@victorlin)
- filter: Fixed a bug that prevented proper checking of duplicates and sequence index mismatches on VCF inputs. #1826 (@victorlin)
- merge: Fixed a performance bug where input sequence file validation unnecessarily loaded file contents into device memory. #1820 (@victorlin)
- refine: Fixed a bug where inferred dates were being wrongly marked as not inferred. #1829 (@victorlin)
31.2.0
Features
- merge: Support merging of sequence files with --sequences. #1579 (@victorlin)
- read-file: Multiple files are now accepted. #1815 (@victorlin)
- schema: Added fields for streamtrees and default zoom branch label. #1813 (@jameshadfield)
Bug fixes
31.1.0
Features
- schema: Allow full stop character (.) in gene names. #955 (@jameshadfield)
Bug fixes
- filter: Improved speed of using --group-by, --min-date, and --max-date on large datasets. #1792, #1811 (@victorlin)
31.0.0
Major Changes
- augur mask --mask, augur tree --exclude-sites: BED files with inconsistent CHROM values (i.e., values in the first column of data lines) will throw an error, as Augur (implicitly) expects to be working on a single piece of DNA (chromosome, segment, etc), and multiple CHROM values in a BED file indicate a violation of this expectation. This is a breaking change. #945 (@genehack)
- filter: Empty values in the metadata id column will result in an error that can only be resolved by editing the metadata file or by specifying a different id column with --metadata-id-columns. #1807 (@joverlee521)
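The single-CHROM expectation for BED files can be sketched as a small check. The function `single_chrom` is hypothetical, and the header detection is a simplification that assumes header lines start with `browser`, `track`, or `#`; it is not a copy of `augur.utils.read_bed_file()`.

```python
HEADER_PREFIXES = ("browser", "track", "#")


def single_chrom(bed_lines):
    """Return the one CHROM value used in the data lines, or None if there are no data lines."""
    chroms = set()
    for line in bed_lines:
        stripped = line.strip()
        # Skip blank lines and (simplified) header lines.
        if not stripped or stripped.startswith(HEADER_PREFIXES):
            continue
        chroms.add(stripped.split()[0])  # CHROM is the first column
    if len(chroms) > 1:
        raise ValueError(f"BED file has inconsistent CHROM values: {sorted(chroms)}")
    return chroms.pop() if chroms else None
```

In this sketch, a file with only header lines or no lines at all simply returns `None` rather than raising, while mixed CHROM values raise an error.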
Bug fixes
- augur mask --mask, augur tree --exclude-sites: Providing an empty BED file, or one with only header lines and no data lines, will no longer cause an error to be thrown. #945 (@genehack)
- augur.utils.read_bed_file() was rewritten for increased compliance with the BED file specification. In particular, header line detection is improved and multiple header lines are now supported. #945 (@genehack)
- export v2: Improved the error message that is displayed when the metadata index column has duplicated values. #1791 (@genehack)
- tree: Improved help text for --tree-builder-args to explain some IQ-TREE options won't work because of defline rewriting. #875 (@genehack)
- export v2: Automatically rename fields within the filters and colorings configs of the provided auspice config file to match the renamed fields in the exported nodes. #1804 (@joverlee521)
- export v2: Divergence values are now exported with increased precision, showing up to 6 significant digits instead of 3. #1801 (@rneher)
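The difference between 3 and 6 significant digits is easy to see with Python's general float format. The helper `to_significant_digits` and the sample divergence value are assumptions for illustration; the sketch does not claim to be how export v2 rounds values.

```python
def to_significant_digits(value: float, digits: int) -> float:
    """Round `value` to the given number of significant digits."""
    return float(f"{value:.{digits}g}")


divergence = 0.000123456789  # made-up divergence value
three_digits = to_significant_digits(divergence, 3)  # 0.000123
six_digits = to_significant_digits(divergence, 6)    # 0.000123457
```

For small divergence values like this one, the extra digits preserve differences between closely related nodes that 3 significant digits would round away.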
30.0.1
Bug fixes
- filter: Removed the note that appeared in output when running with --sequences and without --sequence-index. The help text of both options has been updated to clarify the relationship between the two. #1797 (@victorlin)
30.0.0
Major Changes
Note: The following breaking changes were effective as of version 29.1.0.
- filter: Date values in <year>-<month> format with more than 4 digits in the year (e.g. 02025-04) or more than 2 digits in the month (e.g. 2025-004) are no longer supported. Support for these was unintentional, but it worked in practice. #1786 (@victorlin)
- filter: Date values in <year>-<month>-<day> format that fall outside of valid date boundaries now fail with an error. For example, 2025-00-01 is invalid. Previously, all date parts were treated categorically without date validation so month=0 was its own category. #1786 (@victorlin)
- filter: Date values in <year>-<month> format that fall outside of valid date boundaries are now auto-converted to the closest date. For example, 2025-00 will be auto-converted to 2025-01. Previously, all date parts were treated categorically without date validation so month=0 was its own category. It will now be treated as month=1. This is a side-effect of the change in 29.1.0 that switched to the same internal date parsing function that is used by other commands. A future major version may change behavior to fail with an error to better align with handling of <year>-<month>-<day>. #1774 (@victorlin)
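The three date rules above can be summarized in a minimal sketch. The function `normalize_date`, its regular expressions, and the clamping of months above 12 down to 12 are assumptions made for illustration (the notes only give 2025-00 as an example); this is not Augur's internal date parser.

```python
import datetime
import re

# At most 4 year digits and at most 2 digits for month and day.
YEAR_MONTH = re.compile(r"^(\d{1,4})-(\d{1,2})$")
YEAR_MONTH_DAY = re.compile(r"^(\d{1,4})-(\d{1,2})-(\d{1,2})$")


def normalize_date(value: str) -> str:
    """Validate or normalize a date string following the rules described above."""
    match = YEAR_MONTH_DAY.match(value)
    if match:
        year, month, day = (int(part) for part in match.groups())
        # datetime.date raises ValueError for out-of-range parts such as month=0.
        return datetime.date(year, month, day).isoformat()
    match = YEAR_MONTH.match(value)
    if match:
        year, month = int(match.group(1)), int(match.group(2))
        month = min(max(month, 1), 12)  # assumed: clamp to the closest valid month
        return f"{year:04d}-{month:02d}"
    # Covers values such as 02025-04 and 2025-004.
    raise ValueError(f"Unsupported date value: {value!r}")
```

Under these assumptions, `2025-00` becomes `2025-01`, whereas `2025-00-01`, `02025-04`, and `2025-004` all raise an error.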
Bug fixes
- filter: Version 29.1.0 inadvertently dropped support for date values in <year>-<month> or <year>-<month>-<day> format that are not in YYYY-MM or YYYY-MM-DD format. Support for some values has been restored. See the "Major Changes" section for details on which values are explicitly no longer supported. #1785 (@victorlin)