@@ -4,6 +4,184 @@ Below are the release notes for the full RTG suite, upon which
 RTG.VERSION is based. Not all features described below may be included
 in this product.
 
+
+ RTG Core 3.10 (2018-10-29)
+ --------------------------
+
+ This release primarily contains smaller improvements and bugfixes.
+ Several of these result in changes to command line arguments or
+ program outputs, so check existing scripts for compatibility before
+ upgrading. Larger features of note:
+
+ * Several improvements to simulation tools. In particular, a new command
+ pedsamplesim has been included that makes it very easy to simulate
+ multiple samples at once, given a pedigree file. pedsamplesim
+ automatically simulates founder individuals, inheritance by children,
+ and de novo mutations.
+
+ * Java 11 compatibility testing. RTG is compatible with Java 11,
+ although currently we recommend Java 8 for performance reasons. Also
+ note that due to differences in the Java Math library implementation
+ after Java 8, in rare situations minor output differences may be
+ observed when comparing results obtained using Java 8 with later Java
+ versions. Builds that include a bundled JRE have been updated to the
+ latest JRE 8u181.
+
+ There have been many other minor improvements and feature
+ additions. Detailed changes are listed below by area. For more
+ information on new features, see the RTG Operations Manual.
+
+ ### Basic Formatting and Mapping
+
+ * petrim: Now outputs read length distribution statistics.
+
+ * petrim: Fixed an incorrect filename extension being used for fragment
+ and overlap length distribution output files.
+
+ * map: Now allows the use of both --repeat-freq and
+ --blacklist-threshold at the same time.
+
+ * map: Unmapped but placed reads have had minor adjustments made to
+ their expected mapping position. As well as causing changes to BAM
+ annotations, this can cause subsequent changes to variant calling
+ annotations (such as AVR scores).
+
+ * map: Fixed a rare crash that could occur when mapping a male sample.
+ This fix can similarly cause some changes in subsequent variant
+ calling.
+
+ * sammerge: New flag --min-read-length to permit filtering out
+ alignments where the read length is below the specified threshold.
+
+ * sammerge: New flag --select-read-group to include only alignments from
+ the specified read groups.
+
+ * sammerge: New flag --remove-duplicates to detect and remove duplicate
+ reads based on mapping position. This is similar to the duplicate
+ detection that analysis tools such as variant callers normally perform
+ on the fly.
+
+ * sammerge: Supports --Xforce to allow overwriting existing output
+ files.
+
+ * sdfsubset/sdfsplit: These commands now pass SAM read group information
+ from the input SDF to the output SDF.
+
+
+ ### Variant Calling
+
+ * variant callers: The GT fields for unphased calls are now in a
+ normalized (numerically increasing) format. Previously the choice of
+ allele ordering for alleles within a GT field was somewhat arbitrary,
+ giving the impression of some significance where there was none.
+
+ * variant callers: Population variants loaded via --population-priors
+ are only used to refine complex call regions when the non-reference
+ allele fractions for the variant are higher than 1%. Previously the
+ use of a population priors source such as gnomAD that includes many
+ rare variants could lead to reduced sensitivity.
+
+ * variant callers: Improved the ability to identify candidate local
+ haplotypes when jointly calling a large number of samples or where
+ there is wide variation in coverage between samples. The effect of
+ this is greater sensitivity to rare variants such as singletons and de
+ novo variants.
+
+ * variant callers: Ignore SAM records where the reads have zero length.
+
+ * many: Fixed a bug where region-based SAM/BAM record retrieval could
+ sometimes skip records in the case of a small inter-region gap.
+
+ * segment: The --min-panel-coverage option has been renamed to
+ --min-norm-control-coverage (with extended functionality).
+
+ * avrbuild: New flag --annotated that allows supplying positive/negative
+ labels via annotations on each VCF record, as an alternative to
+ supplying separate positive and negative VCFs. The supported
+ annotation format is the same as that produced by vcfeval
+ --output-mode=annotate.
+
+ * avrbuild: New flag --bed-regions to only read those training instances
+ that overlap the specified regions. This is a convenience method that
+ can be used to train on a specific subset of the data.
+
+
+ ### Variant Processing and Analysis
+
+ * svdecompose: Fixed a crash caused by records where SVTYPE=INS but
+ where the record did not also contain an SVLEN annotation. These
+ records are now ignored.
+
+ * vcfdecompose: Fixed a crash on records that did not contain a GT
+ format field. This also affected vcfeval when using --decompose. In
+ addition, the error reporting for records with invalid GT fields has
+ been improved.
+
+ * many: Clearer error handling for VCF records that are invalid due to
+ extra TABs.
+
+ * rocplot: Moved the legend for precision/sensitivity graphs to the
+ left-hand side, where it is less likely to obstruct the curves
+ themselves.
+
+ * vcfannotate: Change in matching semantics when annotating with
+ IDs. Now uses the span of the record rather than just the start
+ position.
+
+ * many: New derived annotation VAF1 that contains the VAF of the most
+ frequent alt allele. Being a single value annotation, it can be easily
+ used during AVR model building.
+
+ * vcfmerge: Fixed a crash that could occur when trying to merge a record
+ containing duplicated alleles.
+
+
+ ### Other
+
+ * samplesim: Changed the behaviour when simulating from VCF records
+ without an AF annotation. These variants are now ignored (i.e. never
+ selected for use by the sample); previously samplesim would treat all
+ alleles as equally likely. The old behaviour is available via the new
+ flag --allow-missing-af.
+
+ * childsim: The misleadingly named flag --num-crossovers has been
+ renamed to --extra-crossovers.
+
+ * denovosim: Now allows the original and derived sample names to be the
+ same, in which case the sample in the output VCF is updated rather
+ than creating a new sample column.
+
+ * denovosim: No longer sets the DN flag to "N" for samples not receiving
+ the de novo mutation, as in multi-sample simulation scenarios this is
+ not a reliable indicator.
+
+ * denovosim: Fixed a bug when determining if a putative de novo site
+ would overlap with pre-existing variants.
+
+ * pedsamplesim: New command that allows simulating several samples in
+ one run according to a pedigree. This uses the methods of samplesim,
+ denovosim, and childsim to greatly ease the simulation of multiple
+ samples.
+
+ * pedstats: New flag --delimiter that can be used to output sample
+ identifiers with an alternative delimiter. For example, use comma as a
+ delimiter when directly supplying a sample list to vcfsubset
+ --keep-samples.
+
+ * simulation tools: Most commands now support --Xforce to overwrite
+ existing files.
+
+ * simulation tools: Improvements have been made to parameter validation.
+
+ * misc: Updates for compatibility with Java 11. However, for performance
+ reasons we recommend using Java 8 for computationally intensive
+ analysis such as mapping and variant calling.
+
+ * misc: Update bundled JRE to 1.8.0_181.
+
+ * misc: Improved percentage memory allocation behaviour when total
+ system memory cannot be determined. Now falls back to the Java
+ default memory allocation.
+
+
 RTG Core 3.9.1 (2018-05-29)
 ---------------------------
 
@@ -434,7 +612,7 @@ Major features of this release:
 simulation of population-level variants (popsim), individual sample
 genomes using population variants (samplesim), simulation of samples
 as member of a pedigree obeying inheritance rules (childsim),
- simulation of de-novo variants (denovosom ), generation of a genome
+ simulation of de-novo variants (denovosim ), generation of a genome
 given a VCF of sample variants (samplereplay), and read simulation
 according to a range of sequencer parameters (readsim/cgsim).
 