Release [2.2.0] - Ulm - 2020-10-21 · nf-core/eager

`Added`

Major Automated cloud tests with large-scale data on AWS
Major Re-wrote input logic to accept a TSV 'map' file in addition to direct paths to FASTQ files
Major Added JSON Schema, enabling a web GUI for pipeline configuration
Major Lane and library merging implemented
- When using TSV input, one library with multiple lanes will be merged together before mapping
  - Strip FASTQ will also produce a lane merged 'raw' but 'stripped' FASTQ file
- When using TSV input, one sample with multiple (same treatment) libraries will be merged together
- Important: direct FASTQ paths will not have this functionality. TSV is required.
#40 - Added the pileupCaller genotyper from sequenceTools
Added validation check and clearer error message when --fasta_index is provided and filepath does not end in .fai.
Improved error messages
Added ability for automated emails using mailutils to also send MultiQC reports
General documentation additions, cleaning, and updated figures with CC-BY license
Added large 'full size' dataset test-profiles for ancient fish and human contexts
#257 - Added the bowtie2 aligner as option for mapping, following Poullet and Orlando 2020 doi: 10.3389/fevo.2020.00105
#451 - Adds ANGSD genotype likelihood calculations as an alternative to typical 'genotypers'
#566 - Add tutorials on how to set up nf-core/eager for different contexts
Nuclear contamination results are now shown in the MultiQC report
Tutorial on how to use profiles for reproducible science (i.e. parameter sharing between different groups)
#522 - Added post-mapping length filter to assist in more realistic endogenous DNA calculations
#512 - Added flexible trimming of BAMs by library type. 'half' and 'none' UDG libraries can now be trimmed differentially within a single eager run.
Added a .dockstore.yml config file for automatic workflow registration with dockstore.org
Updated template to nf-core/tools 1.10.2
#544 - Add script to perform bam filtering on fragment length
#456 - Bumps the base (default) runtime of all processes to 4 hours, and sets shorter time limits for test profiles (1 hour)
#552 - Adds optional creation of MALT SAM files alongside RMA6 files
Added eigenstrat snp coverage statistics to MultiQC report. Process results are published in genotyping/*_eigenstrat_coverage.txt.
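
The lane and library merging described above can be pictured as two grouping steps over the TSV rows. The following is only a minimal Python sketch, not the pipeline's own (Nextflow) implementation: the column names, sample values, and the `plan_merges` helper are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical TSV input; column names and values are assumptions for illustration.
TSV = """Sample_Name\tLibrary_ID\tLane\tUDG_Treatment\tR1
JK01\tLib1\t1\thalf\tlib1_L1.fq.gz
JK01\tLib1\t2\thalf\tlib1_L2.fq.gz
JK01\tLib2\t1\thalf\tlib2_L1.fq.gz
JK01\tLib3\t1\tnone\tlib3_L1.fq.gz
"""


def plan_merges(tsv_text):
    """Group rows into lane merges (per library) and library merges (per sample and treatment)."""
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

    # Step 1: lanes of the same library are merged together before mapping.
    lanes = defaultdict(list)
    for row in rows:
        key = (row["Sample_Name"], row["Library_ID"], row["UDG_Treatment"])
        lanes[key].append(row["R1"])

    # Step 2: libraries of the same sample AND the same treatment are merged together.
    libraries = defaultdict(list)
    for sample, library, treatment in lanes:
        libraries[(sample, treatment)].append(library)

    return dict(lanes), dict(libraries)


lane_groups, library_groups = plan_merges(TSV)
for key, files in lane_groups.items():
    print("lane merge", key, files)
for key, libs in library_groups.items():
    print("library merge", key, libs)
```

In this sketch the two lanes of `Lib1` form one lane-merge group, while `Lib3` stays separate from `Lib1` and `Lib2` at the library level because its treatment differs.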

`Fixed`

#368 - Fixed the profile test to contain a parameter for --paired_end
Mini bugfix for typo in line 1260+1261
#374 - Fixed output documentation rendering not containing images
#379 - Fixed insufficient memory requirements for FASTQC edge case
#390 - Renamed clipped/merged output directory to be more descriptive
#398 - Stopped incompatible FASTA indexes being accepted
#400 - Set correct recommended bwa mapping parameters from Schubert et al. 2012
#410 - Fixed nf-core/configs not being loaded properly
#473 - Fixed bug in sexdet_process on AWS
#444 - Provide option for preserving realigned bam + index
Fixed deduplication output logic. Will now pass along only the post-rmdup bams if duplicate removal is not skipped, instead of both the post-rmdup and pre-rmdup bams
#497 - Simplifies number of parameters required to run bam filtering
#501 - Adds additional validation checks for MALT/MaltExtract database input files
#508 - Made MarkDuplicates default dedupper due to narrower context specificity of dedup
#516 - Made bedtools not report out of memory exit code when warning of inconsistent FASTA/Bed entry names
#504 - Removed uninformative sexdeterrmine-snps plot from MultiQC report.
Nuclear contamination is now reported with the correct library names.
#531 - Renamed 'FASTQ stripping' to 'host removal'
Merged all tutorials and FAQs into usage.md for display on nf-co.re
Corrected header of nuclear contamination table (nuclear_contamination.txt).
Fixed a bug with nSNPs definition in print_x_contamination.py. Number of SNPs now correctly reported
print_x_contamination.py now correctly converts all NA values to "N/A"
Increased the amount of memory MultiQC uses by default, to account for very large nf-core/eager runs (e.g. >1000 samples)
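
The `--fasta_index` validation mentioned in this release (the path must end in `.fai`) amounts to a suffix test with a clear error message. The Python sketch below only illustrates that idea; the function name and the error wording are assumptions, and the pipeline itself performs the check in Nextflow.

```python
def check_fasta_index(path):
    """Return the path unchanged if it ends in .fai, otherwise raise ValueError.

    Hypothetical helper: the name and message are illustrative, not the pipeline's.
    """
    if not path.endswith(".fai"):
        raise ValueError(
            f"--fasta_index must point to a file ending in .fai, got: {path}"
        )
    return path


print(check_fasta_index("reference.fasta.fai"))
try:
    check_fasta_index("reference.fasta")
except ValueError as err:
    print("rejected:", err)
```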

`Dependencies`

Added sequenceTools (1.4.0.6) that adds the ability to do genotyping with the 'pileupCaller'
Latest version of DeDup (0.12.6) which now reports mapped reads after deduplication
#560 - Latest version of DeDup (0.12.7), which now correctly reports deduplication statistics based on calculations of mapped reads only (prior denominator was total reads of BAM file)
Latest version of ANGSD (0.933) which doesn't seg fault when running contamination on BAMs with insufficient reads
Latest version of MultiQC (1.9) with support for lots of extra tools in the pipeline (MALT, SexDetERRmine, DamageProfiler, MultiVCFAnalyzer)
Latest versions of Pygments (2.6.1), Pymdown-Extensions (7.1) and Markdown (3.2.2) for documentation output
Latest version of Picard (2.22.9)
Latest version of GATK4 (4.1.7.0)
Latest version of sequenceTools (1.4.0.6)
Latest version of fastP (0.20.1)
Latest version of Kraken2 (2.0.9beta)
Latest version of FreeBayes (1.3.2)
Latest version of xopen (0.9.0)
Added Bowtie 2 (2.4.1)
Latest version of Sex.DetERRmine (1.1.2)
Latest version of endorS.py (0.4)
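
To see why the denominator change in DeDup 0.12.7 matters, the sketch below compares a duplication fraction computed against all reads in a BAM file with one computed against mapped reads only. The read counts are invented and the formula is a simplified assumption for illustration, not DeDup's actual calculation.

```python
def duplication_fraction(duplicates_removed, denominator):
    """Fraction of reads removed as duplicates, relative to the chosen denominator."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return duplicates_removed / denominator


# Invented counts for a low-endogenous ancient DNA library (assumption).
total_reads = 1_000_000       # every read in the BAM file, mapped or not
mapped_reads = 50_000         # reads that aligned to the reference
duplicates_removed = 10_000   # mapped reads discarded as duplicates

rate_total = duplication_fraction(duplicates_removed, total_reads)
rate_mapped = duplication_fraction(duplicates_removed, mapped_reads)

print(f"against total reads:  {rate_total:.2%}")
print(f"against mapped reads: {rate_mapped:.2%}")
```

With mostly unmapped reads, as is common for ancient DNA, dividing by total reads makes duplication look far lower than it is among the reads that actually mapped.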
