Dsl1 2.5.0 #1021

Merged: 28 commits, Nov 1, 2023
Commits
631f18e
Merge pull request #984 from nf-core/patch
jfy133 May 16, 2023
550225c
Add UDG info to trimbam output bam name.
TCLamnidis Aug 15, 2023
1005ce8
bump multiQC version
TCLamnidis Aug 15, 2023
09af474
add parameters to schema and config
TCLamnidis Aug 18, 2023
3d878c6
Add mapdamage as alternative damage estimator
TCLamnidis Aug 18, 2023
72854ee
rename parameter
TCLamnidis Aug 18, 2023
09ff881
Do not subsample by default in mapdamage
TCLamnidis Aug 18, 2023
eba08a2
Disable downsampling by default
TCLamnidis Aug 18, 2023
d794b8e
Add tests
TCLamnidis Aug 18, 2023
6aedb96
Update CHANGELOG.md
TCLamnidis Aug 18, 2023
1cae42d
Partial update to output.md
TCLamnidis Aug 18, 2023
7dc2334
pass mapdamage output to MQC
TCLamnidis Aug 22, 2023
b133d8b
Update output.md
TCLamnidis Aug 22, 2023
1d03083
Apply suggestions from code review
TCLamnidis Aug 23, 2023
8190c75
Standardise formatting of mapDamage across
TCLamnidis Aug 23, 2023
42d9dca
document besswarm plot fix
TCLamnidis Aug 30, 2023
3a84510
bump MultiQC version
TCLamnidis Oct 13, 2023
83163bb
mention NXF version cap. fix some broken URLs
TCLamnidis Oct 13, 2023
23b74e9
attempt to add mapadamage to multiqc
TCLamnidis Oct 13, 2023
322fdcd
tweak mqc mapdamage linking
TCLamnidis Oct 27, 2023
dabdee2
correct mqc version
TCLamnidis Oct 27, 2023
a61e83a
Update RG tags
TCLamnidis Oct 27, 2023
e87bad6
Update docs/output.md
TCLamnidis Oct 27, 2023
dd157ca
Merge branch 'patch' into dsl1-2.5.0
TCLamnidis Oct 27, 2023
92d8377
always index fasta if no fasta_index
TCLamnidis Nov 1, 2023
f0bd5be
Update Changelog
TCLamnidis Nov 1, 2023
754d5d8
add date to changelog
TCLamnidis Nov 1, 2023
b9c15ef
small bugfix
TCLamnidis Nov 1, 2023
17 changes: 10 additions & 7 deletions .github/workflows/ci.yml
@@ -23,7 +23,7 @@ jobs:
  strategy:
  matrix:
  # Nextflow versions: check pipeline minimum and current latest
- nxf_ver: ['20.07.1', '22.10.6']
+ nxf_ver: ["20.07.1", "22.10.6"]
  steps:
  - name: Check out pipeline code
  uses: actions/checkout@v2
@@ -58,7 +58,7 @@ jobs:
  run: |
  git clone --single-branch --branch eager https://github.com/nf-core/test-datasets.git data
  - name: DELAY to try address some odd behaviour with what appears to be a conflict between parallel htslib jobs leading to CI hangs
- run: |
+ run: |
  if [[ $NXF_VER = '' ]]; then sleep 1200; fi
  - name: BASIC Run the basic pipeline with directly supplied single-end FASTQ
  run: |
@@ -74,7 +74,7 @@ jobs:
  nextflow run ${GITHUB_WORKSPACE} -profile test,docker --save_reference
  - name: REFERENCE Basic workflow, with supplied indices
  run: |
- nextflow run ${GITHUB_WORKSPACE} -profile test,docker --bwa_index 'results/reference_genome/bwa_index/BWAIndex/' --fasta_index 'https://github.com/nf-core/test-datasets/blob/eager/reference/Mammoth/Mammoth_MT_Krause.fasta.fai'
+ nextflow run ${GITHUB_WORKSPACE} -profile test,docker --bwa_index 'results/reference_genome/bwa_index/BWAIndex/' --fasta_index 'https://github.com/nf-core/test-datasets/blob/eager/reference/Mammoth/Mammoth_MT_Krause.fasta.fai'
  - name: REFERENCE Run the basic pipeline with FastA reference with `fna` extension
  run: |
  nextflow run ${GITHUB_WORKSPACE} -profile test_tsv_fna,docker
@@ -107,7 +107,7 @@ jobs:
  nextflow run ${GITHUB_WORKSPACE} -profile test,docker --clip_adapters_list 'https://github.com/nf-core/test-datasets/raw/eager/databases/adapters/adapter-list.txt'
  - name: ADAPTER LIST Run the basic pipeline using an adapter list, skipping adapter removal
  run: |
- nextflow run ${GITHUB_WORKSPACE} -profile test,docker --clip_adapters_list 'https://github.com/nf-core/test-datasets/raw/eager/databases/adapters/adapter-list.txt' --skip_adapterremoval
+ nextflow run ${GITHUB_WORKSPACE} -profile test,docker --clip_adapters_list 'https://github.com/nf-core/test-datasets/raw/eager/databases/adapters/adapter-list.txt' --skip_adapterremoval
  - name: POST_AR_FASTQ_TRIMMING Run the basic pipeline post-adapterremoval FASTQ trimming
  run: |
  nextflow run ${GITHUB_WORKSPACE} -profile test,docker --run_post_ar_trimming
@@ -141,6 +141,9 @@ jobs:
  - name: BEDTOOLS Test bedtools feature annotation
  run: |
  nextflow run ${GITHUB_WORKSPACE} -profile test,docker --run_bedtools_coverage --anno_file 'https://github.com/nf-core/test-datasets/raw/eager/reference/Mammoth/Mammoth_MT_Krause.gff3'
+ - name: MAPDAMAGE2 damage calculation
+ run: |
+ nextflow run ${GITHUB_WORKSPACE} -profile test,docker --damage_calculation_tool 'mapdamage'
  - name: GENOTYPING_HC Test running GATK HaplotypeCaller
  run: |
  nextflow run ${GITHUB_WORKSPACE} -profile test_tsv_fna,docker --run_genotyping --genotyping_tool 'hc' --gatk_hc_out_mode 'EMIT_ALL_ACTIVE_SITES' --gatk_hc_emitrefconf 'BP_RESOLUTION'
@@ -193,11 +196,11 @@ jobs:
  nextflow run ${GITHUB_WORKSPACE} -profile test,docker --run_bam_filtering --bam_unmapped_type 'fastq' --run_metagenomic_screening --metagenomic_tool 'malt' --database "/home/runner/work/eager/eager/databases/malt/" --metagenomic_complexity_filter
  - name: MALTEXTRACT Download resource files
  run: |
- mkdir -p databases/maltextract
- for i in ncbi.tre ncbi.map; do wget https://github.com/rhuebler/HOPS/raw/0.33/Resources/"$i" -P databases/maltextract/; done
+ mkdir -p databases/maltextract
+ for i in ncbi.tre ncbi.map; do wget https://github.com/rhuebler/HOPS/raw/0.33/Resources/"$i" -P databases/maltextract/; done
  - name: MALTEXTRACT Basic with MALT plus MaltExtract
  run: |
- nextflow run ${GITHUB_WORKSPACE} -profile test,docker --run_bam_filtering --bam_unmapped_type 'fastq' --run_metagenomic_screening --metagenomic_tool 'malt' --database "/home/runner/work/eager/eager/databases/malt" --run_maltextract --maltextract_ncbifiles "/home/runner/work/eager/eager/databases/maltextract/" --maltextract_taxon_list 'https://raw.githubusercontent.com/nf-core/test-datasets/eager/testdata/Mammoth/maltextract/MaltExtract_list.txt'
+ nextflow run ${GITHUB_WORKSPACE} -profile test,docker --run_bam_filtering --bam_unmapped_type 'fastq' --run_metagenomic_screening --metagenomic_tool 'malt' --database "/home/runner/work/eager/eager/databases/malt" --run_maltextract --maltextract_ncbifiles "/home/runner/work/eager/eager/databases/maltextract/" --maltextract_taxon_list 'https://raw.githubusercontent.com/nf-core/test-datasets/eager/testdata/Mammoth/maltextract/MaltExtract_list.txt'
  - name: METAGENOMIC Run the basic pipeline but with unmapped reads going into Kraken
  run: |
  nextflow run ${GITHUB_WORKSPACE} -profile test_tsv_kraken,docker --run_bam_filtering --bam_unmapped_type 'fastq'
19 changes: 19 additions & 0 deletions CHANGELOG.md
@@ -3,6 +3,25 @@
  The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/)
  and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.html).

+ ## [2.5.0] - 2023-11-01
+
+ ### `Added`
+
+ - [#1020](https://github.com/nf-core/eager/issues/1020) Added mapdamage2 as an alternative for damage calculation.
+
+ ### `Fixed`
+
+ - [#1017](https://github.com/nf-core/eager/issues/1017) Fixed file name collision in niche cases with multiple libraries of multiple UDG treatments.
+ - [#1024](https://github.com/nf-core/eager/issues/1024) `multiqc_general_stats.txt` is now generated even if the table is a beeswarm plot in the report.
+ - Updated RG tags for all mappers. RG-id now includes Sample as well as Library ID. Added `LB:` tag with the library ID.
+ - [#1031](https://github.com/nf-core/eager/issues/1031) Always index fasta regardless of mapper. This ensures that DamageProfiler and genotyping processes get submitted when using bowtie2 and not providing a fasta index.
+
+ ### `Dependencies`
+
+ - `multiqc`: 1.14 -> 1.16
+
+ ### `Deprecated`
+
  ## [2.4.7] - 2023-05-16

  ### `Added`
15 changes: 9 additions & 6 deletions README.md
@@ -15,6 +18,9 @@

  [![Get help on Slack](http://img.shields.io/badge/slack-nf--core%20%23eager-4A154B?logo=slack)](https://nfcore.slack.com/channels/eager)

+ >[!IMPORTANT]
+ > nf-core/eager versions 2.* are only compatible with Nextflow versions up to 22.10.6!
+
  ## Introduction

  <!-- nf-core: Write a 1-2 sentence summary of what data the pipeline is for and what it does -->
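
Note on the hunk above: the notice caps compatible Nextflow versions at 22.10.6, and the Quick Start and CI matrix in this PR give 20.07.1 as the pipeline minimum. A minimal Python sketch of that version window follows. The helper names `parse_version` and `nextflow_supported` are hypothetical and not part of the pipeline, and the sketch assumes plain dotted numeric versions (suffixes such as `-edge` are not handled).

```python
def parse_version(text):
    """Turn a dotted version string such as '22.10.6' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Bounds as stated in this PR: >=20.07.1 and <=22.10.6 (both inclusive).
MIN_NXF = parse_version("20.07.1")
MAX_NXF = parse_version("22.10.6")


def nextflow_supported(version):
    """Return True if `version` lies inside the supported window."""
    return MIN_NXF <= parse_version(version) <= MAX_NXF


# Illustrative candidates only; the last one is above the cap.
for candidate in ("20.07.1", "21.10.6", "22.10.6", "23.04.0"):
    print(candidate, nextflow_supported(candidate))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes when components have different digit counts.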
@@ -28,7 +31,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool

  ## Quick Start

- 1. Install [`nextflow`](https://nf-co.re/usage/installation) (`>=20.07.1`)
+ 1. Install [`nextflow`](https://nf-co.re/usage/installation) (`>=20.07.1` && `<=22.10.6`)

  2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/), [`Podman`](https://podman.io/), [`Shifter`](https://nersc.gitlab.io/development/shifter/how-to-use/) or [`Charliecloud`](https://hpc.github.io/charliecloud/) for full pipeline reproducibility _(please only use [`Conda`](https://conda.io/miniconda.html) as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_

@@ -52,7 +55,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
nextflow clean -f -k
```

- See [usage docs](https://nf-co.re/eager/docs/usage.md) for all of the available options when running the pipeline.
+ See [usage docs](https://nf-co.re/eager/usage) for all of the available options when running the pipeline.

  **N.B.** You can see an overview of the run in the MultiQC report located at `./results/MultiQC/multiqc_report.html`

@@ -69,7 +72,7 @@ By default the pipeline currently performs the following:
  * Sequencing adapter removal, paired-end data merging (`AdapterRemoval`)
  * Read mapping to reference using (`bwa aln`, `bwa mem`, `CircularMapper`, or `bowtie2`)
  * Post-mapping processing, statistics and conversion to bam (`samtools`)
- * Ancient DNA C-to-T damage pattern visualisation (`DamageProfiler`)
+ * Ancient DNA C-to-T damage pattern visualisation (`DamageProfiler` or `mapDamage`)
  * PCR duplicate removal (`DeDup` or `MarkDuplicates`)
  * Post-mapping statistics and BAM quality control (`Qualimap`)
  * Library Complexity Estimation (`preseq`)
@@ -133,9 +136,9 @@ The nf-core/eager pipeline comes with documentation about the pipeline: [usage](
  * [Pipeline installation](https://nf-co.re/usage/local_installation)
  * [Adding your own system config](https://nf-co.re/usage/adding_own_config)
  * [Reference genomes](https://nf-co.re/usage/reference_genomes)
- 3. [Running the pipeline](https://nf-co.re/eager/docs/usage.md)
+ 3. [Running the pipeline](https://nf-co.re/eager/usage)
  * This includes tutorials, FAQs, and troubleshooting instructions
- 4. [Output and how to interpret the results](https://nf-co.re/eager/docs/output.md)
+ 4. [Output and how to interpret the results](https://nf-co.re/eager/output)

  ## Credits

@@ -246,7 +249,7 @@ In addition, references of tools and data used in this pipeline are as follows:
  * **Bowtie2** Langmead, B. and Salzberg, S. L. 2012 Fast gapped-read alignment with Bowtie 2. Nature methods, 9(4), p. 357–359. doi: [10.1038/nmeth.1923](https:/dx.doi.org/10.1038/nmeth.1923).
  * **sequenceTools** Stephan Schiffels (Unpublished). Download: [https://github.com/stschiff/sequenceTools](https://github.com/stschiff/sequenceTools)
  * **EigenstratDatabaseTools** Thiseas C. Lamnidis (Unpublished). Download: [https://github.com/TCLamnidis/EigenStratDatabaseTools.git](https://github.com/TCLamnidis/EigenStratDatabaseTools.git)
- * **mapDamage2** Jónsson, H., et al 2013. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics , 29(13), 1682–1684. [https://doi.org/10.1093/bioinformatics/btt193](https://doi.org/10.1093/bioinformatics/btt193)
+ * **mapDamage** Jónsson, H., et al 2013. mapDamage2.0: fast approximate Bayesian estimates of ancient DNA damage parameters. Bioinformatics , 29(13), 1682–1684. [https://doi.org/10.1093/bioinformatics/btt193](https://doi.org/10.1093/bioinformatics/btt193)
  * **BBduk** Brian Bushnell (Unpublished). Download: [https://sourceforge.net/projects/bbmap/](sourceforge.net/projects/bbmap/)

  ## Data References
18 changes: 15 additions & 3 deletions assets/multiqc_config.yaml
@@ -17,6 +17,7 @@ run_modules:
  - gatk
  - kraken
  - malt
+ - mapdamage
  - mtnucratio
  - multivcfanalyzer
  - picard
@@ -91,6 +92,7 @@ top_modules:
  path_filters:
  - "*.preseq"
  - "damageprofiler"
+ - "mapdamage"
  - "mtnucratio"
  - "qualimap"
  - "sexdeterrmine"
@@ -155,6 +157,11 @@ table_columns_visible:
  3 Prime2: False
  mean_readlength: True
  median: True
+ mapDamage:
+ 5 Prime1: True
+ 5 Prime2: True
+ 3 Prime1: False
+ 3 Prime2: False
  mtnucratio:
  mt_nuc_ratio: True
  QualiMap:
@@ -240,10 +247,15 @@ table_columns_placement:
  3 Prime2: 730
  mean_readlength: 740
  median: 750
+ mapDamage:
+ 5 Prime1: 760
+ 5 Prime2: 765
+ 3 Prime1: 770
+ 3 Prime2: 775
  mtnucratio:
- mtreads: 760
- mt_cov_avg: 770
- mt_nuc_ratio: 780
+ mtreads: 780
+ mt_cov_avg: 785
+ mt_nuc_ratio: 790
  QualiMap:
  mapped_reads: 800
  mean_coverage: 805
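
Note on the hunk above: the `mtnucratio` placements move from 760/770/780 to 780/785/790, presumably so that the new `mapDamage` columns (760 to 775) each get their own position in the general statistics table. A minimal Python sketch of that check follows. It hard-codes the values visible in this hunk rather than reading `assets/multiqc_config.yaml`, and `find_collisions` is a hypothetical helper, not part of MultiQC or the pipeline.

```python
from collections import defaultdict

# Placement values copied from the new side of the hunk above;
# only the columns visible in this hunk are listed.
placements = {
    "mapDamage": {"5 Prime1": 760, "5 Prime2": 765, "3 Prime1": 770, "3 Prime2": 775},
    "mtnucratio": {"mtreads": 780, "mt_cov_avg": 785, "mt_nuc_ratio": 790},
    "QualiMap": {"mapped_reads": 800, "mean_coverage": 805},
}


def find_collisions(table):
    """Map each placement value used more than once to the columns that use it."""
    seen = defaultdict(list)
    for module, columns in table.items():
        for column, position in columns.items():
            seen[position].append(f"{module}/{column}")
    return {pos: cols for pos, cols in seen.items() if len(cols) > 1}


# Empty result means every listed column has a distinct placement.
print(find_collisions(placements))
```

Running the same check with the old `mtnucratio` values (760, 770, 780) would report two shared positions with `mapDamage`, which is consistent with the renumbering in this diff.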