Add nf-test #1063


Open
wants to merge 75 commits into base: dev

Commits
f7fceaf
Add general structure, and generate (not working!) snapshot for basic…
jfy133 Apr 26, 2024
0778e58
Try adding an ignore zip system, doesn't work atm though...
jfy133 Apr 26, 2024
eb73ae6
Merge branch 'dev' into nf-test-conversionMerge in from dev
jfy133 Jun 28, 2024
cc114f3
Continue customisation
jfy133 Jun 28, 2024
37994df
More debugging
jfy133 Jul 12, 2024
e07dc06
Get working function
jfy133 Jul 19, 2024
c0f93d9
Get working function properly, and start integrating into snapshot: p…
jfy133 Jul 19, 2024
de035ca
Extract names properly
jfy133 Jul 19, 2024
7890575
Remove unnecesary code
jfy133 Jul 19, 2024
b795b9e
Get working string check function!
jfy133 Jul 19, 2024
cac8152
TODO
jfy133 Jul 19, 2024
a8dcd64
Update MQC so works on gitpod, try to re-add original getAllFilesFrom…
jfy133 Jul 20, 2024
56cba0c
And get it back again
jfy133 Jul 20, 2024
0f12b21
Sort output to ensure consistency
jfy133 Jul 22, 2024
d41aad9
Sort file name only funciton
jfy133 Jul 22, 2024
61c4d24
Start testing preprocessing
jfy133 Jul 26, 2024
19db88c
Merge branch 'dev' into nf-test-conversion
jfy133 Jul 26, 2024
bcc3baf
Start adding preprocessing dir but diff not working so can't ser vari…
jfy133 Jul 26, 2024
0b58cf7
Backing up TODO notes
jfy133 Jul 26, 2024
eb4824e
Final snapshots, need to double check all files are covered but I thi…
jfy133 Aug 16, 2024
9ce930f
Finalise first test
jfy133 Sep 20, 2024
5598eec
Merge branch 'dev' into nf-test-conversion
jfy133 Sep 20, 2024
6362fa3
Update tests to latest dev
jfy133 Sep 20, 2024
61086b5
Merge branch 'dev' into nf-test-conversion
jfy133 Oct 4, 2024
9a56d48
Merge branch 'dev' into nf-test-conversion
jfy133 Oct 4, 2024
bb12ca1
Start refactoring to use nft-utils
jfy133 Oct 4, 2024
87fafc5
fix qualimap output dir typo
TCLamnidis Oct 11, 2024
ef71d3b
no sampleid in qualimap outdir
TCLamnidis Oct 11, 2024
a7fadc6
bump nft-utils version
TCLamnidis Oct 18, 2024
4baefaf
start refactoring to use nft-utils
TCLamnidis Oct 18, 2024
8b7e950
update snapshot. Not sure it works yet.
TCLamnidis Oct 18, 2024
2e5bf4f
add preprocessing
TCLamnidis Nov 8, 2024
c204fe9
fix preprocessing checks
TCLamnidis Nov 29, 2024
4efc86b
Merge remote-tracking branch 'origin/dev' into nf-test-conversion
TCLamnidis Nov 30, 2024
9f3611f
Merge remote-tracking branch 'origin/dev' into nf-test-conversion
TCLamnidis Dec 3, 2024
6893eb6
reorder and add todos
TCLamnidis Dec 3, 2024
13a640c
remove versions.ymls from snapshot
TCLamnidis Dec 3, 2024
5241ffc
check bam names instead of content
TCLamnidis Dec 4, 2024
0dab178
add ave_filtered_bam parameter to test profile
TCLamnidis Dec 4, 2024
01850ad
fix result pickup for bam filtering and deduplication
TCLamnidis Dec 4, 2024
42b3a47
add final_bams,mapstats. fix mapping. remove old code
TCLamnidis Dec 6, 2024
0c6c8e3
simplify bam_input_stats snapshot
TCLamnidis Dec 6, 2024
50ea5e1
update snapshot
TCLamnidis Dec 6, 2024
997fb5b
exclude unstable qualimap results.
TCLamnidis Dec 6, 2024
82cb639
Check existence of Multiqc output files. reorder tests to match alpha…
TCLamnidis Dec 11, 2024
61f6285
Merge branch 'dev' into nf-test-conversion
TCLamnidis Dec 15, 2024
c0cd1eb
remove leftover todo
TCLamnidis Dec 20, 2024
53d82b0
Merge branch 'nf-test-conversion' of github.com:nf-core/eager into nf…
TCLamnidis Dec 20, 2024
5ed2b0b
add command legend, and align a bit more
TCLamnidis Dec 20, 2024
80007d7
Fix test. and snapshot.
TCLamnidis Jan 24, 2025
21a3462
Merge branch 'dsl2-restructure-output' into nf-test-conversion
TCLamnidis Feb 21, 2025
024433e
bump nft-bam version
TCLamnidis Feb 21, 2025
3091e4a
Start over with new output directory structure
TCLamnidis Feb 21, 2025
8627928
bump nft-bams to latest
TCLamnidis Feb 21, 2025
64106a6
remove premature snapshot
TCLamnidis Feb 28, 2025
44cda40
WIP test reimplementation
TCLamnidis Feb 28, 2025
466e999
add all remaining sections. tests still fail.
TCLamnidis Mar 7, 2025
6e17b72
Merge branch 'dev' into nf-test-conversion
TCLamnidis Mar 7, 2025
757eae4
Slightly improved docs phrasing and structure
jfy133 Mar 14, 2025
ddfd514
Add some TODOs
jfy133 Mar 14, 2025
70a1bcf
Add ebugging prints
jfy133 Mar 21, 2025
4e4dacb
Fix var name
jfy133 Mar 21, 2025
a71ab62
Update test snapshot to exclude directories and contained files
jfy133 Mar 21, 2025
07e700b
Merge branch 'dev' into nf-test-conversion
jfy133 Mar 25, 2025
2da70df
Add missing bam and flagstat files
jfy133 Mar 25, 2025
b851b79
remove unused nft-bam plugin
TCLamnidis Apr 11, 2025
fe911fe
cleanup unused code and debug output
TCLamnidis Apr 11, 2025
f647346
Merge branch 'dev' into nf-test-conversion
TCLamnidis Apr 11, 2025
8b87f81
Merge branch 'dev' into nf-test-conversion
TCLamnidis May 2, 2025
9042f8d
add bamfiltering_savefilteredbams again
TCLamnidis May 2, 2025
4735f76
remve duplicate Qualimap config
TCLamnidis May 2, 2025
edcad86
update test. add genotyping and metagenomics
TCLamnidis May 2, 2025
744f415
update snapshot
TCLamnidis May 2, 2025
76d6af4
exclude unstable files from md5sum
TCLamnidis May 2, 2025
64171ed
Stricter checking of VCF file checksums using nft-vcf.
TCLamnidis May 2, 2025
24 changes: 21 additions & 3 deletions .github/workflows/ci.yml
@@ -1,10 +1,13 @@
name: nf-core CI
# This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors
name: nf-core CI
on:
push:
branches:
- dev
- "dev"
pull_request:
branches:
- "dev"
- "master"
release:
types: [published]
workflow_dispatch:
@@ -15,16 +18,31 @@ env:
NXF_SINGULARITY_LIBRARYDIR: ${{ github.workspace }}/.singularity

concurrency:
group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}"
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

jobs:
define_nxf_versions:
name: Choose nextflow versions to test against depending on target branch
runs-on: ubuntu-latest
outputs:
matrix: ${{ steps.nxf_versions.outputs.matrix }}
steps:
- id: nxf_versions
run: |
if [[ "${{ github.event_name }}" == "pull_request" && "${{ github.base_ref }}" == "dev" && "${{ matrix.NXF_VER }}" != "latest-everything" ]]; then
echo matrix='["latest-everything"]' | tee -a $GITHUB_OUTPUT
else
echo matrix='["latest-everything", "23.10.0"]' | tee -a $GITHUB_OUTPUT
fi

test:
name: "Run pipeline with test data (${{ matrix.NXF_VER }} | ${{ matrix.test_name }} | ${{ matrix.profile }})"
# Only run on push if this is the nf-core dev branch (merged PRs)
if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-core/eager') }}"
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
NXF_VER:
- "24.04.2"
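The branch-dependent version selection in the `define_nxf_versions` step can be exercised outside GitHub Actions. The following is a minimal sketch in POSIX shell, assuming the event name and base branch are passed as plain arguments instead of `${{ github.* }}` expressions. `choose_nxf_matrix` is a hypothetical helper name and is not part of the workflow. The `matrix.NXF_VER` comparison is left out, since as far as this diff shows the job declares no matrix of its own.

```shell
# Hypothetical helper mirroring the if/else of the define_nxf_versions step.
# $1: event name (e.g. pull_request, push, release)
# $2: base branch of the pull request (may be empty for other events)
choose_nxf_matrix() {
    event_name="$1"
    base_ref="${2:-}"
    if [ "$event_name" = "pull_request" ] && [ "$base_ref" = "dev" ]; then
        # Pull requests against dev are tested on the newest Nextflow only
        echo '["latest-everything"]'
    else
        # Everything else is additionally tested on a pinned older version
        echo '["latest-everything", "23.10.0"]'
    fi
}

# The workflow appends the result to $GITHUB_OUTPUT; here it is just printed.
echo "matrix=$(choose_nxf_matrix pull_request dev)"
echo "matrix=$(choose_nxf_matrix pull_request master)"
```

Keeping the decision in one small function makes it easy to check locally which Nextflow versions a given event would trigger before pushing workflow changes.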
1 change: 1 addition & 0 deletions .gitignore
@@ -8,3 +8,4 @@ testing*
*.pyc
null/
.nf-test*

13 changes: 0 additions & 13 deletions conf/modules.config
@@ -1067,19 +1067,6 @@ process {
ext.args = { "--profiler ${meta.profiler} --output ${meta.profiler}taxpasta_table.tsv" }
}

//
// QUALIMAP
//

withName: 'QUALIMAP_BAMQC_WITHBED|QUALIMAP_BAMQC_NOBED' {
tag = { "${meta.reference}|${meta.sample_id}" }
publishDir = [
path: { "${params.outdir}/mapstats/qualimap/${meta.reference}/" },
mode: params.publish_dir_mode,
saveAs: { filename -> filename.equals('versions.yml') ? null : filename },
]
}

//
// DAMAGE CALCULATION
//
1 change: 1 addition & 0 deletions conf/test.config
@@ -44,6 +44,7 @@ params {
bamfiltering_minreadlength = 30
bamfiltering_mappingquality = 37
deduplication_tool = 'markduplicates'
bamfiltering_savefilteredbams = true

// PreSeq
mapstats_preseq_mode = 'c_curve'
14 changes: 14 additions & 0 deletions nf-test.config
@@ -0,0 +1,14 @@
config {

testsDir "tests"
workDir ".nf-test"
configFile "tests/nextflow.config"
profile ""

// load the necessary plugins
plugins {
load "[email protected]"
load "[email protected]"
}

}
5 changes: 5 additions & 0 deletions tests/nextflow.config
@@ -0,0 +1,5 @@
/*
========================================================================================
Nextflow config file for running tests
========================================================================================
*/
142 changes: 142 additions & 0 deletions tests/test.nf.test
@@ -0,0 +1,142 @@
nextflow_pipeline {

name "Test pipeline: NFCORE_EAGER"
script "main.nf"
tag "pipeline"
tag "nfcore_eager"
tag "test"

test("test_profile") {

when {
params {
outdir = "$outputDir"
}
}

then {

///////////////////
// DOCUMENTATION //
///////////////////

// The contents of each top-level results directory should be tested with an individually named snapshot.
// Within each snapshot, there should be two or three distinct variables that contain the files to be tested.
// - stable_name_<dir> is for files with variable md5sums (i.e. content) so only names will be compared
// - stable_content_<dir> is for files with stable md5sums (i.e. content) so md5sums will be compared
// - bams_<dir> is for BAM files, where the headerMD5 is checked for stability (since the content can be unstable)
// If a directory is fully stable, you can drop `stable_name_*`
// If a directory contains no BAMs, you can drop `bams_*`

// Generate with: nf-test test --tag test --profile docker,test --update-snapshot
// Test with: nf-test test --tag test --profile docker,test
// NOTE: BAMs are only ever stable in name, because:
// a) sharding breaks the header, since the shard that was first is named in the header (fixed in https://github.com/nf-core/eager/pull/1112)
// b) the order of the reads in the BAMs is not stable (sorted, but reads that share a start position can be in any order).
// Point b) also causes BAIs to be unstable.
// c) merging of multiple BAMs with duplicate @RG / @PG tags can cause the header to be unstable (particularly in the case of shards/lanes)

//////////////////////
// DEFINE VARIABLES //
//////////////////////

// Define exclusion patterns for files with unstable contents
// NOTE: When a section needs more than a couple of small patterns, consider adding a variable to store the patterns here
// This is particularly important if the patterns excluded in the stable content section should be included in the stable name section
def unstable_patterns_auth = [
'**/mapped_reads_gc-content_distribution.txt',
'**/genome_gc_content_per_window.png',
'**/*.{svg,pdf,html}',
'**/DamageProfiler.log',
]

// Check that no files are missing/added
// Command legend: Result directory to index , includeDir: include dirs?, ignore: exclude patterns , ignoreFile: exclude pattern list , include: include patterns
def stable_name_all = getAllFilesFromDir("$outputDir/" , includeDir: false , ignore: ['pipeline_info/*'] , ignoreFile: null , include: ['*', '**/*'] )

// Authentication
def stable_content_authentication = getAllFilesFromDir("$outputDir/authentication" , includeDir: false , ignore: unstable_patterns_auth , ignoreFile: null , include: ['*', '**/*'] )
def stable_name_authentication = getAllFilesFromDir("$outputDir/authentication" , includeDir: false , ignore: null , ignoreFile: null , include: unstable_patterns_auth)

// Deduplication - TODO -> snapshot both lists are empty!?
def stable_content_deduplication = getAllFilesFromDir("$outputDir/deduplication" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.flagstat'] )
def stable_name_deduplication = getAllFilesFromDir("$outputDir/deduplication" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.{bam,bai}'] )

// Final_bams
def stable_content_final_bams = getAllFilesFromDir("$outputDir/final_bams" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.flagstat'] )
def stable_name_final_bams = getAllFilesFromDir("$outputDir/final_bams" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.{bam,bai}'] )

// Mapping (incl. bam_input flagstat)
def stable_content_mapping = getAllFilesFromDir("$outputDir/mapping" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.flagstat'] )
def stable_name_mapping = getAllFilesFromDir("$outputDir/mapping" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.{bam,bai}'] )

// Preprocessing
// NOTE: FastQC html appears stable, but I worry it might just include a day timestamp instead of a full timestamp. To keep the expression simpler I removed both from checksum testing.
def stable_content_preprocessing = getAllFilesFromDir("$outputDir/preprocessing" , includeDir: false , ignore: ['**/*.{zip,log,html}'], ignoreFile: null , include: ['**/*'] )
def stable_name_preprocessing = getAllFilesFromDir("$outputDir/preprocessing" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.{zip,log,html}'] )

// Read filtering
def stable_content_readfiltering = getAllFilesFromDir("$outputDir/read_filtering" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.flagstat'] )
def stable_name_readfiltering = getAllFilesFromDir("$outputDir/read_filtering" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.{bam,bai}'] )

// Genotyping
def stable_content_genotyping = getAllFilesFromDir("$outputDir/genotyping" , includeDir: false , ignore: ['**/*.{tbi,vcf.gz}'] , ignoreFile: null , include: ['**/*'] )
def stable_name_genotyping = getAllFilesFromDir("$outputDir/genotyping" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.tbi'] )
// We need to collect the VCFs separately to run more specific md5sum checks on the header (contents are unstable for the same reasons as BAMs, explained above).
def genotyping_vcfs = getAllFilesFromDir("$outputDir/genotyping" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.vcf.gz'] )

// Metagenomics
def stable_content_metagenomics = getAllFilesFromDir("$outputDir/metagenomics" , includeDir: false , ignore: ['**/*.biom'] , ignoreFile: null , include: ['**/*'] )
def stable_name_metagenomics = getAllFilesFromDir("$outputDir/metagenomics" , includeDir: false , ignore: null , ignoreFile: null , include: ['**/*.biom'] )

// MultiQC
def stable_name_multiqc = getAllFilesFromDir("$outputDir/multiqc" , includeDir: false , ignore: null , ignoreFile: null , include: ['*', '**/*'] )

///////////////////////
// DEFINE ASSERTIONS //
///////////////////////

assertAll(
{ assert workflow.success },
// This checks that there are no missing or additional output files.
// Also a good starting point to look at all the files in the output folder that need to be checked in subsequent sections.
{ assert snapshot( stable_name_all*.name ).match("all_files") },

// Checking changes to contents of each section
// NOTE: Keep the order of the sections in the alphanumeric order of the output directories.
// Each section should first check stable_content, stable_name second (if applicable).
{ assert snapshot( stable_content_authentication , stable_name_authentication*.name ).match("authentication") },
{ assert snapshot( stable_content_deduplication , stable_name_deduplication*.name ).match("deduplication") },
{ assert snapshot( stable_content_final_bams , stable_name_final_bams*.name ).match("final_bams") },
// NOTE: The snapshot section for mapping cannot be named 'mapping'. See https://github.com/askimed/nf-test/issues/279
{ assert snapshot( stable_content_mapping , stable_name_mapping*.name ).match("mapping_output") },
{ assert snapshot( stable_content_preprocessing , stable_name_preprocessing*.name ).match("preprocessing") },
{ assert snapshot( stable_content_readfiltering , stable_name_readfiltering*.name ).match("read_filtering") },
{ assert snapshot( stable_content_genotyping , stable_name_genotyping*.name ).match("genotyping") },
// Additional content checks on the genotyping VCFs. Specifically, a single md5sum over the FORMAT, INFO, FILTER, ID, sample and contig header lines.
{ assert snapshot(
genotyping_vcfs.collect {
file ->
def vcf_head = path(file.toString()).vcf.header
// The header contains lines in the "OTHER" category, which contain a timestamp, so we need to filter those out, then calculate md5sums.
def header_md5 = [
vcf_head.getFormatHeaderLines().toString(),
vcf_head.getInfoHeaderLines().toString(),
vcf_head.getFilterLines().toString(),
vcf_head.getIDHeaderLines().toString(),
vcf_head.getGenotypeSamples().toString(),
vcf_head.getContigLines().toString(),
].join(' ').md5()
file.getName() + ":header_md5," + header_md5
}
).match("genotyping_vcfs")},
{ assert snapshot( stable_content_metagenomics , stable_name_metagenomics*.name ).match("metagenomics") },
{ assert snapshot( stable_name_multiqc*.name ).match("multiqc") },

// Versions
{ assert new File("$outputDir/pipeline_info/nf_core_eager_software_mqc_versions.yml").exists() },

)
}
}
}
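The split between `stable_content_*` and `stable_name_*` described in the test's documentation block can be illustrated outside nf-test. The following is a minimal sketch in POSIX shell, assuming `md5sum` is available. `snapshot_dir` is a hypothetical helper and is not how nft-utils implements `getAllFilesFromDir`. Files whose path matches the name-only pattern are listed by name, all others by name plus checksum, in a `name:md5,<hash>` form chosen here for illustration.

```shell
# snapshot_dir DIR NAME_ONLY_REGEX   (hypothetical helper)
# Prints one line per file under DIR, sorted by path:
#   - path matches NAME_ONLY_REGEX: file name only (content is unstable)
#   - otherwise: file name plus md5sum of the content
snapshot_dir() {
    dir="$1"
    name_only="$2"
    find "$dir" -type f | LC_ALL=C sort | while IFS= read -r f; do
        if printf '%s\n' "$f" | grep -Eq "$name_only"; then
            printf '%s\n' "$(basename "$f")"
        else
            printf '%s:md5,%s\n' "$(basename "$f")" "$(md5sum < "$f" | cut -d' ' -f1)"
        fi
    done
}

# Demo on a throwaway directory with one stable and one unstable file
demo=$(mktemp -d)
mkdir -p "$demo/sample1"
printf 'mapped\t10\n' > "$demo/sample1/a.flagstat"
date > "$demo/sample1/a.log"    # content differs between runs
snapshot_dir "$demo" '\.(log|html|zip)$'
```

Rerunning the demo changes the content of `a.log` but not the printed listing, which is the property the name-only half of each snapshot relies on.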
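The idea behind the `genotyping_vcfs` assertion is to hash only the header lines that do not carry a timestamp. It can be approximated with standard tools. This sketch swaps in `gzip`, `grep` and `md5sum` for the nft-vcf header accessors used in the test, so it is an analogy and not the plugin's behaviour. `vcf_header_md5` and `make_vcf` are hypothetical helpers, and the selection of line types (FORMAT, INFO, FILTER, contig, plus the `#CHROM` line holding the sample names) is an assumption about which lines are stable.

```shell
# vcf_header_md5 FILE.vcf.gz   (hypothetical helper)
# Hashes only the structured header lines and the #CHROM line; other "##"
# lines, which may contain dates or command lines, are ignored.
vcf_header_md5() {
    gzip -cd "$1" \
        | grep -E '^(##(FORMAT|INFO|FILTER|contig)=|#CHROM)' \
        | md5sum | cut -d' ' -f1
}

# make_vcf OUT DATE: writes a tiny header-only VCF with the given fileDate
make_vcf() {
    {
        echo '##fileformat=VCFv4.2'
        echo "##fileDate=$2"
        echo '##contig=<ID=chr1,length=1000>'
        echo '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">'
        echo '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">'
        printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n'
    } | gzip -c > "$1"
}

demo=$(mktemp -d)
make_vcf "$demo/run1.vcf.gz" 20250501
make_vcf "$demo/run2.vcf.gz" 20250502
# Both lines print the same hash although the fileDate differs
echo "run1.vcf.gz:header_md5,$(vcf_header_md5 "$demo/run1.vcf.gz")"
echo "run2.vcf.gz:header_md5,$(vcf_header_md5 "$demo/run2.vcf.gz")"
```

A checksum built this way still changes when a FORMAT or INFO definition, a contig or a sample name changes, so it stays sensitive to real header differences while ignoring the run date.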