Skip to content

Conversation

@nwiltsie
Copy link
Contributor

This PR removes references to UCLA-specific cluster paths, including all NFTest files. The template.config file is now filled with example/* paths to indicate what users are expected to provide.

The config tests now work with config/template.config and the input/*.yaml files.

The big remaining task is documenting the provenance of the intersect_regions file - I've removed the default UCLA path but it's unclear how an external user would re-create the file for themselves.

Copilot AI review requested due to automatic review settings August 12, 2025 18:12
@wiz-inc-8da00b022c
Copy link

wiz-inc-8da00b022c bot commented Aug 12, 2025

Wiz Scan Summary

Scanner Findings
Vulnerability Finding Vulnerabilities -
Data Finding Sensitive Data -
Total -

View scan details in Wiz

To detect these findings earlier in the dev lifecycle, try using Wiz Code VS Code Extension.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR removes UCLA-specific infrastructure dependencies by replacing hardcoded UCLA cluster paths (/hot/*) with generic example paths (/example/*, /path/to/*). The purpose is to make the pipeline more accessible to external users by eliminating UCLA-specific references.

Key changes include:

  • Removal of all NFTest configurations and UCLA-specific test files
  • Update of template configuration to use example paths instead of UCLA cluster paths
  • Conversion of config test files to use generic template configuration

Reviewed Changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated no comments.

Show a summary per file
File Description
test/yaml/* Removes all test YAML files containing UCLA-specific BAM and VCF paths
test/global.config Removes Docker configuration with UCLA-specific user/group settings
test/configtest-*.json Updates config tests to use template.config with example paths
test/config/* Removes all UCLA-specific test configuration files
test/assert_vcf.sh Removes VCF comparison utility script
nftest.yml Removes entire NFTest configuration with UCLA-specific test cases
input/example-*.yaml Updates example input files to use generic placeholder paths
config/template.config Updates template to use example paths instead of empty/UCLA paths

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@sorelfitzgibbon
Copy link
Contributor

The big remaining task is documenting the provenance of the intersect_regions file - I've removed the default UCLA path but it's unclear how an external user would re-create the file for themselves.

We do have a description in the README, I assume this is what you're looking for?

intersect_regions* yes string A bed file listing the genomic regions for variant calling. Excluding decoy regions is HIGHLY recommended *

*Providing intersect_regions is required and will limit the final output to just those regions. All regions of the reference genome could be provided as a bed file with all contigs, however it is HIGHLY recommended to remove decoy contigs from the human reference genome. Including these thousands of small contigs will require the user to increase available memory for Mutect2 and will cause a very long runtime for Strelka2. See #216.

Copy link
Contributor

@sorelfitzgibbon sorelfitzgibbon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me other than a potential issue catching BAM vs VCF input configuration

Comment on lines 139 to -168
"input": {
"normal": [
{
"BAM": "/hot/data/unregistered/SMC-HET/normal/bams/A-mini/n2/output/HG002.N-n2.bam"
}
],
"tumor": [
{
"BAM": "/hot/data/unregistered/SMC-HET/tumours/A-mini/bams/n2/output/S2.T-n2.bam",
"contamination_table": "/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/input/data/A-mini/S2.T-n2_getpileupsummaries_calculatecontamination.table"
}
]
"muse": "/path/to/muse.vcf.gz",
"mutect2": "/path/to/mutect2.vcf.gz",
"somaticsniper": "/path/to/somaticsniper.vcf.gz",
"strelka2": "/path/to/strelka2.vcf.gz"
},
"input_type": "bam",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this missing the potential BAM inputs and catching the potential VCF inputs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change seen here is a result of the test configs being updated above; with the test files removed from the repo, the config that's loaded is config/template.config and the standard input/call-sSNV-template-VCF.yaml

Comment on lines 141 to 154
"input": {
"normal": [
{
"BAM": "/hot/data/unregistered/SMC-HET/normal/bams/A-mini/n2/output/HG002.N-n2.bam"
"BAM": "/path/to/normal.bam"
}
],
"tumor": [
{
"BAM": "/hot/data/unregistered/SMC-HET/tumours/A-mini/bams/n2/output/S2.T-n2.bam",
"contamination_table": "/hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/input/data/A-mini/S2.T-n2_getpileupsummaries_calculatecontamination.table"
"BAM": "/path/to/tumor.bam",
"contamination_table": "/path/to/contamination.table"
}
]
},
"input_type": "bam",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we handle checking configuration for VCF input (one or the other of BAM or VCF type input is allowed)?

@github-actions
Copy link

github-actions bot commented Sep 8, 2025

Bleep bloop, I am a robot.

Alas, some of the Nextflow configuration tests failed!

test/configtest-F32.json

@ ["valid_algorithms"]
- ["somaticsniper","strelka2","mutect2","muse","deepsomatic"]

If the above changes are surprising, stop and determine what happened.

If the above changes are expected, there are two ways to fix this:

  1. Automatically: Post a comment starting with "/fix-tests" (without the quotes) and I will update the tests for you (you must review my work afterwards).
  2. Manually: Follow these steps on Confluence.

@sorelfitzgibbon
Copy link
Contributor

/fix-tests

@github-actions
Copy link

github-actions bot commented Sep 8, 2025

Bleep bloop, I am a robot.

I have updated all of the failing tests for you with 3505e56. You must review my work before merging this pull request!

intersect_regions = '/hot/resource/tool-specific-input/pipeline-call-sSNV-6.0.0/GRCh38-BI-20160721/Homo_sapiens_assembly38_no-decoy.bed.gz'
output_dir = ''
dataset_id = ''
algorithm = ['somaticsniper'] // 'somaticsniper', 'strelka2', 'mutect2', 'muse', 'deepsomatic'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment: Minor point, but do we want somaticsniper to be the default in the template?

@github-actions
Copy link

github-actions bot commented Sep 8, 2025

Bleep bloop, I am a robot.

Alas, some of the Nextflow configuration tests failed!

test/configtest-F16.json

@ ["params","algorithm",["set"],{}]
+ "mutect2"
+ "muse"
+ "strelka2"

test/configtest-F32.json

@ ["params","algorithm",["set"],{}]
+ "mutect2"
+ "muse"
+ "strelka2"

If the above changes are surprising, stop and determine what happened.

If the above changes are expected, there are two ways to fix this:

  1. Automatically: Post a comment starting with "/fix-tests" (without the quotes) and I will update the tests for you (you must review my work afterwards).
  2. Manually: Follow these steps on Confluence.

@sorelfitzgibbon
Copy link
Contributor

/fix-tests

@github-actions
Copy link

github-actions bot commented Sep 8, 2025

Bleep bloop, I am a robot.

I have updated all of the failing tests for you with 0012faf. You must review my work before merging this pull request!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants