Description
All SCOUT scripts use hardcoded absolute paths that are tied to the machine used to produce the data, i.e., Orfeo.
For instance, the SPN01 simulate_mutations.R script contains the following lines.
outdir <- "/orfeo/scratch/cdslab/shared/SCOUT/SPN01/races/"
forest <- load_samples_forest(paste0(outdir,"sample_forest.sff"))
setwd("/orfeo/cephfs/scratch/cdslab/shared/ProCESS/GRCh38")
m_engine <- MutationEngine(setup_code = "GRCh38",tumour_type = "COAD",
tumour_study = "US")
This approach works on Orfeo, but it does not generalize: it prevents using the SCOUT scripts on other machines or with other directory layouts. This is a major issue because data reproducibility is one of the main goals of SCOUT.
I suggest removing the absolute paths and accessing only subdirectories of the working directory. For instance, the above lines would become
outdir <- "output"
forest <- load_samples_forest(file.path(outdir, "sample_forest.sff"))
m_engine <- MutationEngine(setup_code = "GRCh38",tumour_type = "COAD",
tumour_study = "US")
If using subdirectories is not optimal, for instance because you want to share the same mutation engine directory, you can either
- define the subdirectories "output" and "GRCh38" as symbolic links by executing the following commands:
ln -s /orfeo/scratch/cdslab/shared/SCOUT/SPN01/races/ output
ln -s /orfeo/cephfs/scratch/cdslab/shared/ProCESS/GRCh38/GRCh38 GRCh38
- add two parameters to the SCOUT scripts to explicitly get the output and mutation engine directories. In this case, the original code snippet could become:
args <- commandArgs(trailingOnly = TRUE)
if (length(args) != 2) {
  # Recover the script name from the full argument list for the usage message
  args <- commandArgs(trailingOnly = FALSE)
  script_path <- sub("^--file=", "", args[grep("^--file=", args)])
  stop(paste("Syntax error: Rscript", basename(script_path),
             "<output_dir> <mutation_engine_directory>"))
}
outdir <- args[1]
mu_dir <- args[2]
forest <- load_samples_forest(file.path(outdir, "sample_forest.sff"))
setwd(mu_dir)
m_engine <- MutationEngine(setup_code = "GRCh38",tumour_type = "COAD",
tumour_study = "US")
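The two options above could also be combined, so that a script falls back to the working directory when no arguments are given. The following is only a sketch: `resolve_dirs()` and its default values are hypothetical and not part of SCOUT.

```r
# Hypothetical helper (not part of SCOUT): take the output and mutation
# engine directories from the command-line arguments, falling back to
# defaults relative to the working directory when they are omitted.
# The defaults "output" and "." are assumptions made for illustration.
resolve_dirs <- function(args,
                         default_outdir = "output",
                         default_mu_dir = ".") {
  if (length(args) > 2) {
    stop("Expected at most two arguments: [<output_dir>] [<mutation_engine_directory>]")
  }
  outdir <- if (length(args) >= 1) args[1] else default_outdir
  mu_dir <- if (length(args) >= 2) args[2] else default_mu_dir
  list(outdir = outdir, mu_dir = mu_dir)
}

# No arguments: use the defaults
dirs_default <- resolve_dirs(character(0))

# Both directories given explicitly
dirs_explicit <- resolve_dirs(c("/some/output/dir", "/some/engine/dir"))
forest_path <- file.path(dirs_explicit$outdir, "sample_forest.sff")
```

In a script, the argument vector would come from `commandArgs(trailingOnly = TRUE)`. Note that `setwd(mu_dir)` changes how relative paths resolve, so a relative `outdir` should either be used before the call to `setwd()` or be converted to an absolute path first, for instance with `normalizePath()` on an existing directory.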
Using paste0() to join paths should also be deprecated in favor of file.path().
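A small illustration of why this matters, using the file name from the snippet above: `paste0()` concatenates its arguments verbatim, so the result is only correct when the directory string happens to end with a slash, whereas `file.path()` inserts the separator itself.

```r
# Directory given without a trailing slash, as a user might type it
outdir <- "output"

# paste0() glues the strings together with no separator
joined_paste <- paste0(outdir, "sample_forest.sff")
# joined_paste is "outputsample_forest.sff", which is not the intended path

# file.path() inserts "/" between the components
joined_file_path <- file.path(outdir, "sample_forest.sff")
# joined_file_path is "output/sample_forest.sff"
```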