Question/suggestion regarding naming of temporary BED files #285

@Vbbab

Hi,

I noticed that when AnnotSV is given VCFs as input, VCFsToBED appears to convert them to temporary BED files named after the current timestamp:

set SV_BEDfile "$g_AnnotSV(outputDir)/[clock format [clock seconds] -format "%Y%m%d-%H%M%S"]_AnnotSV_inputSVfile.bed"

I was wondering whether the naming of this file could be improved slightly, for example by including the PID of the current process. The timestamp only has one-second resolution, so the current scheme can cause file races when multiple instances of AnnotSV (e.g., running as part of batch jobs) write to the same output directory. I was running AnnotSV as part of a larger pipeline using Snakemake and SLURM. Although each job was processing a separate input, the temporary BED files looked as if several processes had written to them at once: they contained corrupted and duplicated entries throughout. bedtools could not process them, and AnnotSV errored out entirely. Most likely, multiple instances of AnnotSV were started within the same second and therefore chose the same temporary filename.
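To make the suggestion concrete, here is a minimal sketch of what a PID-qualified filename could look like. It is only an illustration, not a tested patch: `uniqueBedPath` is a hypothetical helper name that does not exist in AnnotSV, and the output directory in the example is made up. The only commands used are the built-in `clock` and `pid`.

```tcl
# Hypothetical helper (not part of AnnotSV): build a per-process path
# for the temporary BED file.
proc uniqueBedPath {outputDir} {
    # Same timestamp format as the existing line in VCFsToBED.
    set stamp [clock format [clock seconds] -format "%Y%m%d-%H%M%S"]
    # [pid] returns the ID of the current process, so two instances
    # started within the same second on the same host get different names.
    return "$outputDir/${stamp}_[pid]_AnnotSV_inputSVfile.bed"
}

# Example call with a made-up output directory.
puts [uniqueBedPath "/tmp/annotsv_out"]
```

One caveat: PIDs are only unique per host, so jobs scheduled on different nodes that share a filesystem could in principle still collide. Adding `[info hostname]` to the name would likely cover that case as well.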

If you'd like, I can prepare a PR to address this issue. It seems to be a fairly straightforward change, though I don't have much experience with Tcl. I'd appreciate it if you could let me know what you think. Thanks!