Description
Is your feature request related to a problem? Please describe
I am working on in planta infection RNA-seq assays for the lentil-Ascochyta pathosystem. The pathogen side of things works perfectly with the pipeline as it stands (albeit with a few hiccups in the cleanup of large intermediate files, as mentioned in a separate issue report). However, the analysis cannot get past samtools indexing when the lentil genome is used as the reference, presumably because the lentil chromosomes are longer than the default BAI index can address (about 512 Mbp per sequence).
Describe the solution you'd like
I am not entirely sure how best to handle this (perhaps a try/catch fallback), but it would help if the tool could check the chromosome lengths and send the BAM files to the appropriate samtools indexing mode, generating either the default BAI index (*.bai) or a CSI index (*.csi) for larger genomes.
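A rough sketch of what such a length check might look like in plain bash, assuming the limit that matters is the BAI ceiling of 2^29 bp per reference sequence. The names `BAI_MAX_LEN`, `needs_csi` and `index_bam` are made up for illustration and are not part of the pipeline; only `samtools view -H` and `samtools index [-c]` are real samtools calls.

```shell
#!/usr/bin/env bash
# Assumption: BAI cannot address coordinates beyond 2^29 bp (512 Mbp).
BAI_MAX_LEN=536870912

# Hypothetical helper: reads SAM header text on stdin and succeeds (exit 0)
# if any @SQ record declares a length (LN tag) above the BAI limit.
needs_csi() {
    awk -F'\t' -v max="$BAI_MAX_LEN" '
        $1 == "@SQ" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^LN:/ && substr($i, 4) + 0 > max) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Hypothetical wrapper: build a CSI index only when the header requires it.
index_bam() {
    local bam="$1"
    if samtools view -H "$bam" | needs_csi; then
        samtools index -c "$bam"   # writes <bam>.csi
    else
        samtools index "$bam"      # writes <bam>.bai
    fi
}
```

Inside a Nextflow script block the awk `$` signs would need escaping (`\$1`, `\$i`), and the process output declaration would have to accept either index extension, so this is only a starting point.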
Describe alternatives you've considered
I have modified the "samtools_index" process as shown below, adding the -c flag so that a CSI index is generated for this run.
process samtools_index {
  publishDir "${params.outdir}/Samples/${sample_id}", mode: params.publish_dir_mode, pattern: publish_pattern_samtools_index
  tag { sample_id }
  label "samtools"

  input:
    set val(sample_id), file(bam_file) from SORTED_FOR_INDEX

  output:
    set val(sample_id), file(bam_file) into BAM_INDEXED_FOR_STRINGTIE
    // CSI index replaces the default BAI output for this run
    set val(sample_id), file("*.bam.csi") into BAI_INDEXED_FILE
    // set val(sample_id), file("*.bam.bai") into BAI_INDEXED_FILE
    set val(sample_id), file("*.bam.log") into BAM_INDEXED_LOG

  script:
    """
    echo "#TRACE sample_id=${sample_id}"
    echo "#TRACE bam_bytes=`stat -Lc '%s' *.bam`"
    # Original call (BAI index): samtools index ${bam_file}
    # Note: comments inside the script block must use '#', since '//' is passed to bash
    samtools index -c ${bam_file}
    samtools stats ${bam_file} > ${sample_id}.bam.log
    """
}