Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aligning basecalled bam file to genome #1244

Open
baibhav-bioinfo opened this issue Feb 5, 2025 · 7 comments
Open

aligning basecalled bam file to genome #1244

baibhav-bioinfo opened this issue Feb 5, 2025 · 7 comments

Comments

@baibhav-bioinfo
Copy link

hello,
as there is already an option in Dorado to align the bascalled bam file to genome using minimap2. But as i can see there are limited number of options available in customising the commands of alignment.

So, if i align the reads separately using minimap2 and then use the aligned bam for subsequent analysis in modkit, will that be okay?

@sklages
Copy link

sklages commented Feb 6, 2025

Yes, of course. That's what we do here from the very beginning for better resource and parameter control .. make sure that you pass over all (relevant) tags from the input BAM ($uBAM) to fastq in order to have methylation information in final alignment results, e.g.:

samtools fastq \
    --threads 8 \
    -T "*" \
    $uBAM \
| minimap2 \
    --secondary=yes \
    -2 \
    -a \
    -y \
    -t 32 \
    -K 2G \
    -x map-ont \
    $FA_REF \
    - \
| samtools sort \
    -m 2G \
    --threads 8 \
    -O BAM \
    -o $BAM_ALN \
    --write-index \
    --reference $FA_REF \
    -

Then feed $BAM_ALN into modkit ...

@baibhav-bioinfo
Copy link
Author

thanks,
but for genome alignment of DRS reads with minimap2, I think the best way is "-x splice" rather than "-x map-ont"

(1) I changed the "ubam" into fastq by preserving the mod tags using "-T" argument
(2) But i lost the mod tags in the sam file after minimap alignment.

how to preserve the tags after genome alignment?

@sklages
Copy link

sklages commented Feb 7, 2025

thanks, but for genome alignment of DRS reads with minimap2, I think the best way is "-x splice" rather than "-x map-ont"

Sure. You didn't mention DRS :-)

(1) I changed the "ubam" into fastq by preserving the mod tags using "-T" argument (2) But i lost the mod tags in the sam file after minimap alignment.

samtools fastq <..> reads a BAM file and converts it into a fastq (formatted stream). It preserves tags during this process by providing the flag -T <tag>. Here we preserve all tags by using *. This needs at least samtools 1.16.

You wanted to align a basecalled BAM. What is "fastq" here?

how to preserve the tags after genome alignment?

How does your actual command (pipe) look like?

@baibhav-bioinfo
Copy link
Author

baibhav-bioinfo commented Feb 7, 2025

the fastq is what we get after converting the basecalled bam into fastq using samtools.
this is done because what I found out is minimap2 can not use bam files as input in alignment. Am i correct?

@sklages
Copy link

sklages commented Feb 8, 2025

That's what the above command is actually doing. We do not store (intermediate) fastq files, so we just read the BAMs once, convert them on-the-fly to fastq and pipe the stream directly to minimap2. There is no need for minimap2 to read to BAM data directly, since reading BAM and conversion with samtools is just as fast as it would be for minimap2 to read the BAM.

But if you already have fastq data and you want to store them, then just take a quick look at the fastq header to see, if the methylation data (MM,ML tags) have been passed over to fastq (-T flag).

If the fastq headers are okay, then make sure you have used -y when mapping your fastq data files with minimap2 (Copy input FASTA/Q comments to output.).

@baibhav-bioinfo
Copy link
Author

okay,
Yes i have all the tags in the fastq headers, but I missed the -y command in minimap2
Thanks, i will try again.

@sklages
Copy link

sklages commented Feb 11, 2025

@baibhav-bioinfo - did it work for you?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants