The lowest GPU memory required for dorado correct? #1100

Open
hrluo93 opened this issue Oct 24, 2024 · 3 comments
Labels
read_correction Read error correction

Comments


hrluo93 commented Oct 24, 2024

Issue Report

I tried running dorado correct on CPU from a prepared PAF file, but it was extremely slow: about 1 GB of corrected reads in 4 hours.

I would like to know the lowest GPU memory required. Is 16 or 20 GB enough for correction? GPUs with more than 20 GB of memory are quite expensive in my country.

Best wishes!

Run environment:

  • Hardware (CPUs, Memory, GPUs): Intel Xeon 8336C, 512 GB RAM, no GPU
  • Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): raw reads mapped to a PAF file
  • Source data location (on device or networked drive - NFS, etc.): local HDD
  • Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): ~40 GB raw UL reads, ~120 GB PAF
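
For reference, here is a minimal sketch of the split workflow I described above, with placeholder filenames (reads.fastq, overlaps.paf); recent dorado releases document --to-paf/--from-paf for separating the mapping and inference stages:

# Stage 1: all-vs-all overlap mapping, writing a PAF instead of correcting
dorado correct reads.fastq --to-paf > overlaps.paf
# Stage 2: inference-only correction from the prepared PAF
dorado correct reads.fastq --from-paf overlaps.paf > corrected_reads.fasta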
@HalfPhoton
Collaborator

Hi @hrluo93,

Unfortunately, we don't have a better estimate for the minimum GPU requirement.

As you've read, we recommend GPUs with high VRAM for Dorado Correct, as it's a computationally intensive task. If you're running out of memory, you can reduce VRAM usage by lowering the --batch-size argument during inference, but that will only go so far:

dorado correct reads.fastq --batch-size <number> > corrected_reads.fasta
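
For example, to force a smaller batch on a single GPU (illustrative values only; tune --batch-size to your card's VRAM):

# Run correction on GPU 0 with a reduced batch size
dorado correct reads.fastq -x cuda:0 --batch-size 16 > corrected_reads.fasta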


hrluo93 commented Oct 24, 2024


Thanks for your reply. I am going to test whether a 16 GB GPU is possible and will give feedback!

Thanks again!

@iiSeymour iiSeymour added the read_correction Read error correction label Oct 29, 2024
@hrluo93
Copy link
Author

hrluo93 commented Nov 10, 2024

I have tested on an RTX 3090 GPU with Dorado 0.8.2, on one cell: ~80 GB FASTQ (~40 GB raw UL FASTA).

  1. Running dorado correct -x cuda:0 --infer-threads 1 -b 32:
     [info] Using batch size 32 on device in inference thread 0.
  2. Running dorado correct -x cuda:0:
     [2024-11-10 17:21:17.560] [info] Using batch size 8 on device in inference thread 0.
     [2024-11-10 17:21:17.560] [info] Using batch size 8 on device cuda:0 in inference thread 1.

With both commands, the minimum GPU memory required was at least 22 GB.

If I combine all cells into one FASTQ and PAF (e.g., 3 cells, ~120 GB raw UL FASTA), will dorado use more GPU memory?
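
Side note: to reproduce these memory numbers, a standard way to watch GPU usage during a run is to poll nvidia-smi once per second:

# Report used GPU memory in CSV, refreshing every 1 second
nvidia-smi --query-gpu=memory.used --format=csv -l 1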

Best wishes!
