I aligned reads to a hifiasm reference (one haplotype only) using dorado aligner (v0.9.1) and wrote the output to a directory so that the index and summary files were created.
These files were then transferred to a different cluster (one with available GPUs) to run polish.
I have provided the batch command below as well as the contents of the error file. I'm not familiar with the error language - can you tell me how this failed? Thanks.
Geoff
#SBATCH -J dorado1 #Job name
#SBATCH --output=dorado_correct_a100_%j.std
#SBATCH --error=dorado_correct_a100_%j.err
#SBATCH -N 1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=64GB
#SBATCH -b now
#SBATCH -t 48:00:00
#SBATCH -p gpu-a100-mig7
#SBATCH --gres=gpu:a100_1g.10gb:1
date
./dorado-0.9.1-linux-x64/bin/dorado polish -o BCNH.Q15.ont.hap1.polish.fasta ONThap1Align1/BCNH.all.bam BCNH.Q15.ont.hap1.fasta
date
[geoff]$ cat dorado_correct_a100_16786687.err
[2025-01-22 15:58:34.414] [info] Running: "polish" "-o" "BCNH.Q15.ont.hap1.polish.fasta" "ONThap1Align1/BCNH.all.bam" "BCNH.Q15.ont.hap1.fasta"
[2025-01-22 15:58:35.985] [info] Input data does not contain move tables.
[2025-01-22 15:58:35.985] [info] Auto resolving the model.
[2025-01-22 15:58:35.985] [info] Downloading model: '[email protected]_polish_rl'
[2025-01-22 15:58:36.030] [warning] Unknown certs location for current distribution. If you hit download issues, use the envvar `SSL_CERT_FILE` to specify the location manually.
[2025-01-22 15:58:36.311] [info] - downloading [email protected]_polish_rl with httplib
[2025-01-22 15:59:56.458] [error] Failed to download [email protected]_polish_rl: Connection timed out
[2025-01-22 15:59:56.463] [info] - downloading [email protected]_polish_rl with curl
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 17.1M 100 17.1M 0 0 37.9M 0 --:--:-- --:--:-- --:--:-- 37.9M
[2025-01-22 15:59:57.057] [info] Parsing the model config: /project/blue_catfish_genome/geoff/.temp_dorado_model-1baaa4b8f1465bd5/[email protected]_polish_rl/config.toml
[2025-01-22 15:59:57.106] [info] Initializing the devices.
[2025-01-22 16:00:01.072] [info] Loaded model to device 0: cuda:0
[2025-01-22 16:00:01.072] [info] Creating the encoder.
[2025-01-22 16:00:01.077] [info] Creating the decoder.
[2025-01-22 16:00:01.078] [info] Creating 128 BAM handles.
[2025-01-22 16:00:02.776] [info] Threads: 128, inference threads: 1, number of devices: 1
terminate called after throwing an instance of 'std::runtime_error'
what(): Caught exception from merge-samples task: index 98 is out of bounds for dimension 1 with size 98
Exception raised from applySelect at /pytorch/pyold/aten/src/ATen/TensorIndexing.h:245 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7fcfb5dc99b7 in /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #1: <unknown function> + 0x39f962a (0x7fcfaed8a62a in /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #2: <unknown function> + 0x45bb7f6 (0x7fcfaf94c7f6 in /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #3: <unknown function> + 0x45bdd03 (0x7fcfaf94ed03 in /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #4: at::Tensor::index_put_(c10::ArrayRef<at::indexing::TensorIndex>, at::Tensor const&) + 0x13a (0x7fcfaf95045a in /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #5: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0xa1b786]
frame #6: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0xa1cf33]
frame #7: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0xa1f34f]
frame #8: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0xa1fd4c]
frame #9: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0x9f8af3]
frame #10: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0x9f957b]
frame #11: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0x869d1b]
frame #12: <unknown function> + 0xa4a38 (0x7fcfaa344a38 in /usr/lib64/libc.so.6)
frame #13: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0x9ea1ba]
frame #14: /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado() [0x8770e0]
frame #15: <unknown function> + 0x1196e380 (0x7fcfbccff380 in /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/../lib/libdorado_torch_lib.so)
frame #16: <unknown function> + 0x9f802 (0x7fcfaa33f802 in /usr/lib64/libc.so.6)
frame #17: <unknown function> + 0x3f450 (0x7fcfaa2df450 in /usr/lib64/libc.so.6)
/var/spool/slurmd/job16786687/slurm_script: line 19: 646539 (core dumped) /project/blue_catfish_genome/geoff/dorado-0.9.1-linux-x64/bin/dorado polish -o BCNH.Q15.ont.hap1.polish.fasta ONThap1Align1/BCNH.all.bam BCNH.Q15.ont.hap1.fasta