This misbehavior happens with fasterq-dump 3.0.7 on about 7% of biosample-based accessions. SAMN08049698 is one example that fails, along with a few others, but most work fine, including others within the same project. Interestingly, the underlying run in this case (SRR6468610) still works fine.
Why this is a bug:
The SRA file does appear to contain quality scores.
Each of these biosamples has exactly one underlying run accession, which is in turn the only FASTQ data associated with it.
The underlying run accession, when supplied directly, works as expected for these cases, including the example.
Other biosample accessions from the same study, submitted at the same time with the same modality (even from the same subject!), work perfectly. In this case that would be SAMN08049628, another biosample that dumps directly to FASTQ without problems.
A workaround could be to use Entrez Direct (or something similar) to translate the biosample IDs into run IDs, dump the run IDs, and then rename (or concatenate) the output back to biosample IDs. But that's quite the workaround for the couple of files this fails on, so I thought I'd report it here.
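The workaround above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested pipeline: it assumes Entrez Direct (`esearch`, `efetch`) and `fasterq-dump` are on the PATH, that the runinfo CSV has `Run` and `BioSample` columns, and that per-run output files are named `<run>_1.fastq`, `<run>_2.fastq`, or `<run>.fastq`. The helper names (`runs_from_runinfo`, `fetch_runinfo`, `dump_biosample`) are made up for illustration.

```python
import csv
import io
import shutil
import subprocess
from pathlib import Path


def runs_from_runinfo(csv_text, biosample):
    """Return the Run accessions listed for `biosample` in a runinfo CSV.

    Assumes the CSV has 'Run' and 'BioSample' columns. Blank lines and
    repeated header rows are ignored because they never match `biosample`.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["Run"]
        for row in reader
        if row.get("BioSample") == biosample and row.get("Run")
    ]


def fetch_runinfo(biosample):
    """Fetch the runinfo CSV for a biosample (needs network and Entrez Direct)."""
    search = subprocess.run(
        ["esearch", "-db", "sra", "-query", biosample],
        check=True, capture_output=True, text=True,
    )
    fetch = subprocess.run(
        ["efetch", "-format", "runinfo"],
        input=search.stdout, check=True, capture_output=True, text=True,
    )
    return fetch.stdout


def dump_biosample(biosample, outdir):
    """Dump every run of a biosample, then concatenate under the biosample ID."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    runs = runs_from_runinfo(fetch_runinfo(biosample), biosample)
    for run in runs:
        subprocess.run(
            ["fasterq-dump", run, "--split-3", "-O", str(outdir)], check=True
        )
    # Concatenate per-run files in a fixed run order so mates stay in sync.
    for suffix in ("_1.fastq", "_2.fastq", ".fastq"):
        parts = [outdir / f"{run}{suffix}" for run in runs]
        parts = [p for p in parts if p.exists()]
        if not parts:
            continue
        with open(outdir / f"{biosample}{suffix}", "wb") as out:
            for part in parts:
                with open(part, "rb") as src:
                    shutil.copyfileobj(src, out)
    return runs
```

For the example in this report, `dump_biosample("SAMN08049698", "fastq")` would be expected to dump SRR6468610 and write the result under the biosample name; I have not verified the naming for every library layout.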
GabeAl changed the title from fasterq-dump.3.0.7 err: the input data is missing the QUALITY-column to "fasterq-dump.3.0.7 err: the input data is missing the QUALITY-column" BUT quality column is present (Sep 12, 2023)
GabeAl changed the title from "fasterq-dump.3.0.7 err: the input data is missing the QUALITY-column" BUT quality column is present to "fasterq-dump.3.0.7 err: the input data is missing the QUALITY-column" BUT sra QUALITY column is present. (Sep 12, 2023)
I am having the same problem. I am running fasterq-dump in a Snakemake workflow. Weirdly, when I run without specifying a temp directory (so the default is used), or when I point it at a directory under my home (~/temp-fasterq-dump), I get no errors. I am thinking it might be a permission or directory-lock problem.
Important notes
I download the SRA file from an AWS S3 bucket, then run fasterq-dump on the downloaded SRA file.
Despite the error, FASTQ files 1 and 2 are generated.
$ code/sratoolkit.3.0.7-centos_linux64/bin/fasterq-dump raw-data/testproject-001/GSM3148577_BC10_TUMOR1/SRR7191904.sra --threads 8 --split-3 -O raw-data/testproject-001/GSM3148577_BC10_TUMOR1 -t raw-data/testproject-001/GSM3148577_BC10_TUMOR1/temp-fasterq-dump
spots read : 41,812,717
reads read : 83,625,434
reads written : 83,625,434
2024-01-09T18:32:54 fasterq-dump.3.0.7 err: the input data is missing the QUALITY-column
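Since the FASTQ files are written despite the error, one way to check that they are usable is to count the records and confirm that every record carries a quality line as long as its sequence. A minimal sketch, assuming plain uncompressed four-line FASTQ records; `count_fastq_records` is a hypothetical helper, not part of sra-tools:

```python
def count_fastq_records(lines):
    """Count four-line FASTQ records in an iterable of lines.

    Raises ValueError if a record is truncated, lacks the '@' or '+'
    marker, or has a quality string whose length differs from the sequence.
    """
    it = iter(lines)
    count = 0
    for header in it:
        seq = next(it, "").rstrip("\r\n")
        plus = next(it, "")
        qual = next(it, "").rstrip("\r\n")
        if (
            not header.startswith("@")
            or not plus.startswith("+")
            or len(qual) != len(seq)
        ):
            raise ValueError(f"malformed FASTQ record {count + 1}")
        count += 1
    return count


# Usage sketch (the path is a placeholder for one of the generated files):
# with open("SRR7191904_1.fastq") as fh:
#     print(count_fastq_records(fh))
```

If every spot in the log above is paired, each of the two files would be expected to hold 41,812,717 records (half of the 83,625,434 reads written); a different count, or a `ValueError`, would suggest the output really is incomplete.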