sam: Quality score ambiguity when sequence is a single base #715
Comments
This has been a known issue for a long time, although probably not tracked here. I don't think there's anything we can really do about it. Fortunately, it only arises with a length-1 sequence, which doesn't generally happen in the wild, so it's a moot point. Most implementations just take the most probable view, which is to interpret it as unknown. Attempting to remove the ambiguity would turn a harmless issue into a potentially more serious one. Edit: as an aside, I note you're also using MAPQ of 255 for "unavailable". Commendable, but my experience is that everyone just uses 0 with unmapped data. I think this is because when FLAG 4 is set the specification states no assumption can be made about MAPQ, so it just feels cleaner to zero it out as all other fields have been.
TODO: Add footnote to say a single "*" for length 1 is still "unavailable"
This is an extreme edge case likely to never occur, but nevertheless tool implementors still need to know how to handle it. Given it *may* be QUAL 9 or it *may* be QUAL "unknown", we treat it as always unknown. Fixes samtools#715
"*" is either QUAL 9, or QUAL unavailable. Made a recommendation in a footnote, mainly as an indication that the ambiguity exists. In practice it's vanishingly unlikely to matter. Fixes samtools#715
This is in regard to Sequence Alignment/Map Format Specification (2022-08-22) § 1.4 "The alignment section: mandatory fields".
In the following SAM record, the quality scores field (QUAL) is ambiguous. Since there is a single base in the sequence, the quality scores field can either be unavailable (*) or represent [9].
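To make the ambiguity concrete, here is a minimal sketch of how a reader might handle it, assuming the usual Phred+33 encoding of QUAL. The function name `decode_qual` and the constant `PHRED_OFFSET` are hypothetical, not taken from any existing tool. It follows the view taken in the comments and commit messages above: a lone `*` is always read as "unavailable", even when SEQ has length 1.

```python
from typing import List, Optional

PHRED_OFFSET = 33  # QUAL characters are Phred scores plus 33 (assumed encoding)


def decode_qual(seq: str, qual: str) -> Optional[List[int]]:
    """Return the Phred scores for QUAL, or None when QUAL is unavailable.

    Hypothetical helper: a lone "*" is treated as "unavailable" regardless
    of the length of SEQ, which resolves the single-base ambiguity.
    """
    if qual == "*":
        return None
    # When SEQ is present, QUAL must have one character per base.
    if seq != "*" and len(qual) != len(seq):
        raise ValueError("QUAL length must equal SEQ length")
    return [ord(c) - PHRED_OFFSET for c in qual]


# The competing literal reading: "*" is ASCII 42, so 42 - 33 = 9.
literal_reading = ord("*") - PHRED_OFFSET

print(decode_qual("A", "*"))    # None: treated as unavailable
print(literal_reading)          # 9: what "*" would mean as a score
print(decode_qual("AC", "*+"))  # [9, 10]: unambiguous once length > 1
```

The trade-off is that a genuine single-base read with quality 9 cannot round-trip through this reader, which matches the observation above that such records essentially never occur in practice.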