Description
Before samtools/htslib#1854, U was changed to N when read by samtools.
With that change, it is now converted to T instead.
However, I think it would be "better" if we could preserve U in SAM, even when moving SAM->BAM->CRAM->SAM for example.
The problem, however, is that the 4 bits BAM uses per base give only 16 codes, and all 16 are already assigned (=ACMGRSVTWYHKDBN), so there is no room for U as a separate code and T has to stand for both T and U.
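
To make the constraint concrete, here is a minimal Python sketch of the 4-bit packing, using the code table from the SAM specification. The function names and the `u_as` parameter are made up for illustration (this is not htslib's code); `u_as` just models the choice between the old behaviour (U becomes N) and the new one (U becomes T).

```python
# The 16 base codes BAM can store, indexed 0..15 (table from the SAM spec).
BAM_CODES = "=ACMGRSVTWYHKDBN"


def encode_base(base: str, u_as: str = "T") -> int:
    """Return the 4-bit code for a single base.

    `u_as` is a hypothetical knob: "N" mimics the old behaviour,
    "T" the behaviour after samtools/htslib#1854.
    """
    base = base.upper()
    if base == "U":
        base = u_as  # U has no code of its own
    idx = BAM_CODES.find(base)
    # Characters outside the table are stored as N.
    return idx if idx >= 0 else BAM_CODES.index("N")


def pack(seq: str, u_as: str = "T") -> bytes:
    """Pack a sequence two bases per byte, first base in the high nibble."""
    codes = [encode_base(b, u_as) for b in seq]
    if len(codes) % 2:
        codes.append(0)  # pad the final low nibble
    return bytes((codes[i] << 4) | codes[i + 1] for i in range(0, len(codes), 2))


def unpack(data: bytes, length: int) -> str:
    """Reverse of pack(); `length` drops the padding nibble."""
    out = []
    for byte in data:
        out.append(BAM_CODES[byte >> 4])
        out.append(BAM_CODES[byte & 0xF])
    return "".join(out[:length])
```

Whichever substitution is chosen, `unpack(pack("ACGU"), 4)` cannot return a U, because the information is gone once the base is packed.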
One solution, raised by @jmarshall, would be to allocate a FLAG bit to indicate that an alignment record is RNA, so that a T coming from BAM would be written as U when viewed as SAM.
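
A rough Python sketch of how the FLAG-bit idea might behave on a round trip. Everything here is an assumption for illustration: `RNA_FLAG = 0x1000` is simply a bit the SAM specification does not currently define, and `to_bam` / `to_sam` are hypothetical helpers, not htslib functions.

```python
# Hypothetical FLAG bit: the SAM spec defines 0x1 through 0x800,
# so 0x1000 is used here purely as a placeholder value.
RNA_FLAG = 0x1000


def to_bam(flag: int, seq: str) -> tuple[int, str]:
    """SAM -> BAM direction: store U as T and remember that via the flag."""
    if "U" in seq or "u" in seq:
        return flag | RNA_FLAG, seq.replace("U", "T").replace("u", "t")
    return flag, seq


def to_sam(flag: int, seq: str) -> str:
    """BAM -> SAM direction: write T as U for records flagged as RNA."""
    if flag & RNA_FLAG:
        return seq.replace("T", "U").replace("t", "u")
    return seq


flag, stored = to_bam(0, "ACGU")   # stored is "ACGT", flag has RNA_FLAG set
restored = to_sam(flag, stored)    # back to "ACGU"
```

Two caveats with this sketch: it sets the flag only when it actually sees a U, so an RNA read with no U would need the flag set some other way, and a record mixing T and U would come back as all U.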
This would also mean most existing tools would keep working, while leaving room for future RNA sequencing methods to represent the base that is actually being measured.
Another solution (though more ad hoc and less "good") would be to add yet another SAM tag to denote that the read is from RNA. This avoids using up a FLAG bit, but adds more complexity to the solution.
Cheers,
James