Description
This is in regard to the CRAM format specification (version 3.1) (2024-09-04).
§ 8.4.1 "Compression header block: Preservation map" describes the reference required (RR) flag as "true if reference sequence is required to restore the data completely". If the flag is true but the records do not require a reference sequence to restore the data (e.g., an unmapped slice), is that considered an invalid state?
As I understand it, this flag should be false if the slice is unmapped, but some implementations don't set it as such, e.g., htslib:
$ samtools --version | head -2
samtools 1.21
Using htslib 1.21
$ (
samtools view --output-fmt cram <<EOF
@HD VN:1.6
r1 4 * 0 0 * * 0 0 NNNN !!!!
EOF
) | cram_dump - | grep --after-context 4 "Preservation map" | head -5
Preservation map:
SM => 30398990 (0x1cfda0e)
TD => 30398986 (0x1cfda0a)
RN => 1 (0x1)
AP => 1 (0x1)
All the records in this slice are unmapped (cram_dump: "Slice ref seq -1"), and the preservation map shown has no RR entry. Its absence implicitly sets the reference required field to true, as per "The boolean values are optional, defaulting to true when absent..." However, a reference sequence is not required to decode this.
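To make the defaulting rule concrete, here is a minimal Python sketch of how a reader might resolve the boolean preservation map entries. It assumes the map has already been decoded into a plain dict; the `effective_flags` helper and the dict representation are hypothetical and are not htslib's API. The values are copied from the cram_dump output above.

```python
# Boolean preservation map keys. Per the quoted spec text, these
# default to true when absent from the map.
BOOLEAN_KEYS = ("RN", "AP", "RR")


def effective_flags(preservation_map):
    """Return the effective value of each boolean key.

    `preservation_map` is a hypothetical dict of already-decoded
    entries (key -> integer value), not a real htslib structure.
    """
    return {key: bool(preservation_map.get(key, True)) for key in BOOLEAN_KEYS}


# Entries shown in the cram_dump output above; RR is not among them.
dumped = {"SM": 0x1CFDA0E, "TD": 0x1CFDA0A, "RN": 1, "AP": 1}

flags = effective_flags(dumped)
print(flags["RR"])  # RR is absent, so it falls back to the default
```

Under this reading, a writer would have to store RR explicitly as false for an unmapped slice to be marked as not requiring a reference.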