choose pileup call over full-alignment call if pileup call has 1) a way higher QUAL, 2) is an indel #318

@drtconway

Hi!

Thanks for making a good tool!

We have been testing and benchmarking with (clinical) trio data, and have found a corner case in which the merge of the pileup calls and the full-alignment calls discards a real variant.

The pileup VCF file contains the row

chr12   132836482   .   A   ACTCACAGTGACAGGCTCCCAGCAGGGCGCACGGCACTCACAGTGACAGGCTCCCAGCACGGCGCACGGCACTCACAGTGACAGGCTCCCAGCACGGCGCTCGGCC  28.66   PASS    P   GT:GQ:DP:AD:AF:PL   1/1:28:32:3,29:0.9062:56,48,0

And the full-alignment VCF file contains the row

chr12   132836482   .   A   .   0.00    RefCall F   GT:GQ:DP:AD:AF:PL   0/0:0:32:3:0.0938:990

The VCF produced by MergeVcf contains the RefCall row rather than the pileup row, so the insertion is dropped.

It's a bit of a corner case: a VCF records an insertion at the base before the inserted sequence, and that base does match the reference. Even so, in this case we do want to report the insertion!
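
To make the collision concrete, here is a minimal sketch that parses the two rows quoted above and checks that they share the same (contig, position) key even though one is an insertion and the other is a RefCall. This is not Clair3 code: parse_row is a hypothetical helper written only for this illustration, and the rows are copied from above with whitespace normalised.

```python
def parse_row(row):
    """Split a whitespace-separated VCF data row into the fields used here."""
    fields = row.split()
    ctg_name, pos, _id, ref, alt, qual, flt = fields[:7]
    return {
        "ctg_name": ctg_name,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "qual": float(qual),
        "filter": flt,
    }


# The pileup row from above (insertion, PASS).
pileup_row = (
    "chr12 132836482 . A "
    "ACTCACAGTGACAGGCTCCCAGCAGGGCGCACGGCACTCACAGTGACAGGCTCCCAGCACGGCGCACGG"
    "CACTCACAGTGACAGGCTCCCAGCACGGCGCTCGGCC "
    "28.66 PASS P GT:GQ:DP:AD:AF:PL 1/1:28:32:3,29:0.9062:56,48,0"
)
# The full-alignment row from above (RefCall at the same anchor base).
full_alignment_row = (
    "chr12 132836482 . A . 0.00 RefCall F "
    "GT:GQ:DP:AD:AF:PL 0/0:0:32:3:0.0938:990"
)

pileup = parse_row(pileup_row)
full_alignment = parse_row(full_alignment_row)

# Both rows map to the same (contig, position) key.
same_key = (pileup["ctg_name"], pileup["pos"]) == (
    full_alignment["ctg_name"],
    full_alignment["pos"],
)
# An insertion has an ALT allele longer than REF.
is_insertion = len(pileup["alt"]) > len(pileup["ref"])

print(same_key, is_insertion, full_alignment["filter"])
```

Because the insertion is anchored at the reference base before it, any RefCall emitted for that base lands on the same key.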

I don't fully understand how the positions of the RefCall variants are determined, so I can't tell if this is a freak event, or if there is a systematic process that means these collisions are likely to occur elsewhere. If you could explain how these RefCall events are generated, that would help us understand how the caller works.

Looking at the code of MergeVcf, e.g.

full_alignment_output_set.add((ctg_name, pos))

and

if (ctg_name, pos) in full_alignment_output_set:

I think I understand how the merge code drops the insertion event. I can think of a few possible fixes, but I haven't thought through the semantics in enough detail to work out which would be best.
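
As one illustration of what a fix might look like, here is a rough sketch of a position-keyed merge with an optional rescue rule. It is only my reading of the behaviour, not the actual MergeVcf implementation: merge_calls, is_indel, and the qual_margin parameter are all hypothetical names, the threshold of 20 is arbitrary, and the ALT allele is shortened for display. With qual_margin=None the full-alignment call always wins on a shared (ctg_name, pos) key, which is the behaviour I believe we are seeing. With a margin set, a pileup indel is kept when the colliding full-alignment call is a RefCall and the pileup QUAL is higher by at least that margin.

```python
def is_indel(call):
    """True when REF and ALT differ in length ('.' means no alternate allele)."""
    return call["alt"] != "." and len(call["ref"]) != len(call["alt"])


def merge_calls(pileup_calls, full_alignment_calls, qual_margin=None):
    """Merge two call lists keyed on (ctg_name, pos).

    qual_margin=None: the full-alignment call always wins a collision.
    qual_margin=x: a pileup indel replaces a colliding RefCall when its
    QUAL exceeds the full-alignment QUAL by at least x (hypothetical rule).
    """
    full_alignment_by_key = {
        (c["ctg_name"], c["pos"]): c for c in full_alignment_calls
    }
    merged = dict(full_alignment_by_key)
    for call in pileup_calls:
        key = (call["ctg_name"], call["pos"])
        fa_call = full_alignment_by_key.get(key)
        if fa_call is None:
            # No collision: the pileup call passes through unchanged.
            merged[key] = call
            continue
        rescue = (
            qual_margin is not None
            and fa_call["filter"] == "RefCall"
            and is_indel(call)
            and call["qual"] - fa_call["qual"] >= qual_margin
        )
        if rescue:
            merged[key] = call
    return [merged[key] for key in sorted(merged)]


# Values taken from the two rows above; ALT is shortened for display,
# only its length relative to REF matters in this sketch.
pileup_calls = [{"ctg_name": "chr12", "pos": 132836482, "ref": "A",
                 "alt": "ACTCACAG", "qual": 28.66, "filter": "PASS"}]
full_alignment_calls = [{"ctg_name": "chr12", "pos": 132836482, "ref": "A",
                         "alt": ".", "qual": 0.0, "filter": "RefCall"}]

current = merge_calls(pileup_calls, full_alignment_calls)
rescued = merge_calls(pileup_calls, full_alignment_calls, qual_margin=20)
print(current[0]["filter"], rescued[0]["filter"])
```

Restricting the rescue to indels that collide with a RefCall keeps the change narrow, so full-alignment variant calls would still take precedence everywhere else. Whether that is the right semantics is exactly the part I have not thought through.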

Again, thanks for making a good tool, and I hope this report helps you make it even better.

Tom.

Labels: enhancement
