Hi!
Thanks for making a good tool!
We have been testing and benchmarking with (clinical) trio data, and have found a corner case where merging the pileup calls with the full-alignment calls results in a real variant being discarded.
The pileup VCF file contains the row
chr12 132836482 . A ACTCACAGTGACAGGCTCCCAGCAGGGCGCACGGCACTCACAGTGACAGGCTCCCAGCACGGCGCACGGCACTCACAGTGACAGGCTCCCAGCACGGCGCTCGGCC 28.66 PASS P GT:GQ:DP:AD:AF:PL 1/1:28:32:3,29:0.9062:56,48,0
And the full alignment has the row
chr12 132836482 . A . 0.00 RefCall F GT:GQ:DP:AD:AF:PL 0/0:0:32:3:0.0938:990
The VCF produced by MergeVcf outputs the RefCall, which causes the insertion to be dropped.
It's a bit of a corner case: in a VCF, an insertion is positioned at the base before the inserted sequence, and that base does match the reference. In this case, though, we do want the insertion to be reported!
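To make the collision concrete, here is a minimal sketch that parses the two rows quoted above and checks that they share the same (contig, position) key even though one is an insertion and the other a RefCall. The helper names `parse_row` and `is_insertion` are made up for this illustration and are not part of the caller.

```python
# The two rows from this report, whitespace-separated as quoted above.
PILEUP_ROW = (
    "chr12 132836482 . A "
    "ACTCACAGTGACAGGCTCCCAGCAGGGCGCACGGCACTCACAGTGACAGGCTCCCAGCACGGCGCACGGCACTCACAGTGACAGGCTCCCAGCACGGCGCTCGGCC "
    "28.66 PASS P GT:GQ:DP:AD:AF:PL 1/1:28:32:3,29:0.9062:56,48,0"
)
FULL_ALIGNMENT_ROW = (
    "chr12 132836482 . A . 0.00 RefCall F GT:GQ:DP:AD:AF:PL 0/0:0:32:3:0.0938:990"
)


def parse_row(row):
    """Hypothetical helper: pull out the first seven VCF columns of a data row."""
    chrom, pos, _id, ref, alt, _qual, filt = row.split()[:7]
    return {"key": (chrom, int(pos)), "ref": ref, "alt": alt, "filter": filt}


def is_insertion(rec):
    """Hypothetical helper: ALT begins with the anchor base (REF) and is longer."""
    return (
        rec["alt"] != "."
        and len(rec["alt"]) > len(rec["ref"])
        and rec["alt"].startswith(rec["ref"])
    )


pileup = parse_row(PILEUP_ROW)
full_alignment = parse_row(FULL_ALIGNMENT_ROW)

# Both records sit on the anchor base, so a position-only key cannot tell them apart.
print(pileup["key"] == full_alignment["key"])  # True
print(is_insertion(pileup))                    # True
print(full_alignment["filter"])                # RefCall
```

The anchor base really is reference in both records; the disagreement is only about what follows it, which a key built from contig and position alone cannot express.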
I don't fully understand how the positions of the RefCall variants are determined, so I can't tell whether this is a freak event or whether there is a systematic process that makes these collisions likely to occur elsewhere. If you could explain how these RefCall events are generated, that would help us understand how the caller works.
Looking at the code of MergeVcf, e.g.
Line 191 in b975475:
full_alignment_output_set.add((ctg_name, pos))
Line 228 in b975475:
if (ctg_name, pos) in full_alignment_output_set:
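The two lines quoted above suggest that records are matched on (ctg_name, pos) alone. Below is a simplified model of that behaviour, not the actual MergeVcf code: it assumes that a pileup record is skipped whenever its position is already in `full_alignment_output_set`, which is consistent with the output we see but is only my reading. The record tuples and the `shadowed_variants` diagnostic are hypothetical.

```python
def merge_by_position(pileup_records, full_alignment_records):
    """Simplified model of position-keyed merging (assumed behaviour, not the real code).

    Each record is a (ctg_name, pos, call) tuple.
    """
    full_alignment_output_set = set()
    merged = []
    for ctg_name, pos, call in full_alignment_records:
        full_alignment_output_set.add((ctg_name, pos))
        merged.append((ctg_name, pos, call, "F"))
    for ctg_name, pos, call in pileup_records:
        if (ctg_name, pos) in full_alignment_output_set:
            continue  # assumption: the pileup record is dropped, whatever it called
        merged.append((ctg_name, pos, call, "P"))
    return sorted(merged)


def shadowed_variants(pileup_records, full_alignment_records):
    """Hypothetical diagnostic: pileup variant calls sitting on a full-alignment RefCall."""
    ref_calls = {
        (ctg, pos) for ctg, pos, call in full_alignment_records if call == "RefCall"
    }
    return [
        (ctg, pos)
        for ctg, pos, call in pileup_records
        if call != "RefCall" and (ctg, pos) in ref_calls
    ]


pileup_records = [("chr12", 132836482, "insertion 1/1")]
full_alignment_records = [("chr12", 132836482, "RefCall")]

merged = merge_by_position(pileup_records, full_alignment_records)
print(merged)  # only the RefCall survives
print(shadowed_variants(pileup_records, full_alignment_records))
```

Running a check like `shadowed_variants` over a whole sample would show how often such collisions occur, which is the part I cannot judge from this single site.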
Again, thanks for making a good tool, and I hope this report helps you make it even better.
Tom.