How to interpret odgi position output #615

@adadiehl

Greetings,
I have been working with a pangenome consisting of 51 phased genomes, two unphased genomes and GRCh38, with T2T-CHM13 as the reference, produced with Minigraph-Cactus. I would like to see if positions of interest in specific genomes overlap with structural variant bubbles in the pangenome. I can easily identify SVs in the pangenome as bubbles with length >=50 in the VCF generated by the MC pipeline. From there, I would like to convert the graph coordinates given in the VCF into path-specific start-end coordinates for specific genomes. It looks like odgi position is the tool for the job. However, I am unclear on how to interpret the output. For example, for a bubble with the (truncated) VCF record:

chr1 8 >4387>4650 AACCCTAACCCCTAACCCT

I would expect
odgi position -i chr1.og -r T2T-CHM13#0#chr1 -g 4387
to return the start coordinate and
odgi position -i chr1.og -r T2T-CHM13#0#chr1 -g 4650
to return the end coordinate, in T2T-CHM13 frame. I get the following:

odgi position -i chr1.og -r T2T-CHM13#0#chr1 -g 4387
#target.graph.pos       target.path.pos dist.to.ref     strand.vs.ref
4387,0,+        T2T-CHM13#0#chr1,150,+  377     +

odgi position -i chr1.og -r T2T-CHM13#0#chr1 -g 4650
#target.graph.pos       target.path.pos dist.to.ref     strand.vs.ref
4650,0,+        T2T-CHM13#0#chr1,150,+  544     +

The documentation and tutorial say little about how to interpret this output. Specifically, I don't understand why the path position is the same for both queries, and I don't understand the meaning of "dist.to.ref". How do I interpret these values relative to the variant's start and end positions in the frame of the T2T-CHM13 genome? I am particularly confused that, although the reference allele is 19 bp long, nothing in the two returned coordinates suggests a 19 bp window. I am basically looking for a set of BED coordinates but am unsure how to extract them from these results!
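To make the comparison concrete, here is a minimal Python sketch that parses the two outputs above and puts the fields side by side. It is not part of odgi: `PositionHit`, `parse_position_output`, and `reached_by_search` are hypothetical names chosen for illustration. It assumes the four columns are whitespace-separated exactly as printed above, and it treats a nonzero dist.to.ref as meaning the reported path position was found away from the queried node, which is only my reading of the column name and is not confirmed by the documentation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PositionHit:
    node_id: int        # node given to -g
    node_offset: int    # offset within that node
    node_strand: str    # orientation of the graph position
    path_name: str      # path given to -r
    path_pos: int       # position reported on that path
    path_strand: str    # orientation reported on that path
    dist_to_ref: int    # value of the dist.to.ref column
    strand_vs_ref: str  # value of the strand.vs.ref column


def parse_position_output(text: str) -> List[PositionHit]:
    """Parse the table printed by `odgi position`, skipping '#' header lines."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        graph_pos, path_pos, dist, strand = line.split()
        node_id, node_offset, node_strand = graph_pos.split(",")
        # Split from the right so the path name itself is kept intact.
        path_name, pos, path_strand = path_pos.rsplit(",", 2)
        hits.append(PositionHit(int(node_id), int(node_offset), node_strand,
                                path_name, int(pos), path_strand,
                                int(dist), strand))
    return hits


def reached_by_search(hit: PositionHit) -> bool:
    """ASSUMPTION: nonzero dist.to.ref means the path position is not on the queried node."""
    return hit.dist_to_ref != 0


# The two outputs quoted above, copied verbatim.
start_text = """#target.graph.pos target.path.pos dist.to.ref strand.vs.ref
4387,0,+ T2T-CHM13#0#chr1,150,+ 377 +
"""
end_text = """#target.graph.pos target.path.pos dist.to.ref strand.vs.ref
4650,0,+ T2T-CHM13#0#chr1,150,+ 544 +
"""

start_hit = parse_position_output(start_text)[0]
end_hit = parse_position_output(end_text)[0]

for label, hit in (("start", start_hit), ("end", end_hit)):
    print(label, hit.node_id, hit.path_name, hit.path_pos,
          hit.dist_to_ref, reached_by_search(hit))

# Difference between the two reported path positions.
print("path span:", end_hit.path_pos - start_hit.path_pos)
```

With the values quoted above, both hits report the same path position (150) with different dist.to.ref values (377 and 544), so the two reported path positions span 0 bp rather than the 19 bp of the reference allele. That is the discrepancy I would like help interpreting before writing out BED intervals.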

Thank you in advance!
