Find Annotation and Knowledge Graphs to integrate #41

@josiahseaman

Description

Assignee: Ali Haider Bangash
The first step is to identify what data could be integrated through a knowledge graph and what is already available, including what the other Hackathon teams accomplished. Findings go in this issue. We're looking for information that relates to genetic variants of the virus:

  • Structural annotations channel. Protein structure => codon table => sequence position. We could mark up pangenome positions related to known protein variants.
  • Gene Annotations: We possibly only need the reference gene annotation GFF, but it would be nice to have these positions in the graph genome context. @subwaystation has ensured we have coordinate transforms that go both ways (pangenome <-> reference genome coordinates), using FALDO in RDF.
  • Clinical data: Possibly the most important. If we have any knowledge of patient outcomes, and what region they're from, we could connect a strain of the virus (which will contain variants) to a patient outcome: how long in hospital, how long on ventilator, etc. We don't necessarily need a viral sequence from that specific individual, but at minimum a probable association with a variant.
    • Human DNA variation data could also be used, as in the UK Biobank article.
    • Technically, annotating a complete human pangenome is beyond our current scope, since gigabase genomes will put strain on our pipeline. It may be possible, however, to make local graphs of key regions like HLA or MHC inside the human genome.
  • Phylogenetics: We're going to have a phylogenetic tree eventually (see Phylogenetic Tree Visualization, Schematize#58). It'd be nice to link this with the "country" and "town" concepts in the knowledge graph. What geographic or transmission data could we bring in?
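
For the structural annotation item above (protein structure => codon table => sequence position), here is a minimal sketch of the arithmetic, not the project's actual implementation. The function name and the example coordinates are hypothetical, and it assumes a single contiguous forward-strand CDS with 1-based inclusive coordinates and no frameshift or splicing. The resulting reference range would still need to go through the existing reference <-> pangenome transform.

```python
from typing import Tuple


def residue_to_reference_range(cds_start: int, residue_index: int) -> Tuple[int, int]:
    """Return the 1-based inclusive reference range of the codon for one residue.

    cds_start: 1-based reference position of the first base of the start codon.
    residue_index: 1-based amino acid position within the protein.
    Assumption: one contiguous forward-strand CDS, no frameshift or splicing.
    """
    if cds_start < 1 or residue_index < 1:
        raise ValueError("positions are 1-based and must be >= 1")
    # Each residue is encoded by three bases, so residue k starts
    # 3 * (k - 1) bases after the first base of the CDS.
    first = cds_start + 3 * (residue_index - 1)
    return first, first + 2


# Hypothetical example: a CDS starting at reference position 101.
print(residue_to_reference_range(101, 5))  # (113, 115)
```

A protein variant reported at residue 5 of that hypothetical gene would then be marked up on reference positions 113 to 115.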

Metadata

Labels: documentation (Improvements or additions to documentation), good first issue (Good for newcomers), question (Further information is requested)