How to capture the lifecycle of a predicted and then later curated mapping? #437

Open
@cthoyt

Let's say I generate an exact match using lexical matching, and my mapping tool gives it a confidence of 0.7. I get an SSSOM row like this:

| subject_id | subject_label | predicate_id | object_id | object_label | mapping_justification | confidence | mapping_tool |
|---|---|---|---|---|---|---|---|
| CHEBI:134180 | leucomethylene blue | skos:exactMatch | mesh:C011010 | hydromethylthionine | semapv:LexicalMatching | 0.7 | generate_chebi_mesh_mappings.py |

Then, I review this mapping. I say that it's correct with 0.95 confidence. How do I represent this? Here are some options I thought of:

  1. Add an author_id column with my ORCID, swap the mapping justification to semapv:ManualMappingCuration, and overwrite the confidence from 0.7 to 0.95.
  2. Add a reviewer_id column with my ORCID. But then, how do I represent my own confidence as a reviewer? Do I throw away the mapping tool's confidence? What if I want to keep track of both?
  3. Some other way? Please also let me know if I've misunderstood how to use author_id/creator_id/reviewer_id.
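
To make options 1 and 2 concrete, here is a minimal sketch using plain Python dicts as stand-ins for SSSOM rows. The function names and the placeholder ORCID are hypothetical, made up for illustration; this is not the Biomappings or sssom-py API, and it is not a claim about which option is correct.

```python
# Placeholder ORCID, purely illustrative (not a real identifier).
CURATOR = "orcid:0000-0000-0000-0000"

# The predicted row from the example above, as a plain dict.
predicted = {
    "subject_id": "CHEBI:134180",
    "subject_label": "leucomethylene blue",
    "predicate_id": "skos:exactMatch",
    "object_id": "mesh:C011010",
    "object_label": "hydromethylthionine",
    "mapping_justification": "semapv:LexicalMatching",
    "confidence": 0.7,
    "mapping_tool": "generate_chebi_mesh_mappings.py",
}


def option_1_overwrite(row, curator, confidence):
    """Option 1: the curator becomes the author and the tool's confidence is overwritten."""
    curated = dict(row)  # copy, so the predicted row is left untouched
    curated["author_id"] = curator
    curated["mapping_justification"] = "semapv:ManualMappingCuration"
    curated["confidence"] = confidence
    return curated


def option_2_add_reviewer(row, reviewer):
    """Option 2: record the reviewer, keep the tool's justification and confidence."""
    reviewed = dict(row)
    reviewed["reviewer_id"] = reviewer
    return reviewed


row_1 = option_1_overwrite(predicted, CURATOR, 0.95)
row_2 = option_2_add_reviewer(predicted, CURATOR)

print(row_1["mapping_justification"], row_1["confidence"])
print(row_2["mapping_justification"], row_2["confidence"], row_2["reviewer_id"])
```

In this sketch, option 1 loses the tool's 0.7 and the lexical justification, while option 2 keeps both but has nowhere to put my 0.95, which is exactly the gap I'm asking about.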

The use case for this question is Biomappings, since we do lexical predictions and curate them, and want to keep track of this provenance.

With an answer to this question, it will also be possible to generalize the Biomappings curation interface into a generic SSSOM curation interface.
