Let's say I generate an exact match using a lexical mapping. My mapping tool gives a confidence of 0.7. So I get SSSOM like
| subject_id | subject_label | predicate_id | object_id | object_label | mapping_justification | confidence | mapping_tool |
|---|---|---|---|---|---|---|---|
| CHEBI:134180 | leucomethylene blue | skos:exactMatch | mesh:C011010 | hydromethylthionine | semapv:LexicalMatching | 0.7 | generate_chebi_mesh_mappings.py |
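For concreteness, here's a minimal sketch of serializing that record as SSSOM TSV using only the standard library (this is just an illustration, not how Biomappings or any particular tool actually emits its output):

```python
import csv
import io

# One SSSOM mapping record, matching the table above.
# Column names follow the SSSOM TSV spec.
record = {
    "subject_id": "CHEBI:134180",
    "subject_label": "leucomethylene blue",
    "predicate_id": "skos:exactMatch",
    "object_id": "mesh:C011010",
    "object_label": "hydromethylthionine",
    "mapping_justification": "semapv:LexicalMatching",
    "confidence": "0.7",
    "mapping_tool": "generate_chebi_mesh_mappings.py",
}

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(record), delimiter="\t")
writer.writeheader()
writer.writerow(record)
print(buf.getvalue())
```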
Then, I review this mapping. I say that it's correct with 0.95 confidence. How do I represent this? Here are some options I thought of:
1. Add an `author_id` column with my ORCID, swap the mapping justification to `semapv:ManualMappingCuration`, and overwrite the confidence from 0.7 to 0.95.
2. Add a `reviewer_id` column with my ORCID. But then, how do I represent my confidence as a reviewer? Do I throw away the mapping tool's confidence? What if I want to keep track of both?
3. Some other way? Please also let me know if I've misunderstood how to use `author_id`/`creator_id`/`reviewer_id`.
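To make option 1 concrete, the reviewed record would look something like this (the ORCID is a placeholder, and whether this is the intended use of `author_id` is exactly what's being asked):

```tsv
subject_id	predicate_id	object_id	mapping_justification	confidence	author_id
CHEBI:134180	skos:exactMatch	mesh:C011010	semapv:ManualMappingCuration	0.95	orcid:0000-0000-0000-0000
```

Note that this loses both the original 0.7 tool confidence and the `mapping_tool` provenance, which is the drawback being raised.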
The use case for this question is Biomappings, since we do lexical predictions and curate them, and want to keep track of this provenance.
Given the answer to this question, it will also be possible to generalize the Biomappings curation interface to be a generic SSSOM curation interface
This issue is a bit debated; last time we tried to do this we didn't reach a definite conclusion: #345
In a nutshell:
Separating mapping processes during the curation life cycle was not a primary concern of the design of SSSOM, so it was all mushed together into one record
The idea is that 1 single score is there to tell a downstream user "how sure they can be"
If you absolutely want to represent the life cycle, you will have to create intermediate mapping sets, so you say:
1. Mapping set 1 is derived from lexical matching (`semapv:LexicalMatching`)
2. Mapping set 2 reviews mapping set 1 (`semapv:MappingReview`), and sets `mapping_set_source` to mapping set 1
3. Mapping set 3 is derived from 1 and 2, referring to both, generating a composite score and using `semapv:CompositeMatching` or some such as a justification. This last set is the only one you publish to the world.
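For the composite score in the last step, one possible aggregation (a sketch, not anything prescribed by SSSOM) is a noisy-OR over the per-process confidences, assuming they are independent:

```python
def composite_confidence(confidences):
    """Combine independent confidence scores with a noisy-OR:
    the probability that at least one process got it right.
    This is one possible aggregation, not prescribed by SSSOM."""
    p_all_wrong = 1.0
    for c in confidences:
        p_all_wrong *= 1.0 - c
    return 1.0 - p_all_wrong

# Tool confidence 0.7 (mapping set 1) + reviewer confidence 0.95 (mapping set 2):
score = composite_confidence([0.7, 0.95])
print(round(score, 4))  # 0.985
```

Any other aggregation (max, reviewer-overrides-tool, etc.) would fit the same slot; the point is just that set 3 carries a single derived `confidence`.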
None of this is super awesome. Another option to make this a bit cleaner would be to push for #359 and then a new slot source_mapping that you can use to point specifically to the mappings used for deriving a particular new mapping.