Description
We would like to reference the observatory or mission that generated a dataset in its schema.org Dataset record. We found a few options and would like input from this group, ideally resulting in an update to the SOSO guidance. Thanks to Zach Boquet for the first JSON example.
OPTION 1: use the "producer" field
{
"@context": {
"@vocab": "https://schema.org/",
"prov": "http://www.w3.org/ns/prov#"
},
"@id": "https://doi.org/10.concept/doi",
"@type": "Dataset",
….
"producer": [
{"@type": "ResearchProject",
"@id": "spase://SMWG/Observatory/MMS",
"name": "MMS",
"url": "https://mms.gsfc.nasa.gov/"
},
{"@type": "ResearchProject",
"@id": "spase://SMWG/Observatory/MMS/4",
"name": "MMS-4",
"url": "https://mms.gsfc.nasa.gov/"
}
],
...
}
pros: no special items needed; can indicate individual portions of a mission (e.g. MMS has 4 spacecraft, so the record can indicate both the MMS mission and the specific spacecraft).
cons: doesn't seem to fit the definition of "producer".
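For illustration, here is a minimal Python sketch of how the Option 1 "producer" list could be assembled. It assumes each spacecraft identifier is the mission identifier plus "/<number>", as in the MMS-4 example above; other missions may not follow that pattern. The helper name `producer_entries` is hypothetical, and the record is only a fragment of a full Dataset record.

```python
import json

def producer_entries(mission_id, name, url, spacecraft=()):
    """Build "producer" entries for a mission and selected spacecraft.

    Assumption: each spacecraft @id is the mission @id plus "/<number>",
    following the MMS-4 example; this is not a general SPASE rule.
    """
    entries = [
        {"@type": "ResearchProject", "@id": mission_id, "name": name, "url": url}
    ]
    for number in spacecraft:
        entries.append({
            "@type": "ResearchProject",
            "@id": f"{mission_id}/{number}",
            "name": f"{name}-{number}",
            "url": url,
        })
    return entries

# Fragment of a Dataset record; other required properties are omitted here.
record = {
    "@context": {
        "@vocab": "https://schema.org/",
        "prov": "http://www.w3.org/ns/prov#",
    },
    "@type": "Dataset",
    "producer": producer_entries(
        "spase://SMWG/Observatory/MMS",
        "MMS",
        "https://mms.gsfc.nasa.gov/",
        spacecraft=[4],
    ),
}

print(json.dumps(record, indent=2))
```

Passing an empty `spacecraft` sequence yields a single mission-level entry, which covers datasets that are not tied to one spacecraft.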
OPTION 2: use prov:wasGeneratedBy, similar to the current guidance for software
https://www.w3.org/TR/prov-o/#wasGeneratedBy
(example adapted from the current SOSO guidelines for software; please check this)
{
"@context": [
"https://schema.org/",
{
"prov": "http://www.w3.org/ns/prov#",
"provone": "http://purl.dataone.org/provone/2015/01/15/ontology#"
}
],
"@id": "https://doi.org/10.xxxx/Dataset-2",
"@type": "Dataset",
"name": "Removal of organic carbon by natural bacterioplankton communities as a function of pCO2 from laboratory experiments between 2012 and 2016",
"prov:wasDerivedFrom": { "@id": "https://doi.org/10.xxxx/Dataset-1" },
"schema:isBasedOn": { "@id": "https://doi.org/10.xxxx/Dataset-1" },
"prov:wasGeneratedBy": [
{"@type": "ResearchProject",
"@id": "spase://SMWG/Observatory/MMS",
"name": "MMS",
"url": "https://mms.gsfc.nasa.gov/"
},
{"@type": "ResearchProject",
"@id": "spase://SMWG/Observatory/MMS/4",
"name": "MMS-4",
"url": "https://mms.gsfc.nasa.gov/"
}
]
}
pros: likely much simpler, type can be aligned with the choice for the observing network work.
cons: could be confused with the current guidelines explaining how to indicate that software was used to generate the dataset.
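One way a consumer might handle the confusion noted for Option 2 is to filter entries by "@type". The Python sketch below reads either proposed property and keeps only ResearchProject entries. It assumes compacted JSON-LD with exactly these keys, and that software provenance entries carry a different "@type"; the `provone:Execution` entry and its identifier are hypothetical and should be checked against the current guidance. The function names are also illustrative.

```python
def as_list(value):
    """JSON-LD allows a single value or an array; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

def observatories(record, type_name="ResearchProject"):
    """Collect @id values of observatory/mission entries from either option.

    Assumption: the record is compacted JSON-LD that uses the literal keys
    "producer" (Option 1) or "prov:wasGeneratedBy" (Option 2).
    """
    found = []
    for key in ("producer", "prov:wasGeneratedBy"):
        for entry in as_list(record.get(key)):
            if isinstance(entry, dict) and type_name in as_list(entry.get("@type")):
                found.append(entry.get("@id"))
    return found

record = {
    "@type": "Dataset",
    "prov:wasGeneratedBy": [
        {"@type": "ResearchProject",
         "@id": "spase://SMWG/Observatory/MMS",
         "name": "MMS"},
        # Hypothetical software-execution entry, for illustration only.
        {"@type": "provone:Execution",
         "@id": "https://example.org/executions/execution-42"},
    ],
}

print(observatories(record))
```

Filtering on "@type" only works if the guidance fixes one type for observatories and missions, which is part of what this issue asks the group to decide.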