OMICS-DI ingests publication records for supplemental data from Biostudies-literature (which in turn ingests from European PMC). Clicking 'access data' for one of these records in the Discovery Portal takes you to the OMICS-DI record, which may link to the Biostudies record, which links to the European PMC record, where you may (or may not) find the downloadable supplemental files.
These records for publication supplemental data make up ~1.4M of the ~2.2M OMICS-DI records.
To be determined: Do we want to continue to ingest and include all of OMICS-DI, or do we want to filter OMICS-DI such that only data that is not supplemental data from publications is included?
Pros of ingesting as is:
PMC supplemental data is too messy to ingest directly. The supplemental data ingested via Biostudies into OMICS-DI is presumably a curated subset whose metadata has been aligned with OMICS-DI and is expected to be within the scope of OMICS-DI
Helps users find OMICS data that would be otherwise buried in the literature or only in OMICS-DI
Keeps OMICS-DI intact
Cons of ingesting as is:
Adds a large number of records where data access is very convoluted (need to follow 2-3 links to get to it, if it's even available).
Can easily be mistaken for publication records instead of dataset records, since the metadata for the supplemental data is based on the publication record
Slows parsing and updating of OMICS-DI; skipping such records based on a base URL or identifier match could make both faster
To do:
Determine if the literature supplemental data ingested into OMICS-DI should be included
If it should NOT be included, update the parser to skip any record with biostudies-literature in the URL
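A minimal sketch of what the skip step in the second to-do item might look like, assuming each parsed OMICS-DI record is a dict. The field names "url" and "identifier", the function names, and the "s-epmc" identifier prefix (inferred from the example ID in this issue) are assumptions for illustration, not the actual parser's schema.

```python
from typing import Iterable, Iterator

# Assumption: literature records can be recognised either by this URL
# fragment or by an identifier prefix like the example ID s-epmc6182170.
LITERATURE_URL_MARKER = "biostudies-literature"
LITERATURE_ID_PREFIX = "s-epmc"


def is_literature_supplement(record: dict) -> bool:
    """Return True if a record looks like a Biostudies-literature entry."""
    # "url" and "identifier" are hypothetical field names.
    url = str(record.get("url") or "").lower()
    identifier = str(record.get("identifier") or "").lower()
    return (
        LITERATURE_URL_MARKER in url
        or identifier.startswith(LITERATURE_ID_PREFIX)
    )


def skip_literature_supplements(records: Iterable[dict]) -> Iterator[dict]:
    """Yield only records that are not literature supplemental data."""
    for record in records:
        if not is_literature_supplement(record):
            yield record
```

Matching on both the URL fragment and the identifier prefix is deliberately redundant, so a record is still skipped if one of the two fields is missing. If the decision is to keep these records, the same predicate could be used to flag them instead of dropping them.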
Issue Name
OMICS-DI literature supplemental data
Issue Example
https://data.niaid.nih.gov/resources?id=s-epmc6182170
Related WBS task
For internal use only. Assignee, please select the status of this issue
Status Description
No response