Open
Description
Create a script that parses Bioschemas content.
Some of our content providers mark up their event content using the Bioschemas specifications. Bioschemas markup can be represented in JSON-LD, RDFa, or Microdata. Focus on JSON-LD for this exercise; if you have time later, you could explore the others, but no worries if not.
The Bioschemas Event specification is represented in YAML format:
https://github.com/BioSchemas/specifications/blob/master/Event/specification.html
Write a program that:
- Parses this file and takes everything under the properties: key in the YAML specification.
- Goes through and collects each property name.
- Downloads the schema.org spec for each of the expected types, e.g. if expected_type includes PostalAddress, you need to parse schema.org/PostalAddress.jsonld to get the properties of this subtype.
- Downloads a target Bioschemas web page.
- Parses the JSON-LD, maybe using a parser such as this: https://www.npmjs.com/package/@rdfjs/parser-jsonld
- Extracts all the properties that match the ones you've collected from the YAML.
- Extracts any sub-properties.
- Pushes these properties to TeSS using the TeSS API Client.
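The parsing and extraction steps could be sketched roughly as follows. This is only an illustration in Python using the standard library, not the npm parser suggested above. The names `JsonLdExtractor`, `extract_jsonld`, `pick_properties`, `WANTED`, `SUB_WANTED`, and the sample page are all hypothetical. The two sets are hand-written stand-ins for the names that would be collected from the YAML specification and from schema.org. Real pages may use `@graph`, arrays, or other JSON-LD forms that this sketch does not handle, so a proper JSON-LD parser is likely the better choice in practice.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed content of every <script type="application/ld+json">."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append(json.loads("".join(self._buffer)))


def extract_jsonld(html):
    """Return a list of parsed JSON-LD documents found in the HTML text."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.blocks


def pick_properties(node, wanted, sub_wanted):
    """Keep the keys of `node` named in `wanted`.

    For a nested object, keep only the keys listed in `sub_wanted`
    for that object's @type (nothing if the type is unknown).
    """
    result = {}
    for key, value in node.items():
        if key not in wanted:
            continue
        if isinstance(value, dict):
            allowed = sub_wanted.get(value.get("@type"), set())
            value = {k: v for k, v in value.items() if k in allowed}
        result[key] = value
    return result


# Hypothetical stand-ins for the names collected from the specification.
WANTED = {"name", "startDate", "location"}
SUB_WANTED = {"PostalAddress": {"addressCountry"}}

# Hypothetical page content, for illustration only.
SAMPLE_HTML = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Event", "name": "Intro course",
 "startDate": "2024-01-01", "internalId": 7,
 "location": {"@type": "PostalAddress", "addressCountry": "UK", "note": "x"}}
</script>
</head></html>
"""

events = [pick_properties(doc, WANTED, SUB_WANTED) for doc in extract_jsonld(SAMPLE_HTML)]
```

Here `events` holds one dictionary per JSON-LD block, reduced to the wanted properties and sub-properties. Pushing the result to TeSS is left out, since the client's interface is not described in this issue.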
Whilst implementing, think about how to make this as reusable as possible, e.g. the developer should only have to change the URL of the target page to run it elsewhere.
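One way to get that kind of reusability is to make the target URL the only required input, for instance as a command-line argument. The sketch below is a minimal Python illustration under that assumption. The names `fetch_page`, `build_arg_parser`, and `main` are hypothetical, and the parsing and TeSS upload steps are deliberately left as a comment.

```python
import argparse
import urllib.request


def fetch_page(url, timeout=30):
    """Download a page and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


def build_arg_parser():
    """The target URL is the only thing a developer has to supply."""
    parser = argparse.ArgumentParser(
        description="Scrape Bioschemas Event markup from a web page."
    )
    parser.add_argument("url", help="page containing Bioschemas JSON-LD")
    return parser


def main(argv=None, fetch=fetch_page):
    """Entry point; `fetch` is injectable so the flow can run without a network."""
    args = build_arg_parser().parse_args(argv)
    html = fetch(args.url)
    # The JSON-LD parsing, property extraction, and TeSS upload would go here.
    print(f"Downloaded {len(html)} characters from {args.url}")
    return 0
```

Calling `main()` from a `__main__` guard would let the same script be pointed at a different provider just by changing the argument. Passing `fetch` as a parameter is a design choice that keeps the download step easy to replace in tests.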
Some target websites to test it against: