|
| 1 | +<center><h1>A Minimal Metadata Standard for ESIP Software in JSON-LD</h1></center> |
| | + |
| | +### Project Description |
| 2 | + |
| 3 | +**Introduction** |
| 4 | +The discovery and meaningful reuse of research software depends, in part, on accurate metadata. There are many community standards for describing the contents of a software package, a codebase, or a library, but they often differ substantially in which attributes they describe and in how much detail. These differences make it difficult to surface potentially valuable software through APIs or search engines. |
| 5 | + |
| 6 | +Research in a number of domains also shows that a lack of time, incentive, and training in the creation of metadata hampers progress in sharing and reusing research objects [e.g. Edwards et al., 2011]. |
| 7 | + |
| 8 | +This project proposes to explore an emerging "minimal metadata" standard for science software in JSON-LD, which offers a simple, core set of attributes for uniformly describing scientific software in a lightweight semantic format. The JSON-LD minimal standard is intended to ease the burden of creating metadata and to improve the discoverability of software. |
| 9 | + |
| 10 | +This proposal builds on the "code as a research object" project by Mozilla Science, GitHub, and Figshare, and directly contributes to the [ongoing work](https://github.com/mbjones/codemeta) of Matt Jones at NESCENT. |
| 11 | + |
| 12 | +This project will contribute to the development of this standard through: |
| 13 | + |
| 14 | +1. A crosswalk exercise that maps attribute-value pairs across many existing standards. As the project matures, these categories will converge on a minimum set. |
| 15 | + |
| 16 | +2. Use cases from the ESIP community. |
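
The crosswalk in item 1 could be sketched roughly as follows. This is only an illustration under stated assumptions: the `CROSSWALK` table, the chosen core attribute names, and the helper functions `to_core` and `shared_core` are hypothetical and do not represent an agreed mapping from DOAP, schema.org, or Dublin Core.

```python
# Hypothetical crosswalk table: each core attribute maps to the field name
# used by a source standard. The selection is illustrative, not an agreed
# mapping, and a standard is simply omitted where no equivalent is listed.
CROSSWALK = {
    "name": {"doap": "name", "schema_org": "name", "dublin_core": "title"},
    "description": {
        "doap": "description",
        "schema_org": "description",
        "dublin_core": "description",
    },
    "license": {"doap": "license", "schema_org": "license", "dublin_core": "rights"},
    "repository": {"doap": "repository", "schema_org": "codeRepository"},
}


def to_core(record, standard):
    """Translate a record written in one source standard into core attributes."""
    core = {}
    for core_name, sources in CROSSWALK.items():
        source_name = sources.get(standard)
        if source_name is not None and source_name in record:
            core[core_name] = record[source_name]
    return core


def shared_core(standards):
    """Return the core attributes that every listed standard can express."""
    return sorted(
        core_name
        for core_name, sources in CROSSWALK.items()
        if all(standard in sources for standard in standards)
    )


if __name__ == "__main__":
    dc_record = {"title": "example-model", "rights": "MIT"}
    print(to_core(dc_record, "dublin_core"))
    print(shared_core(["doap", "schema_org", "dublin_core"]))
```

In this sketch, the attributes returned by `shared_core` are one possible starting point for the "minimum set" the crosswalk is meant to converge on.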
| 17 | + |
| 18 | +Developing a descriptive metadata standard is a first step toward a robust network of software archives. Tools like [Fidgit](https://github.com/openjournals/fidgit), which are developed to lower the barrier to archiving code and obtaining a persistent identifier for it, can then be leveraged by this network. |
| 19 | + |
| 20 | + |
| 21 | +**Some questions and some (preliminary) answers:** *Doesn't the DOAP project propose to do exactly this?* |
| 22 | +The Description Of A Project (DOAP) ontology is very relevant to this project, but in short: no. |
| 23 | +[DOAP](https://github.com/edumbill/doap) proposes an "XML/RDF vocabulary to describe software projects, and in particular open source projects". Here, we're interested in exploring a number of different standards, including DOAP, and finding a minimal or 'core' set of attribute-value pairs (in the [Dublin Core](http://dublincore.org/documents/dces/) / [Darwin Core](http://rs.tdwg.org/dwc/) sense) for describing scientific software, notably Earth System Science software that is developed and used in the ESIP community. |
| 24 | +There are two limitations to depending solely on DOAP for this: |
| 25 | +1. It is aimed at RDF, and, well, RDF gets complicated quickly. We want to do something lightweight and easily adoptable. |
| 26 | +2. Part of the ambition in using JSON-LD is that metadata creation can be automated in the future... and that future seems a lot more distant in an RDF / DOAP world. |
| 27 | +*Why JSON-LD?* |
| 28 | + |
| 29 | +[JSON-LD](http://json-ld.org/) is a lightweight format that offers semantic meaning at a substantially lower barrier of creation than RDF. It has emerged over the last 2-3 years as the standard for serving data via APIs and, more importantly for our work here, it can leverage existing ontologies, like 'creative works' in [schema.org](http://schema.org/Code), for describing a codebase. |
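
As a rough illustration of what such a description might look like, the sketch below builds a small JSON-LD record with Python's standard `json` module. The attribute selection is an assumption, not the proposed minimal set. The sketch uses the schema.org `SoftwareSourceCode` type rather than the `Code` page linked above, and the function name `describe_software` and all example values are hypothetical placeholders.

```python
import json


def describe_software(name, description, repository_url, language, license_url, author_name):
    """Build a JSON-LD description of a codebase using schema.org terms.

    The choice of attributes is an illustrative assumption, not the
    minimal set this proposal aims to define.
    """
    return {
        "@context": "http://schema.org",
        "@type": "SoftwareSourceCode",
        "name": name,
        "description": description,
        "codeRepository": repository_url,
        "programmingLanguage": language,
        "license": license_url,
        "author": {"@type": "Person", "name": author_name},
    }


# Placeholder values for illustration only.
record = describe_software(
    name="example-model",
    description="A placeholder description of an Earth science code.",
    repository_url="https://example.org/example-model.git",
    language="Python",
    license_url="https://example.org/license",
    author_name="A. Researcher",
)

if __name__ == "__main__":
    print(json.dumps(record, indent=2))
```

Because the record is plain JSON plus a `@context`, it can be produced by ordinary scripts, which is part of the appeal for automating metadata creation later.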
| 30 | +*Why not XML?* |
| 31 | +XML can play too. |
| 32 | + |
| 33 | + |
| 34 | +### Project Plan and Timeline |
| 35 | + |
| 36 | + |
| 37 | +The major work on the use cases will focus on the following: |
| 38 | + |
| 39 | +1. Testing and exploring the use of different subject categories (i.e., we believe we can do better than the original proposal for using [PLoS's taxonomy](http://www.plosone.org/taxonomy) of academic subjects). |
| 40 | + |
| 41 | +2. Developing guidelines and suggested practices for describing the function of the software. |
| 42 | + |
| 43 | + |
| 44 | + |
| 45 | +### Names and Roles of Team Members |
| 46 | + |
| 47 | +| Name | Role | Description | |
| 48 | +|-----------|-----------|------------| |
| 49 | +| Nic Weber | Role | PhD candidate in Information Science at University of Illinois. Experience in developing standards and policy for Data Conservancy, and linked data applications. | |
| 50 | +| You | Your Role | Your qualifications | |
| 51 | +### Intended Outcomes and Contribution to ESIP (and broader) Community |
| 52 | +1. Our initial short-term work will complete a community scan and crosswalk of existing standards. We'll build on the existing work from Mozilla Science and contribute back to the community that is driving this work. |
| 53 | +2. A set of ESIP community use cases. |
| 54 | +3. A white paper recommendation [?] |
| 55 | +4. A formal publication, in an open access journal, that describes the use cases in detail, as well as the progress that we've made. |
| 56 | +### Works Cited |
| 57 | +Edwards, P., Mayernik, M. S., Batcheller, A., Bowker, G., & Borgman, C. (2011). Science friction: Data, metadata, and collaboration. Social Studies of Science, DOI: [10.1177/0306312711413314](http://sss.sagepub.com/content/early/2011/08/13/0306312711413314.abstract) |
|