
Metadata and DOI registration at DataCite (previously da|ra)

Theresa Möller edited this page May 13, 2025 · 1 revision

Introduction

This document describes how the MDM uses the DataCite¹ API to publish its metadata and to register a DOI (Digital Object Identifier) for each version of its data packages and analysis packages. Until 2025, da|ra² was used to register DOIs, so da|ra may still be mentioned in some parts of the documentation. Since da|ra itself registered the DOIs with DataCite, such mentions can be read as referring to DataCite.

The VerbundFDB³ harvests the published metadata of data packages from da|ra in order to make it available in its data search. It has therefore introduced some additional restrictions to the da|ra metadata schema. These restrictions do not apply to analysis packages. Since the switch to DataCite DOI registration, the published metadata is currently not harvested.

This is an example of a registered data package at DataCite.

The following questions are answered in this document:

When does the MDM send metadata to DataCite? When does it register a DOI?

The MDM sends metadata of a data package or an analysis package to DataCite in case of the following events:

  1. Non-beta release of a new version of a project (version >= 1.0.0):
    In this case the metadata of a data package or an analysis package version is sent to DataCite for the first time (e.g. version 1.0.0 or 2.0.0). A new DOI (10.21249/DZHW:${projectId}:${version}) is therefore registered at DataCite. The release is triggered by clicking the release button in the project cockpit of the MDM.
  2. Re-release of an existing version of a project (version >= 1.0.0):
    In this case the metadata at DataCite is updated (overwritten). The release is triggered by clicking the release button in the project cockpit of the MDM using the same version number as before.
  3. Related publications of a data package or an analysis package have changed (via import of the Citavi database):
    In this case the metadata at DataCite is updated (overwritten) automatically.
  4. Project version (aka shadow copy) has been hidden in the MDM:
    In this case, the metadata of the data package or analysis package is updated (overwritten) at DataCite and the version is marked as "withdrawn" there. The delivered metadata remains available at DataCite, but the version is no longer available to public users of the MDM.
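
The DOI pattern and the version threshold from the list above can be sketched as follows. This is a minimal illustration, not the MDM implementation: the function names and the example project id `abc2020` are hypothetical, and it assumes that versions always have the form `major.minor.patch`.

```python
import re

# Prefix taken from the DOI pattern 10.21249/DZHW:${projectId}:${version}.
DOI_PREFIX = "10.21249/DZHW"


def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'major.minor.patch' string into integers (assumed format)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"not a major.minor.patch version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def is_non_beta(version: str) -> bool:
    """Only versions >= 1.0.0 are sent to DataCite."""
    return parse_version(version) >= (1, 0, 0)


def build_doi(project_id: str, version: str) -> str:
    """Compose the DOI for a released project version."""
    if not is_non_beta(version):
        raise ValueError(f"beta version {version} does not get a DOI")
    return f"{DOI_PREFIX}:{project_id}:{version}"


if __name__ == "__main__":
    print(build_doi("abc2020", "1.0.0"))
```

Comparing the parsed tuples rather than the raw strings avoids ordering mistakes such as treating `10.0.0` as smaller than `2.0.0`.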

Which metadata does the MDM send to DataCite?

In each of the above cases a complete JSON document is sent to DataCite. Depending on the project configuration either a data package is sent to DataCite or an analysis package.

Data Package

The following table describes the mapping from the MDM domain model to the DataCite schema (version 4.6). When da|ra was still used to register DOIs, there were additional mandatory fields required by VerbundFDB⁴. Since DataCite does not maintain a full mapping for VerbundFDB, the mandatory fields are no longer relevant here. For more information on the restrictions imposed by VerbundFDB, see the following section: Where can I find the specification of the additional restrictions introduced by the VerbundFDB?

| DataCite Sequence | DataCite Property | MDM property | DataCite restrictions |
| --- | --- | --- | --- |
| 1. | identifier | Composition from dataAcquisitionProject.masterId and dataAcquisitionProject.release.version | |
| 2 | creators | List of dataPackage.projectContributors[], additionally one entry for dataPackage.institutions | mandatory (1-n) |
| 2.1 | creators.name | dataPackage.projectContributors[].lastName, dataPackage.projectContributors[].firstName, dataPackage.projectContributors[].middleName | |
| 2.1.a | creators.nameType | Personal for persons, Organizational for institutions | |
| 2.2 | creators.givenName | dataPackage.projectContributors[].lastName, dataPackage.projectContributors[].firstName, dataPackage.projectContributors[].middleName | |
| 2.3 | creators.familyName | dataPackage.projectContributors[].lastName | |
| | creators.nameIdentifiers | List of identifiers of each creator | |
| 2.4 | creators.nameIdentifiers.nameIdentifier | dataPackage.projectContributors[].orcid | |
| 2.4.a | creators.nameIdentifiers.nameIdentifierScheme | static value ORCID | |
| 2.4.b | creators.nameIdentifiers.schemeUri | static value "https://orcid.org" | |
| | creators.affiliation | List of affiliations for each creator (only specified if there is only one institution in the data package) | |
| 2.5 | creators.affiliation.name | dataPackage.institutions[].de or dataPackage.institutions[].en | |
| | titles | List of dataPackage.title[] | mandatory (1-n) |
| 3. | titles.title | dataPackage.title[].de, dataPackage.title[].en | |
| | publisher | List with constant entry for DZHW | mandatory (1) |
| 4 | publisher.name | constant value "German Centre for Higher Education Research and Science Studies (DZHW)" | |
| 4.a | publisher.publisherIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 4.b | publisher.publisherIdentifierScheme | constant value "ROR" | |
| 4.c | publisher.schemeUri | constant value "https://ror.org/" | |
| 5 | publicationYear | dataAcquisitionProject.release.firstDate, if null: current year | mandatory (1) |
| | subjects | List with entries for each tag and ELSST tag in all languages | |
| 6. | subjects.subject | dataPackage.tags[] or dataPackage.elsstTags[].prefLabel | |
| 6.a | subjects.subjectScheme | constant value "CESSDA European Language Social Science Thesaurus (ELSST)" for all ELSST tags | |
| 6.b | subjects.schemeUri | constant value "https://thesauri.cessda.eu/elsst-4/en/" for all ELSST tags | |
| 6.c | subjects.valueUri | the value's URI at https://thesauri.cessda.eu/elsst-4/en/ | |
| | contributors | List of dataPackage.dataCurators[], additionally constant entry for DZHW | |
| 7.a | contributors.contributorType | DataCurator for persons, Distributor for DZHW entry | |
| 7.1 | contributors.name | for persons dataPackage.dataCurators[].lastName, dataPackage.dataCurators[].firstName, dataPackage.dataCurators[].middleName | |
| 7.1.a | contributors.nameType | Personal for persons, Organizational for institutions | |
| 7.2 | contributors.givenName | dataPackage.dataCurators[].lastName, dataPackage.dataCurators[].firstName, dataPackage.dataCurators[].middleName | |
| 7.3 | contributors.familyName | dataPackage.dataCurators[].lastName | |
| | contributors.nameIdentifiers | List of identifiers | |
| 7.4 | contributors.nameIdentifiers.nameIdentifier | dataPackage.dataCurators[].orcid | |
| 7.4.a | contributors.nameIdentifiers.nameIdentifierScheme | constant value ORCID | |
| 7.4.b | contributors.nameIdentifiers.schemeUri | constant value "https://orcid.org" | |
| | contributors.affiliation | all contributors get DZHW as their affiliation | |
| 7.5 | contributors.affiliation.name | constant value Deutsches Zentrum für Hochschul- und Wissenschaftsforschung (DZHW) | |
| 7.5.a | contributors.affiliation.affiliationIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 7.5.b | contributors.affiliation.affiliationIdentifierScheme | constant value ROR | |
| 7.5.c | contributors.affiliation.schemeUri | constant value "https://ror.org/" | |
| | dates | a list of release related dates (release date, embargo date, date of withdrawal), and field periods for all surveys | |
| 8. | dates.date | dataAcquisitionProject.release.firstDate, if null: current date, for surveys: survey.fieldPeriod.start and survey.fieldPeriod.end, format: YYYY-MM-DD | |
| 8.a | dates.dateType | Withdrawn for hidden projects (date: always current date), Available for released (date: release date) or pre-released projects (date: embargo date), Accepted for pre-released projects (date: release date), Collected for survey dates | |
| 8.b | dates.dateInformation | only available for hidden projects or embargo date, survey.title for survey dates | |
| | types | constant object | mandatory (1) |
| 10.a | types.resourceTypeGeneral | constant value Dataset | |
| | alternateIdentifiers | list of alternate identifiers | |
| 11. | alternateIdentifiers.alternateIdentifier | constant 1 for VerbundFDB, constant 2 for QDN | |
| 11.a | alternateIdentifiers.alternateIdentifierType | constant value VerbundFDB or QDN | |
| | relatedIdentifiers | list of DOI identifiers of previous or new versions of the data package | |
| 12. | relatedIdentifiers.relatedIdentifier | the DOI of the previous or new version | |
| 12.a | relatedIdentifiers.relatedIdentifierType | constant value "DOI" | |
| 12.b | relatedIdentifiers.relationType | constant value "isNewVersionOf" or "isPreviousVersionOf" | |
| 15. | version | dataAcquisitionProject.release.version | |
| | rightsList | list of rights | |
| 16. | rightsList.rights | constant value that indicates the necessity of application to access data | |
| | descriptions | list of data package and survey population descriptions | |
| 17. | descriptions.description | dataPackage.description or survey.population.description | |
| 17.a | descriptions.descriptionType | Abstract for dataPackage.description, Methods for survey.population.description | |
| | geoLocations | list of survey population locations | |
| 18.3 | geoLocations.geoLocationPlace | country name of survey.population.geographicCoverage | |
| | fundingReferences | list of dataPackage.sponsors | |
| 19.1 | fundingReferences.funderName | dataPackage.sponsors[].name | |
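
To illustrate how the creator and publisher rows of the table fit together, here is a minimal sketch that builds the corresponding part of the JSON document. It is not the MDM implementation: the `Person` class and the function names are hypothetical, and how the name parts are joined is an assumption. The table lists the same three source fields for `creators.name` and `creators.givenName`; the sketch assumes that `givenName` carries only the first and middle name. It also assumes one organizational creator entry per institution.

```python
from dataclasses import dataclass
from typing import Optional

# Constant publisher entry, values taken from rows 4 to 4.c of the table.
PUBLISHER = {
    "name": "German Centre for Higher Education Research and Science Studies (DZHW)",
    "publisherIdentifier": "https://ror.org/01n8j6z65",
    "publisherIdentifierScheme": "ROR",
    "schemeUri": "https://ror.org/",
}


@dataclass
class Person:
    """Hypothetical stand-in for an entry of dataPackage.projectContributors[]."""
    first_name: str
    last_name: str
    middle_name: Optional[str] = None
    orcid: Optional[str] = None


def creator_entry(person: Person) -> dict:
    """Map one person to a 'creators' entry (name formatting is an assumption)."""
    given = " ".join(p for p in (person.first_name, person.middle_name) if p)
    entry = {
        "name": f"{person.last_name}, {given}",
        "nameType": "Personal",
        "givenName": given,
        "familyName": person.last_name,
    }
    if person.orcid:
        entry["nameIdentifiers"] = [{
            "nameIdentifier": person.orcid,
            "nameIdentifierScheme": "ORCID",
            "schemeUri": "https://orcid.org",
        }]
    return entry


def build_creators(persons: list[Person], institutions: list[str]) -> list[dict]:
    """Persons first, then institutions as organizational creators."""
    creators = []
    for person in persons:
        entry = creator_entry(person)
        # The affiliation is only specified if there is exactly one institution.
        if len(institutions) == 1:
            entry["affiliation"] = [{"name": institutions[0]}]
        creators.append(entry)
    creators.extend({"name": name, "nameType": "Organizational"} for name in institutions)
    return creators
```

The resulting list and the `PUBLISHER` constant would be placed under the `creators` and `publisher` properties of the document sent to DataCite.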

Analysis Package

The following table describes the mapping from the MDM domain model to the DataCite schema (version 4.6).

| DataCite Sequence | DataCite Property | MDM property | DataCite restrictions |
| --- | --- | --- | --- |
| 1. | identifier | Composition from dataAcquisitionProject.masterId and dataAcquisitionProject.release.version | |
| 2 | creators | List of analysisPackage.authors[], additionally one entry for analysisPackage.institutions | mandatory (1-n) |
| 2.1 | creators.name | analysisPackage.authors[].lastName, analysisPackage.authors[].firstName, analysisPackage.authors[].middleName | |
| 2.1.a | creators.nameType | Personal for persons, Organizational for institutions | |
| 2.2 | creators.givenName | analysisPackage.authors[].lastName, analysisPackage.authors[].firstName, analysisPackage.authors[].middleName | |
| 2.3 | creators.familyName | analysisPackage.authors[].lastName | |
| | creators.nameIdentifiers | List of identifiers of each creator | |
| 2.4 | creators.nameIdentifiers.nameIdentifier | analysisPackage.authors[].orcid | |
| 2.4.a | creators.nameIdentifiers.nameIdentifierScheme | static value ORCID | |
| 2.4.b | creators.nameIdentifiers.schemeUri | static value "https://orcid.org" | |
| | creators.affiliation | List of affiliations for each creator (only specified if there is only one institution in the analysis package) | |
| 2.5 | creators.affiliation.name | analysisPackage.institutions[].de or analysisPackage.institutions[].en | |
| | titles | List of analysisPackage.title[] | mandatory (1-n) |
| 3. | titles.title | analysisPackage.title[].de, analysisPackage.title[].en | |
| | publisher | List with constant entry for DZHW | mandatory (1) |
| 4 | publisher.name | constant value "German Centre for Higher Education Research and Science Studies (DZHW)" | |
| 4.a | publisher.publisherIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 4.b | publisher.publisherIdentifierScheme | constant value "ROR" | |
| 4.c | publisher.schemeUri | constant value "https://ror.org/" | |
| 5 | publicationYear | dataAcquisitionProject.release.firstDate, if null: current year | mandatory (1) |
| | subjects | List with entries for each tag and ELSST tag in all languages | |
| 6. | subjects.subject | analysisPackage.tags[] or analysisPackage.elsstTags[].prefLabel | |
| 6.a | subjects.subjectScheme | constant value "CESSDA European Language Social Science Thesaurus (ELSST)" for all ELSST tags | |
| 6.b | subjects.schemeUri | constant value "https://thesauri.cessda.eu/elsst-4/en/" for all ELSST tags | |
| 6.c | subjects.valueUri | the value's URI at https://thesauri.cessda.eu/elsst-4/en/ | |
| | contributors | List of analysisPackage.dataCurators[], additionally constant entry for DZHW | |
| 7.a | contributors.contributorType | DataCurator for persons, Distributor for DZHW entry | |
| 7.1 | contributors.name | for persons analysisPackage.dataCurators[].lastName, analysisPackage.dataCurators[].firstName, analysisPackage.dataCurators[].middleName | |
| 7.1.a | contributors.nameType | Personal for persons, Organizational for institutions | |
| 7.2 | contributors.givenName | analysisPackage.dataCurators[].lastName, analysisPackage.dataCurators[].firstName, analysisPackage.dataCurators[].middleName | |
| 7.3 | contributors.familyName | analysisPackage.dataCurators[].lastName | |
| | contributors.nameIdentifiers | List of identifiers | |
| 7.4 | contributors.nameIdentifiers.nameIdentifier | analysisPackage.dataCurators[].orcid | |
| 7.4.a | contributors.nameIdentifiers.nameIdentifierScheme | constant value ORCID | |
| 7.4.b | contributors.nameIdentifiers.schemeUri | constant value "https://orcid.org" | |
| | contributors.affiliation | all contributors get DZHW as their affiliation | |
| 7.5 | contributors.affiliation.name | constant value Deutsches Zentrum für Hochschul- und Wissenschaftsforschung (DZHW) | |
| 7.5.a | contributors.affiliation.affiliationIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 7.5.b | contributors.affiliation.affiliationIdentifierScheme | constant value ROR | |
| 7.5.c | contributors.affiliation.schemeUri | constant value "https://ror.org/" | |
| | dates | a list of release related dates (release date, embargo date, date of withdrawal) | |
| 8. | dates.date | dataAcquisitionProject.release.firstDate, if null: current date, format: YYYY-MM-DD | |
| 8.a | dates.dateType | Withdrawn for hidden projects (date: always current date), Available for released (date: release date) or pre-released projects (date: embargo date), Accepted for pre-released projects (date: release date) | |
| 8.b | dates.dateInformation | only available for hidden projects or embargo date | |
| | types | constant object | mandatory (1) |
| 10.a | types.resourceTypeGeneral | constant value Dataset | |
| | relatedIdentifiers | list of DOI identifiers of previous or new versions of the analysis package | |
| 12. | relatedIdentifiers.relatedIdentifier | the DOI of the previous or new version | |
| 12.a | relatedIdentifiers.relatedIdentifierType | constant value "DOI" | |
| 12.b | relatedIdentifiers.relationType | constant value "isNewVersionOf" or "isPreviousVersionOf" | |
| 15. | version | dataAcquisitionProject.release.version | |
| | rightsList | list of rights | |
| 16. | rightsList.rights | constant value that indicates the necessity of application to access data | |
| | descriptions | list of analysis package descriptions | |
| 17. | descriptions.description | analysisPackage.description | |
| 17.a | descriptions.descriptionType | Abstract for analysisPackage.description | |
| | fundingReferences | list of analysisPackage.sponsors | |
| 19.1 | fundingReferences.funderName | analysisPackage.sponsors[].name | |
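
The `dates.dateType` rules for analysis packages (rows 8 to 8.a) can be read as a small decision function. The following sketch is one possible reading, not the MDM implementation: the function name and parameters are hypothetical, it assumes that a project counts as pre-released whenever an embargo date is set, that a hidden project only receives a Withdrawn entry, and it leaves out `dateInformation`.

```python
from datetime import date
from typing import Optional


def date_entries(
    release_date: Optional[date],
    embargo_date: Optional[date] = None,
    hidden: bool = False,
    today: Optional[date] = None,
) -> list[dict]:
    """Build the 'dates' list of an analysis package (assumed reading of row 8.a)."""
    today = today or date.today()
    # "if null: current date"
    effective_release = release_date or today

    if hidden:
        # Hidden projects: Withdrawn, always with the current date.
        return [{"date": today.isoformat(), "dateType": "Withdrawn"}]

    if embargo_date is not None:
        # Pre-released projects: available from the embargo date,
        # accepted on the release date.
        return [
            {"date": embargo_date.isoformat(), "dateType": "Available"},
            {"date": effective_release.isoformat(), "dateType": "Accepted"},
        ]

    # Regular release: available from the release date.
    return [{"date": effective_release.isoformat(), "dateType": "Available"}]
```

`date.isoformat()` yields the YYYY-MM-DD format required by the table. For data packages, the survey field periods with the Collected type would be appended to this list.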

Where can I find the current metadata schema of DataCite?

The metadata JSON schema is best described here. It also includes examples and controlled vocabulary.

The documentation of the DataCite API can be found here.

Where can I find the specification of the additional restrictions introduced by the VerbundFDB?

The additional restrictions introduced by the VerbundFDB (aka Kernset) are described here.


¹ DataCite is an international consortium assigning persistent identifiers to data sets. It organizes the administration of DOI prefixes and the connection to the International DOI Foundation (IDF).
² da|ra is a DOI registration agency for social and economic data in Germany. It is connected to DataCite, which organizes the administration of prefixes and the connection to the International DOI Foundation (IDF).
³ The VerbundFDB is a research data infrastructure for empirical educational research collecting and sharing research data and information. The FDZ-DZHW is a network partner within the VerbundFDB.
⁴ If both da|ra restrictions and VerbundFDB restrictions have "---", this means we send these attributes voluntarily and/or the attributes are marked "optional" by the VerbundFDB Kernset.
