Metadata and DOI registration at DataCite (previously da|ra)
This document describes how the MDM uses the DataCite<sup>1</sup> API to publish its metadata and to register a DOI (Digital Object Identifier) for each version of its data packages and analysis packages. Until 2025, da|ra<sup>2</sup> was used to register DOIs, which is why da|ra may still be mentioned in some parts of the documentation. Because da|ra itself registered its DOIs with DataCite, such mentions can be read as referring to DataCite.
The VerbundFDB<sup>3</sup> harvested the published metadata of data packages from da|ra in order to make it available in its data search. It therefore introduced some additional restrictions to the da|ra metadata schema. These restrictions do not apply to analysis packages. Since the switch to DataCite DOI registration, the published metadata is currently not harvested.
This is an example of a registered data package at DataCite.
The following questions are answered in this document:
- When does the MDM send metadata to DataCite? When does it register a DOI?
- Which metadata does the MDM send to DataCite?
- Where can I find the current metadata schema of DataCite?
- Where can I find the specification of the additional restrictions introduced by the VerbundFDB?
The MDM sends the metadata of a data package or an analysis package to DataCite when one of the following events occurs:
- Non-beta release of a new version of a project (version >= 1.0.0):
  In this case the metadata of a data package or an analysis package version is sent to DataCite for the first time (e.g. version 1.0.0 or 2.0.0). A new DOI (`10.21249/DZHW:${projectId}:${version}`) is therefore registered at DataCite. The release is triggered by clicking the release button in the project cockpit of the MDM.
- Re-release of an existing version of a project (version >= 1.0.0):
  In this case the metadata at DataCite is updated (overwritten). The release is triggered by clicking the release button in the project cockpit of the MDM using the same version number as before.
- Related publications of a data package or an analysis package have changed (via import of the Citavi database):
  In this case the metadata at DataCite is updated (overwritten) automatically.
- Project version (aka shadow copy) has been hidden in the MDM:
  In this case the data package or analysis package is updated (overwritten) at DataCite, which means it is marked as "withdrawn" there. The delivered metadata remains available at DataCite, but the version is no longer available to public users of the MDM.
In each of the above cases a complete JSON document is sent to DataCite. Depending on the project configuration either a data package is sent to DataCite or an analysis package.
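The release rule and the DOI pattern described above can be sketched in a few lines of Python. This is an illustration only and is not taken from the MDM code base: the function names and the project ID `exampleproject` are hypothetical, and the sketch assumes that versions are plain `MAJOR.MINOR.PATCH` strings.

```python
DOI_PREFIX = "10.21249/DZHW"


def is_non_beta(version: str) -> bool:
    """Return True for versions >= 1.0.0 (assumes a MAJOR.MINOR.PATCH string)."""
    major = int(version.split(".")[0])
    return major >= 1


def build_doi(project_id: str, version: str) -> str:
    """Compose a DOI following the pattern 10.21249/DZHW:${projectId}:${version}."""
    if not is_non_beta(version):
        # Beta versions (< 1.0.0) do not get a DOI according to the rules above.
        raise ValueError(f"no DOI is registered for beta version {version}")
    return f"{DOI_PREFIX}:{project_id}:{version}"


# "exampleproject" is a made-up project ID used only for illustration.
print(build_doi("exampleproject", "1.0.0"))
```

A re-release with the same version number would yield the same DOI, which matches the description that the metadata at DataCite is overwritten rather than registered again.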
The following table describes the mapping from the MDM domain model of a data package to the DataCite schema (version 4.6). When da|ra was still used to register DOIs, there were additional mandatory fields required by the VerbundFDB<sup>4</sup>. Since DataCite does not maintain a full mapping for VerbundFDB, the mandatory fields are no longer relevant here. For more information on the restrictions imposed by VerbundFDB, see the section "Where can I find the specification of the additional restrictions introduced by the VerbundFDB?".
| DataCite Sequence | DataCite Property | MDM property | DataCite restrictions |
|---|---|---|---|
| 1 | identifier | Composition from `dataAcquisitionProject.masterId` and `dataAcquisitionProject.release.version` | |
| 2 | creator | List of `dataPackage.projectContributors[]`, additionally one entry for `dataPackage.institutions` | mandatory (1-n) |
| 2.1 | creators.name | `dataPackage.projectContributors[].lastName`, `dataPackage.projectContributors[].firstName`, `dataPackage.projectContributors[].middleName` | |
| 2.1.a | creators.nameType | `Personal` for persons, `Organizational` for institutions | |
| 2.2 | creators.givenName | `dataPackage.projectContributors[].lastName`, `dataPackage.projectContributors[].firstName`, `dataPackage.projectContributors[].middleName` | |
| 2.3 | creators.familyName | `dataPackage.projectContributors[].lastName` | |
| | creators.nameIdentifiers | List of identifiers of each creator | |
| 2.4 | creators.nameIdentifiers.nameIdentifier | `dataPackage.projectContributors[].orcid` | |
| 2.4.a | creators.nameIdentifiers.nameIdentifierScheme | static value `ORCID` | |
| 2.4.b | creators.nameIdentifiers.schemeUri | static value "https://orcid.org" | |
| | creators.affiliation | List of affiliations for each creator (only specified if there is only one institution in the data package) | |
| 2.5 | creators.affiliation.name | `dataPackage.institutions[].de` or `dataPackage.institutions[].en` | |
| | titles | List of `dataPackage.title[]` | mandatory (1-n) |
| 3 | titles.title | `dataPackage.title[].de`, `dataPackage.title[].en` | |
| | publisher | List with constant entry for DZHW | mandatory (1) |
| 4 | publisher.name | constant value "German Centre for Higher Education Research and Science Studies (DZHW)" | |
| 4.a | publisher.publisherIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 4.b | publisher.publisherIdentifierScheme | constant value "ROR" | |
| 4.c | publisher.schemeUri | constant value "https://ror.org/" | |
| 5 | publicationYear | `dataAcquisitionProject.release.firstDate`, if null: current year | mandatory (1) |
| | subjects | List with entries for each tag and ELSST tag in all languages | |
| 6 | subjects.subject | `dataPackage.tags[]` or `dataPackage.elsstTags[].prefLabel` | |
| 6.a | subjects.subjectScheme | constant value "CESSDA European Language Social Science Thesaurus (ELSST)" for all ELSST tags | |
| 6.b | subjects.schemeUri | constant value "https://thesauri.cessda.eu/elsst-4/en/" for all ELSST tags | |
| 6.c | subjects.valueUri | the value's URI at https://thesauri.cessda.eu/elsst-4/en/ | |
| | contributors | List of `dataPackage.dataCurators[]`, additionally constant entry for DZHW | |
| 7.a | contributors.contributorType | `DataCurator` for persons, `Distributor` for DZHW entry | |
| 7.1 | contributors.name | for persons `dataPackage.dataCurators[].lastName`, `dataPackage.dataCurators[].firstName`, `dataPackage.dataCurators[].middleName` | |
| 7.1.a | contributors.nameType | `Personal` for persons, `Organizational` for institutions | |
| 7.2 | contributors.givenName | `dataPackage.dataCurators[].lastName`, `dataPackage.dataCurators[].firstName`, `dataPackage.dataCurators[].middleName` | |
| 7.3 | contributors.familyName | `dataPackage.dataCurators[].lastName` | |
| | contributors.nameIdentifiers | List of identifiers | |
| 7.4 | contributors.nameIdentifiers.nameIdentifier | `dataPackage.dataCurators[].orcid` | |
| 7.4.a | contributors.nameIdentifiers.nameIdentifierScheme | constant value `ORCID` | |
| 7.4.b | contributors.nameIdentifiers.schemeUri | constant value "https://orcid.org" | |
| | contributors.affiliation | all contributors get DZHW as their affiliation | |
| 7.5 | contributors.affiliation.name | constant value "Deutsches Zentrum für Hochschul- und Wissenschaftsforschung (DZHW)" | |
| 7.5.a | contributors.affiliation.affiliationIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 7.5.b | contributors.affiliation.affiliationIdentifierScheme | constant value `ROR` | |
| 7.5.c | contributors.affiliation.schemeUri | constant value "https://ror.org/" | |
| | dates | a list of release-related dates (release date, embargo date, date of withdrawal) and field periods for all surveys | |
| 8 | dates.date | `dataAcquisitionProject.release.firstDate`, if null: current date; for surveys: `survey.fieldPeriod.start` and `survey.fieldPeriod.end`; format: YYYY-MM-DD | |
| 8.a | dates.dateType | `Withdrawn` for hidden projects (date: always current date), `Available` for released (date: release date) or pre-released projects (date: embargo date), `Accepted` for pre-released projects (date: release date), `Collected` for survey dates | |
| 8.b | dates.dateInformation | only available for hidden projects or embargo date, `survey.title` for survey dates | |
| | types | constant object | mandatory (1) |
| 10.a | types.resourceTypeGeneral | constant value `Dataset` | |
| | alternateIdentifiers | list of alternate identifiers | |
| 11 | alternateIdentifiers.alternateIdentifier | constant 1 for VerbundFDB, constant 2 for QDN | |
| 11.a | alternateIdentifiers.alternateIdentifierType | constant value `VerbundFDB` or `QDN` | |
| | relatedIdentifiers | list of DOI identifiers of previous or new versions of the data package | |
| 12 | relatedIdentifiers.relatedIdentifier | the DOI of the previous or new version | |
| 12.a | relatedIdentifiers.relatedIdentifierType | constant value "DOI" | |
| 12.b | relatedIdentifiers.relationType | constant value "isNewVersionOf" or "isPreviousVersionOf" | |
| 15 | version | `dataAcquisitionProject.release.version` | |
| | rightsList | list of rights | |
| 16 | rightsList.rights | constant value that indicates the necessity of application to access data | |
| | descriptions | list of data package and survey population descriptions | |
| 17 | descriptions.description | `dataPackage.description` or `survey.population.description` | |
| 17.a | descriptions.descriptionType | `Abstract` for `dataPackage.description`, `Methods` for `survey.population.description` | |
| | geoLocations | list of survey population locations | |
| 18.3 | geoLocations.geoLocationPlace | country name of `survey.population.geographicCoverage` | |
| | fundingReferences | list of `dataPackage.sponsors` | |
| 19.1 | fundingReferences.funderName | `dataPackage.sponsors[].name` | |
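To make the creator rows (2.1 to 2.5) of the mapping more concrete, the following Python sketch builds one DataCite creator entry from a project contributor. It is a minimal illustration under stated assumptions, not the MDM implementation: the function name `to_creator`, the plain-dict input, the "lastName, firstName middleName" name format, and the reading of `givenName` as first plus middle name are all assumptions, and the example person and ORCID are made up.

```python
from typing import Optional


def to_creator(contributor: dict, affiliation: Optional[str] = None) -> dict:
    """Map one project contributor to a DataCite creator entry (illustrative)."""
    # Assumption: givenName consists of first and middle name, if present.
    given = " ".join(
        part
        for part in (contributor.get("firstName"), contributor.get("middleName"))
        if part
    )
    family = contributor["lastName"]
    creator = {
        # Assumption: the full name is formatted as "family, given".
        "name": f"{family}, {given}" if given else family,
        "nameType": "Personal",
        "givenName": given,
        "familyName": family,
    }
    if contributor.get("orcid"):
        creator["nameIdentifiers"] = [{
            "nameIdentifier": contributor["orcid"],
            "nameIdentifierScheme": "ORCID",
            "schemeUri": "https://orcid.org",
        }]
    # Per the table, an affiliation is only given if there is exactly one institution.
    if affiliation:
        creator["affiliation"] = [{"name": affiliation}]
    return creator


# Made-up contributor, for illustration only.
example = to_creator(
    {"firstName": "Jane", "middleName": "Q.", "lastName": "Doe",
     "orcid": "0000-0000-0000-0000"},
    affiliation="Example Institute",
)
print(example["name"])
```

The same shape would apply to the contributor rows (7.1 to 7.5), with `contributorType` added and the constant DZHW affiliation from the table.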
The following table describes the mapping from the MDM domain model of an analysis package to the DataCite schema (version 4.6).
| DataCite Sequence | DataCite Property | MDM property | DataCite restrictions |
|---|---|---|---|
| 1 | identifier | Composition from `dataAcquisitionProject.masterId` and `dataAcquisitionProject.release.version` | |
| 2 | creator | List of `analysisPackage.authors[]`, additionally one entry for `analysisPackage.institutions` | mandatory (1-n) |
| 2.1 | creators.name | `analysisPackage.authors[].lastName`, `analysisPackage.authors[].firstName`, `analysisPackage.authors[].middleName` | |
| 2.1.a | creators.nameType | `Personal` for persons, `Organizational` for institutions | |
| 2.2 | creators.givenName | `analysisPackage.authors[].lastName`, `analysisPackage.authors[].firstName`, `analysisPackage.authors[].middleName` | |
| 2.3 | creators.familyName | `analysisPackage.authors[].lastName` | |
| | creators.nameIdentifiers | List of identifiers of each creator | |
| 2.4 | creators.nameIdentifiers.nameIdentifier | `analysisPackage.authors[].orcid` | |
| 2.4.a | creators.nameIdentifiers.nameIdentifierScheme | static value `ORCID` | |
| 2.4.b | creators.nameIdentifiers.schemeUri | static value "https://orcid.org" | |
| | creators.affiliation | List of affiliations for each creator (only specified if there is only one institution in the analysis package) | |
| 2.5 | creators.affiliation.name | `analysisPackage.institutions[].de` or `analysisPackage.institutions[].en` | |
| | titles | List of `analysisPackage.title[]` | mandatory (1-n) |
| 3 | titles.title | `analysisPackage.title[].de`, `analysisPackage.title[].en` | |
| | publisher | List with constant entry for DZHW | mandatory (1) |
| 4 | publisher.name | constant value "German Centre for Higher Education Research and Science Studies (DZHW)" | |
| 4.a | publisher.publisherIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 4.b | publisher.publisherIdentifierScheme | constant value "ROR" | |
| 4.c | publisher.schemeUri | constant value "https://ror.org/" | |
| 5 | publicationYear | `dataAcquisitionProject.release.firstDate`, if null: current year | mandatory (1) |
| | subjects | List with entries for each tag and ELSST tag in all languages | |
| 6 | subjects.subject | `analysisPackage.tags[]` or `analysisPackage.elsstTags[].prefLabel` | |
| 6.a | subjects.subjectScheme | constant value "CESSDA European Language Social Science Thesaurus (ELSST)" for all ELSST tags | |
| 6.b | subjects.schemeUri | constant value "https://thesauri.cessda.eu/elsst-4/en/" for all ELSST tags | |
| 6.c | subjects.valueUri | the value's URI at https://thesauri.cessda.eu/elsst-4/en/ | |
| | contributors | List of `analysisPackage.dataCurators[]`, additionally constant entry for DZHW | |
| 7.a | contributors.contributorType | `DataCurator` for persons, `Distributor` for DZHW entry | |
| 7.1 | contributors.name | for persons `analysisPackage.dataCurators[].lastName`, `analysisPackage.dataCurators[].firstName`, `analysisPackage.dataCurators[].middleName` | |
| 7.1.a | contributors.nameType | `Personal` for persons, `Organizational` for institutions | |
| 7.2 | contributors.givenName | `analysisPackage.dataCurators[].lastName`, `analysisPackage.dataCurators[].firstName`, `analysisPackage.dataCurators[].middleName` | |
| 7.3 | contributors.familyName | `analysisPackage.dataCurators[].lastName` | |
| | contributors.nameIdentifiers | List of identifiers | |
| 7.4 | contributors.nameIdentifiers.nameIdentifier | `analysisPackage.dataCurators[].orcid` | |
| 7.4.a | contributors.nameIdentifiers.nameIdentifierScheme | constant value `ORCID` | |
| 7.4.b | contributors.nameIdentifiers.schemeUri | constant value "https://orcid.org" | |
| | contributors.affiliation | all contributors get DZHW as their affiliation | |
| 7.5 | contributors.affiliation.name | constant value "Deutsches Zentrum für Hochschul- und Wissenschaftsforschung (DZHW)" | |
| 7.5.a | contributors.affiliation.affiliationIdentifier | constant value "https://ror.org/01n8j6z65" | |
| 7.5.b | contributors.affiliation.affiliationIdentifierScheme | constant value `ROR` | |
| 7.5.c | contributors.affiliation.schemeUri | constant value "https://ror.org/" | |
| | dates | a list of release-related dates (release date, embargo date, date of withdrawal) | |
| 8 | dates.date | `dataAcquisitionProject.release.firstDate`, if null: current date; format: YYYY-MM-DD | |
| 8.a | dates.dateType | `Withdrawn` for hidden projects (date: always current date), `Available` for released (date: release date) or pre-released projects (date: embargo date), `Accepted` for pre-released projects (date: release date) | |
| 8.b | dates.dateInformation | only available for hidden projects or embargo date | |
| | types | constant object | mandatory (1) |
| 10.a | types.resourceTypeGeneral | constant value `Dataset` | |
| | relatedIdentifiers | list of DOI identifiers of previous or new versions of the analysis package | |
| 12 | relatedIdentifiers.relatedIdentifier | the DOI of the previous or new version | |
| 12.a | relatedIdentifiers.relatedIdentifierType | constant value "DOI" | |
| 12.b | relatedIdentifiers.relationType | constant value "isNewVersionOf" or "isPreviousVersionOf" | |
| 15 | version | `dataAcquisitionProject.release.version` | |
| | rightsList | list of rights | |
| 16 | rightsList.rights | constant value that indicates the necessity of application to access data | |
| | descriptions | list of analysis package descriptions | |
| 17 | descriptions.description | `analysisPackage.description` | |
| 17.a | descriptions.descriptionType | `Abstract` for `analysisPackage.description` | |
| | fundingReferences | list of `analysisPackage.sponsors` | |
| 19.1 | fundingReferences.funderName | `analysisPackage.sponsors[].name` | |
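The `dates` rows (8, 8.a) describe a small decision rule that may be easier to follow as code. The Python sketch below is one possible reading of those rows, not the MDM implementation. The function name and parameters are hypothetical, a project is treated as pre-released whenever an embargo date is set, a hidden project yields only the `Withdrawn` entry, and `dateInformation` as well as survey field periods are left out. All of these are assumptions.

```python
from datetime import date
from typing import Optional


def release_dates(
    release_date: Optional[date],
    hidden: bool = False,
    embargo_date: Optional[date] = None,
    today: Optional[date] = None,
) -> list:
    """Build the release-related DataCite dates entries (illustrative)."""
    today = today or date.today()
    # Row 8: fall back to the current date if no first release date exists.
    released = release_date or today
    if hidden:
        # Hidden projects: Withdrawn, always with the current date.
        return [{"date": today.isoformat(), "dateType": "Withdrawn"}]
    if embargo_date is not None:
        # Pre-released projects: Available on the embargo date,
        # Accepted on the release date.
        return [
            {"date": embargo_date.isoformat(), "dateType": "Available"},
            {"date": released.isoformat(), "dateType": "Accepted"},
        ]
    # Regular releases: Available on the release date.
    return [{"date": released.isoformat(), "dateType": "Available"}]


print(release_dates(date(2025, 3, 1), embargo_date=date(2025, 9, 1)))
```

`date.isoformat()` produces the YYYY-MM-DD format named in the table.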
The metadata JSON schema is best described here. It also includes examples and controlled vocabulary.
The documentation of the DataCite API can be found here.
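As a rough orientation for reading the API documentation, the sketch below wraps mapped metadata in the JSON:API envelope that the DataCite REST API uses for DOI records (`data.type` set to `dois`, metadata under `data.attributes`). Whether the MDM builds its request exactly this way is an assumption. The function name, the example DOI, and the publication year are hypothetical, and no request is actually sent.

```python
import json


def to_datacite_payload(doi: str, attributes: dict) -> str:
    """Wrap mapped metadata in a JSON:API envelope for a DOI record (illustrative)."""
    body = {
        "data": {
            "type": "dois",
            "attributes": {"doi": doi, **attributes},
        }
    }
    return json.dumps(body, ensure_ascii=False, indent=2)


# Constant values taken from the mapping tables; DOI and year are made up.
payload = to_datacite_payload(
    "10.21249/DZHW:exampleproject:1.0.0",
    {
        "publisher": {
            "name": "German Centre for Higher Education Research and Science Studies (DZHW)",
            "publisherIdentifier": "https://ror.org/01n8j6z65",
            "publisherIdentifierScheme": "ROR",
            "schemeUri": "https://ror.org/",
        },
        "publicationYear": 2025,
        "types": {"resourceTypeGeneral": "Dataset"},
    },
)
print(payload)
```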
The additional restrictions introduced by the VerbundFDB (aka Kernset) are described here.
<sup>1</sup> DataCite is an international consortium assigning persistent identifiers to data sets. It organizes the administration of DOI prefixes and the connection to the International DOI Foundation (IDF).
<sup>2</sup> da|ra is a DOI registration agency for social and economic data in Germany. It is connected to DataCite which organizes the administration of prefixes and the connection to the International DOI Foundation (IDF).
<sup>3</sup> The VerbundFDB is a research data infrastructure for empirical educational research collecting and sharing research data and information. The FDZ-DZHW is a network partner within the VerbundFDB.
<sup>4</sup> If both da|ra restrictions and VerbundFDB restrictions have "---", this means we send these attributes voluntarily and/or the attributes are marked "optional" by the VerbundFDB Kernset.