Backfill publication identifiers #1359

@peetucket

We have many publications for which we have one identifier (e.g. PMID) but not other identifiers (e.g. DOI) for the same publication. This can happen because the original harvest source did not return all known identifiers, or because the user entered the publication manually and didn't provide all identifiers.

The lack of identifiers can prevent us from pushing publications to ORCID, or can cause duplicates when they are pushed to ORCID (see https://docs.google.com/document/d/1ZfNmfBzPTYm7aJpwrWAx6nXHvVvt1AfkOmSceOxCoXo).

The lack of identifiers also makes the dataset potentially less useful for research intelligence purposes.

It would be beneficial to backfill publications with other identifiers where available. This would require using an API or other data source that could be fed a known identifier from our database (e.g. a PMID) and return other known identifiers (e.g. a DOI) for the same publication. We would then augment our publication record with this identifier (in the PublicationIdentifier table, and then denormalized into the pub_hash).
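
A minimal sketch of the backfill step described above, written in Python purely for illustration (it is not the application's code). The `Resolver` type, `backfill_identifiers`, and `stub_resolver` are all hypothetical names; the resolver stands in for whichever API or data source ends up being chosen, and the identifier values are made up.

```python
from typing import Callable, Dict, Iterable

# Hypothetical interface: given one known identifier (type and value), return
# every identifier the external source knows for the same publication,
# e.g. {"pmid": "12345", "doi": "10.1000/example"}. Returns {} if unknown.
Resolver = Callable[[str, str], Dict[str, str]]


def backfill_identifiers(
    existing: Dict[str, str],
    resolver: Resolver,
    wanted: Iterable[str] = ("doi", "pmid"),
) -> Dict[str, str]:
    """Return only the identifiers that are missing from `existing`.

    Identifiers already on the record are never overwritten.
    """
    missing = [id_type for id_type in wanted if not existing.get(id_type)]
    if not missing:
        return {}

    found: Dict[str, str] = {}
    for id_type, id_value in existing.items():
        if not id_value:
            continue
        resolved = resolver(id_type, id_value)
        for target in missing:
            if target not in found and resolved.get(target):
                found[target] = resolved[target]
        if len(found) == len(missing):
            break  # everything we wanted has been resolved
    return found


def stub_resolver(id_type: str, id_value: str) -> Dict[str, str]:
    """Stand-in for a real lookup service; the data here is invented."""
    table = {("pmid", "12345"): {"pmid": "12345", "doi": "10.1000/example"}}
    return table.get((id_type, id_value), {})


new_ids = backfill_identifiers({"pmid": "12345"}, stub_resolver)
print(new_ids)
```

Returning only the missing identifiers keeps the write step simple: each returned pair would become a new row in the PublicationIdentifier table and then be denormalized into the pub_hash, while identifiers that came from the original harvest or from the user are left untouched.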

Potential APIs to use:
