`write_citation_pairs` with less human intervention

This is all a bit of a mess; there is definitely a better way to do it.

`write_citation_pairs` takes a data frame with one column for article id and one for dataset id. It loops through each row and uses `rcrossref::cr_cn` to retrieve a full citation for the paper using the article id. We need information such as authors and title to send to the metrics service.

`rcrossref::cr_cn` returns the citation in bibtex format (it can optionally return json and other formats). That bibtex is then passed to `bib2df::bib2df`, which parses the text string into a data frame. Parsing this text string is somewhat of a nightmare though, and I ended up refactoring bib2df to accommodate single-line bibtex docs, which for some reason `rcrossref::cr_cn` started returning. I did that [here](https://github.com/ropensci/bib2df/pull/62), but the method I had to use requires that you know what the fields of the bibtex entry are. Occasionally, a bibtex entry will come back with a really oddball field in it, and that field name has to be passed to the `extra_fields` argument of `bib2df` and the function run again to get the correct parsing; otherwise the rest of the document is thrown off. This is all especially frustrating because we only need certain fields to pass to the metrics service, but the ENTIRE doc needs to be processed correctly.

So some options to make this require no human intervention:

1. Capture the warning output from the first pass, parse it, feed the fields back in for a second pass
    - this seems ridiculous
2. Have `rcrossref::cr_cn` just return the json, parse it, and extract what we need, bypassing bib2df entirely
3. Find a more straightforward way to retrieve just the information we need, probably by querying the crossref API more directly
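
As a rough sketch of what option 2 could look like: the helper below takes one already-parsed citation record and keeps only a handful of fields. This assumes the record comes from something like `rcrossref::cr_cn(doi, format = "citeproc-json")` and follows the usual CSL-JSON layout (`DOI`, `title`, `container-title`, and an `author` array with `given` and `family`). The function name `extract_citation_fields`, the output column names, and the sample record are all made up for illustration, and the exact fields the metrics service needs may differ.

```r
# Hypothetical helper (not part of any package): reduce one parsed CSL-JSON
# record (a named list) to a one-row data frame of the fields we care about.
extract_citation_fields <- function(csl) {
  # Return the first element as character, or NA if the field is absent.
  first_or_na <- function(x) {
    if (is.null(x) || length(x) == 0) NA_character_ else as.character(x[[1]])
  }

  authors <- csl$author
  if (is.data.frame(authors)) {
    # jsonlite-style parsing usually simplifies the author array to a data frame
    author_names <- paste(authors$given, authors$family)
  } else {
    # Otherwise assume a plain list of lists, one element per author
    author_names <- vapply(
      authors,
      function(a) paste(a$given, a$family),
      character(1)
    )
  }

  data.frame(
    article_id = first_or_na(csl$DOI),
    title      = first_or_na(csl$title),
    journal    = first_or_na(csl$`container-title`),
    authors    = paste(author_names, collapse = "; "),
    stringsAsFactors = FALSE
  )
}

# Made-up record standing in for a parsed cr_cn() result
sample_csl <- list(
  DOI = "10.1000/example",
  title = "An example article",
  `container-title` = "Journal of Examples",
  author = data.frame(
    given  = c("Ada", "Grace"),
    family = c("Lovelace", "Hopper"),
    stringsAsFactors = FALSE
  )
)

result <- extract_citation_fields(sample_csl)
```

Because only named fields are read, an unexpected extra field in the record would simply be ignored rather than throwing off the rest of the parse, which is the main appeal over the bibtex route.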
