Spot check results of the optional first initial variant to the WoS query #1294

@peetucket

#840 adds first-initial variants to WoS name queries when the first name is determined to be unique enough to allow it.

Before committing to this change, we should spot check a handful of users to be sure nothing unexpected is happening with the query.

  1. Identify some users with potentially common names (we still need ideas on how to find these) and some with less common names.
  2. Run the name query using the current code (i.e. no first-initial variants at all), then run the same name queries using the code in the PR (i.e. selective first-initial variants).
  3. Compare the results: compare counts, compare publications by ID, and then scan any additional publications in the second result set for obviously misidentified publications.
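
The comparison in step 3 could be sketched roughly as follows. This is only an illustration in plain Ruby: `compare_result_sets` is a hypothetical helper (not part of sul_pub), and the IDs are made up. It assumes each query run has already been reduced to a list of publication ID strings.

```ruby
require 'set'

# Hypothetical helper, not part of sul_pub: compares the publication IDs
# returned by the current query (baseline) with those returned by the
# query that includes first-initial variants (variant).
def compare_result_sets(baseline_ids, variant_ids)
  baseline = baseline_ids.to_set
  variant = variant_ids.to_set
  {
    baseline_count: baseline.size,
    variant_count: variant.size,
    shared: (baseline & variant).to_a.sort,
    # Publications only the new query finds: these need a manual scan.
    only_in_variant: (variant - baseline).to_a.sort,
    # Publications the new query lost: unexpected, worth investigating.
    only_in_baseline: (baseline - variant).to_a.sort
  }
end

# Made-up IDs for illustration only.
baseline_ids = %w[WOS:0001 WOS:0002 WOS:0003]
variant_ids = %w[WOS:0001 WOS:0002 WOS:0003 WOS:0004]

report = compare_result_sets(baseline_ids, variant_ids)
puts "baseline: #{report[:baseline_count]}, variant: #{report[:variant_count]}"
puts "to review manually: #{report[:only_in_variant].join(', ')}"
puts "missing from variant: #{report[:only_in_baseline].join(', ')}"
```

Anything listed under `only_in_variant` is a candidate for the manual scan for misidentified publications; a non-empty `only_in_baseline` would suggest the new query dropped results it should have kept.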

We can use stage for this: deploy main and then the branch in #840, find authors, and manually run queries against WoS for those authors as documented here: https://github.com/sul-dlss/sul_pub/wiki/Useful-console-commands-(harvest,-WoS,-Pubmed,-others)#manually-query-wos-for-a-given-author-and-look-at-results-but-dont-processharvest
