Provide a mapping between original mentions and unified mentions #6
Labels
nice-to-have
Something that would be nice to have by the time we finish, but that is not strictly required
We inherently have to create some sort of mapping between what the mentions originally looked like in CORD-19 (e.g., `['Statistical Package for Social Sciences (SPSS)', 'SPSS', 'SPSS Statistics']`) and what they look like in normalized form in our new dataset (e.g., `SPSS`). Other projects that reuse our dataset would probably find it very useful to have access to this mapping as well, so it would be nice to provide it in some consumable form.
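To make the idea concrete, here is a minimal sketch of what such a mapping could look like, assuming we keep it as a plain dictionary and export it as JSON. The names `unified_to_original` and `invert_mapping`, the JSON layout, and the choice to fail on ambiguous mentions are all hypothetical and only meant to illustrate the shape of the data; the SPSS entries are the example from above.

```python
import json

# Hypothetical example data: unified mention -> original mentions as found in CORD-19.
unified_to_original = {
    "SPSS": [
        "Statistical Package for Social Sciences (SPSS)",
        "SPSS",
        "SPSS Statistics",
    ],
}


def invert_mapping(unified_to_original):
    """Build an original -> unified lookup from a unified -> originals dict.

    Raises ValueError if the same original mention is assigned to two
    different unified mentions (an assumption: we treat that as an error).
    """
    original_to_unified = {}
    for unified, originals in unified_to_original.items():
        for original in originals:
            existing = original_to_unified.get(original)
            if existing is not None and existing != unified:
                raise ValueError(
                    f"{original!r} maps to both {existing!r} and {unified!r}"
                )
            original_to_unified[original] = unified
    return original_to_unified


original_to_unified = invert_mapping(unified_to_original)

# Serialize to JSON so other projects can consume the mapping without our code.
mapping_json = json.dumps(original_to_unified, indent=2, sort_keys=True)
print(mapping_json)
```

Keeping both directions available would let downstream users either look up the unified form of a raw CORD-19 mention or list every original variant behind a unified mention. JSON is just one option for the "consumable form"; a two-column CSV would probably work equally well.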