software-mentions-linker-disambiguator/README.md: 9 additions & 9 deletions
@@ -1,8 +1,8 @@
# Software Mentions Linking + Disambiguation
The goal of this project is to produce a high-quality dataset of software used in the biomedical literature to facilitate analysis of adoption and impact of open-source scientific software. Our overall methodology is the following:
- 1. Extract plain-text software mentions from the PMC-OA access using an [NER Machine Learning Algorithm](https://github.com/chanzuckerberg/software-mention-extraction) (developed by Ivana Williams)
- 2. Link the software mentions to repositories and generate metadata by querying a number of databases. We link mentions to: PyPI, Bioconductor, CRAN, Scicrunch and Github
+ 1. Extract plain-text software mentions from the PMC-OA access using an [NER Machine Learning Algorithm](https://github.com/chanzuckerberg/software-mention-extraction) (developed by Ivana Williams) G
+ 2. Link the software mentions to repositories and generate metadata by querying a number of databases. We link mentions to: PyPI, Bioconductor, CRAN, SciCrunch and GitHub
3. Disambiguate the software mentions
More detailed descriptions of the **[linking](#linking)** and **[disambiguation](#disambiguation)** steps can be found below, together with instructions on how to run the code.
@@ -14,11 +14,11 @@ More detailed descriptions of the **[linking](#linking)** and **[disambiguation]
## Linking Task description ##
1. We query the following databases, searching for exact matches for plain-text software mentions in our dataset:
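The exact-match lookup described in the linking task could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the in-memory `PACKAGE_INDEXES` dictionary stands in for real database queries, the `normalize` and `link_mention` helpers are hypothetical names, and the case-insensitive comparison is an assumption about what "exact match" allows.

```python
from typing import Dict, List, Set

# Hypothetical, hand-written indexes standing in for real database queries.
# The actual pipeline queries PyPI, Bioconductor, CRAN, SciCrunch and GitHub.
PACKAGE_INDEXES: Dict[str, Set[str]] = {
    "PyPI": {"numpy", "scipy", "pandas"},
    "CRAN": {"ggplot2", "dplyr"},
    "Bioconductor": {"deseq2", "limma"},
}


def normalize(mention: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization step)."""
    return " ".join(mention.split()).lower()


def link_mention(mention: str, indexes: Dict[str, Set[str]] = PACKAGE_INDEXES) -> List[str]:
    """Return the sorted names of all indexes holding an exact match for the mention."""
    key = normalize(mention)
    return sorted(name for name, packages in indexes.items() if key in packages)


if __name__ == "__main__":
    for mention in ["NumPy", "DESeq2", "ImageJ"]:
        print(mention, "->", link_mention(mention))
```

A mention that appears in no index returns an empty list, which leaves room for a later step (for example, disambiguation) to handle unlinked mentions.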