
Commit abf4fe7

Update README.md
fixing repo names capitalizations
1 parent c80d654 commit abf4fe7

1 file changed: +9 -9 lines changed
software-mentions-linker-disambiguator/README.md

Lines changed: 9 additions & 9 deletions
@@ -1,8 +1,8 @@
  # Software Mentions Linking + Disambiguation

  The goal of this project is to produce a high quality dataset of software used in the biomedical literature to facilitate analysis of adoption and impact of open-source scientific software. Our overall methodology is the following:
- 1. Extract plain-text software mentions from the PMC-OA access using an [NER Machine Learning Algorithm](https://github.com/chanzuckerberg/software-mention-extraction) (developed by Ivana Williams)
- 2. Link the software mentions to repositories and generate metadata by querying a number of databases. We link mentions to: PyPI, Bioconductor, CRAN, Scicrunch and Github
+ 1. Extract plain-text software mentions from the PMC-OA access using an [NER Machine Learning Algorithm](https://github.com/chanzuckerberg/software-mention-extraction) (developed by Ivana Williams) G
+ 2. Link the software mentions to repositories and generate metadata by querying a number of databases. We link mentions to: PyPI, Bioconductor, CRAN, SciCrunch and GitHub
  3. Disambiguate the software mentions

  More detailed descriptions of the **[linking](#linking)** and **[disambiguation](#disambiguation)** steps can be found below, together with instructions on how to run the code.
@@ -14,11 +14,11 @@ More detailed descriptions of the **[linking](#linking)** and **[disambiguation]

  ## Linking Task description ##
  1. We query the following databases, searching for exact matches for plain text sofware mentions in our dataset:
- - pypi Index: https://pypi.org/simple/
+ - PyPI Index: https://pypi.org/simple/
  - Bioconductor Index: https://www.bioconductor.org/packages/release/bioc/
  - CRAN Index: https://cran.r-project.org/web/packages/available_packages_by_name.html
- - Github API: https://github.com
- - Scicrunch API: https://scicrunch.org/resources
+ - GitHub API: https://github.com
+ - SciCrunch API: https://scicrunch.org/resources

  2. We normalize the metadata files to a [common schema](#linking-schema).
  ### Linking Schema
@@ -29,15 +29,15 @@ Metadata files are normalized to the following fields:
  | ID | unique ID of software mention (generated by us) |
  | software_mention | plain-text software mention |
  | mapped_to | value the software_mention is being mapped to |
- | source | source of the mapping - eg Bioconductor Index, Github API|
- | platform | platform of software_mention - eg pypi, CRAN |
+ | source | source of the mapping - eg Bioconductor Index, GitHub API|
+ | platform | platform of software_mention - eg PyPI, CRAN |
  | package_url | URL linking software_mention to source |
  | description | description of software_mention |
  | homepage_url | homepage_url of software_mention|
  | other_urls |other related URLs |
  | license | software license |
- | github_repo | Github repository |
- | github_repo_license | Github repository license |
+ | github_repo | GitHub repository |
+ | github_repo_license | GitHub repository license |
  | exact_match | whether or not this mapping was an exact match |
  | RRID | RRID for software_mention |
  | reference | journal articles linked to software_mention (identified either through DOI, pmid or RRID)|
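
As a reading aid for the linking schema this diff touches, here is a minimal Python sketch of how one exact-match lookup could be normalized into a schema row. It is not the repository's code. The field names come from the README table; the `LinkedMention` class, the `link_exact` helper, the field types, the case-sensitive comparison, and the `url_template` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass, asdict
from typing import Iterable, List, Optional


@dataclass
class LinkedMention:
    """One row of the normalized linking schema (field names from the README table).

    The types and defaults are assumptions; the README only lists the fields.
    """
    ID: str
    software_mention: str
    mapped_to: str
    source: str
    platform: str
    package_url: Optional[str] = None
    description: Optional[str] = None
    homepage_url: Optional[str] = None
    other_urls: Optional[List[str]] = None
    license: Optional[str] = None
    github_repo: Optional[str] = None
    github_repo_license: Optional[str] = None
    exact_match: bool = False
    RRID: Optional[str] = None
    reference: Optional[List[str]] = None


def link_exact(
    mention_id: str,
    mention: str,
    package_names: Iterable[str],
    source: str,
    platform: str,
    url_template: str,
) -> Optional[LinkedMention]:
    """Return a schema row if `mention` exactly matches a known package name.

    Hypothetical helper: the comparison is case-sensitive (an assumption), and
    `url_template` must contain a `{name}` placeholder.
    """
    names = set(package_names)
    if mention not in names:
        return None
    return LinkedMention(
        ID=mention_id,
        software_mention=mention,
        mapped_to=mention,
        source=source,
        platform=platform,
        package_url=url_template.format(name=mention),
        exact_match=True,
    )
```

For example, `link_exact("SM1", "numpy", ["numpy", "scipy"], "PyPI Index", "PyPI", "https://pypi.org/project/{name}/")` yields a row whose `source` and `platform` use the capitalization this commit standardizes, and `asdict(...)` turns it into a plain dictionary for writing to a metadata file. The package list here is passed in by hand; fetching it from an index is left out of the sketch.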
