adding links to preprint + dataset in notebook

chanzuckerberg · Sep 28, 2022 · commit 1a6dc9c
1 parent 113568c
Showing 1 changed file with 7 additions and 7 deletions.
diff --git a/sample_notebooks/Interacting with the dataset.ipynb b/sample_notebooks/Interacting with the dataset.ipynb
@@ -10,7 +10,7 @@
     "This notebook offers examples of **interacting** with the <b>CZI Software Mentions dataset </b><br>\n",
     "The <b>CZI Software Mentions dataset </b> is a large dataset of software mentions mined from the literature. \n",
     "\n",
-    "**Dataset Overview**: Plain-text software mentions are extracted with a trained [SciBERT](#references_scibert)model from several sources: the NIH PubMed Central collection and from papers provided by various publishers to the Chan Zuckerberg Initiative. The dataset provides sources, context and metadata, and, for a number of mentions, the disambiguated software entities and links. Full description of the dataset, methodology, algorihms and evaluation used to create the dataset can be found in our preprint, [A large dataset of software mentions in the biomedical literature](Link) and on our [Github page](https://github.com/chanzuckerberg/software-mentions). \n",
+    "**Dataset Overview**: Plain-text software mentions are extracted with a trained [SciBERT](#references_scibert)model from several sources: the NIH PubMed Central collection and from papers provided by various publishers to the Chan Zuckerberg Initiative. The dataset provides sources, context and metadata, and, for a number of mentions, the disambiguated software entities and links. Full description of the dataset, methodology, algorihms and evaluation used to create the dataset can be found in our preprint, [A large dataset of software mentions in the biomedical literature](https://arxiv.org/abs/2209.00693) and on our [Github page](https://github.com/chanzuckerberg/software-mentions). \n",
     "\n",
     "\n",
     "**The notebook is structured and offers the following information and examples, as follows:**\n",
@@ -35,11 +35,11 @@
     "There is a different notebook, [CZI Software Mentions Dataset - Sample Use Cases](#link_here), that offers sample use cases for the dataset.\n",
     "\n",
     "**The full list of resources we have available for the dataset is**:\n",
-    "1. [Preprint: A large dataset of software mentions in the biomedical literature](link)\n",
+    "1. [Preprint: A large dataset of software mentions in the biomedical literature](https://arxiv.org/abs/2209.00693)\n",
     "2. [Github Repository](https://github.com/chanzuckerberg/software-mentions)\n",
-    "3. [Dataset README.md](link)\n",
-    "4. [CZI Software Mentions Dataset - Interacting with the Dataset](#link_here) - Jupyter Notebook\n",
-    "5. [CZI Software Mentions Dataset - Sample Use Cases](#link_here) - Jupyter Notebook\n",
+    "3. [Dataset](https://datadryad.org/stash/dataset/doi:10.5061/dryad.6wwpzgn2c?)\n",
+    "4. [Interacting with the Dataset](https://github.com/chanzuckerberg/software-mentions/blob/main/sample_notebooks/Interacting%20with%20the%20dataset.ipynb) - Jupyter Notebook\n",
+    "5. [Sample Use Cases](https://github.com/chanzuckerberg/software-mentions/blob/main/sample_notebooks/Sample%20Use%20Cases.ipynb) - Jupyter Notebook\n",
     "\n",
     "For questions, please contact [email protected]"
    ]
@@ -62,7 +62,7 @@
     "<a id='dataset_interaction'></a>\n",
     "\n",
     "## Interacting with the dataset\n",
-    "We offer a brief overview of the dataset below. For a full description, including detailed information about the available files and fields, and how they were obtained, please consult the dataset [README.md](#Linkhere) file, or the Appendix section of our [preprint](link)"
+    "We offer a brief overview of the dataset below. For a full description, including detailed information about the available files and fields, and how they were obtained, please consult the dataset [README.md](https://datadryad.org/stash/dataset/doi:10.5061/dryad.6wwpzgn2c?) file, or the Appendix section of our [preprint](https://arxiv.org/abs/2209.00693)"
    ]
   },
   {
@@ -2431,7 +2431,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Note that for mentions that are marked as **unclear**, we don't recommend excluding them from analyses. They should rather be interpreted as *it cannot be assumed that this plain-text software mention will always be a true software mention when appearing in text*. The curators have only been provided with 5 sentences per software mention, and they did not curate each individual sentence in which a mention appears. The evaluations are based solely on those 5 sentences. We offer a more in-depth discussion about this in our [preprint](link) and [curation documents](link)"
+    "Note that for mentions that are marked as **unclear**, we don't recommend excluding them from analyses. They should rather be interpreted as *it cannot be assumed that this plain-text software mention will always be a true software mention when appearing in text*. The curators have only been provided with 5 sentences per software mention, and they did not curate each individual sentence in which a mention appears. The evaluations are based solely on those 5 sentences. We offer a more in-depth discussion about this in our [preprint](https://arxiv.org/abs/2209.00693) and [curation documents](link)"
    ]
   },
   {