
NNLBD System Description

Table of Contents

  1. System Description
  2. Getting Started
  3. Configuration File
  4. Models
    1. Base Multi-Label Models
    2. CD-2 Reduplication Model
  5. Model Output
  6. FAQ
  7. Reduplicating Our Published Work

System Description

NNLBD is a deep learning Python package that aims to automate the process of identifying implicit relationships for Literature-based Discovery (LBD). It includes and explores many neural network architectures for open and closed discovery. We provide comprehensive details of all integrated models below.

Getting Started

First, we recommend preparing your virtual environment and installing the package. For running LBD experiments, each model has specific usage instructions, requirements, configuration file settings, and methods for evaluating system performance. Consult the specific instructions under the Models section for further details.

Configuration File

To execute an experiment, pass a JSON-formatted configuration file as an argument to the LBDDriver.py script. An example is shown below:

```shell
python LBDDriver.py config.json
```

We provide further configuration file details here.
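As an illustration, a configuration file might look like the sketch below. The field names used here (`network_model`, `epochs`) are hypothetical placeholders, apart from `model_save_path`, which is mentioned under Model Output; consult the configuration file details for the actual schema.

```json
{
    "model_settings": {
        "network_model": "base_multi_label",
        "epochs": 100,
        "model_save_path": "./saved_models/base_model/"
    }
}
```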

Models

We provide details for each model included within the system. Each model section provides pertinent details that describe its use, including:

Model Description
Data Description
Term Representation and Vectorization
Model Output Description
Word Embedding Details
Miscellaneous Experimental Details

Base Multi-Label Models

The following figure shows the architecture of our base multi-label deep learning multi-layer perceptron model. We train the model to identify implicit relations for closed discovery. Given explicit A-B-C relationship term triplets, we input A and C term embeddings into the model and train the model to predict all associated B terms.

*(Figure: base multi-label model architecture)*

We provide further details of this model here.
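The closed-discovery setup above can be sketched as a forward pass: concatenate the A and C embeddings, apply a hidden layer, and emit an independent sigmoid per candidate B term. This is a minimal numpy sketch with hypothetical layer sizes and random stand-in weights, not the package's actual implementation; the real activations and dimensions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: embedding size, hidden units, B-term vocabulary size.
embedding_dim, hidden_dim, num_b_terms = 64, 128, 1000

# Random stand-ins for trained weights and the input term embeddings.
a_embedding = rng.normal(size=embedding_dim)   # embedding of term A
c_embedding = rng.normal(size=embedding_dim)   # embedding of term C
W_hidden = rng.normal(size=(2 * embedding_dim, hidden_dim)) * 0.01
W_out = rng.normal(size=(hidden_dim, num_b_terms)) * 0.01

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Forward pass: concatenate A and C, one hidden layer, then an independent
# sigmoid per candidate B term (multi-label output).
x = np.concatenate([a_embedding, c_embedding])
hidden = np.maximum(0.0, x @ W_hidden)          # ReLU activation
b_scores = sigmoid(hidden @ W_out)              # one probability per B term

print(b_scores.shape)  # (1000,)
```

Because each output unit has its own sigmoid, any number of B terms can be predicted simultaneously for a given A-C pair.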

CD-2 Reduplication Model

The following figure shows the neural architecture proposed by Crichton et al. (2019). We reduplicate this model, which predicts the likelihood of A-B-C relationship triplets as links within a knowledge graph. All term embeddings are provided as input, and the model is trained using single-label binary cross-entropy (i.e., 0/1 loss).

*(Figure: CD-2 model architecture)*

We provide further details of this model here.
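The triplet-scoring setup above can be sketched as follows: all three term embeddings go in, a single sigmoid unit scores the likelihood of the link, and the loss is binary cross-entropy against a 0/1 label. As before, the layer sizes and random weights here are hypothetical stand-ins, not the reduplicated model's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

embedding_dim, hidden_dim = 64, 128  # hypothetical sizes

# Random stand-ins for the A, B, C term embeddings and trained weights.
triplet = rng.normal(size=(3, embedding_dim))
W_hidden = rng.normal(size=(3 * embedding_dim, hidden_dim)) * 0.01
w_out = rng.normal(size=hidden_dim) * 0.01

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# All three term embeddings are provided as input; a single sigmoid unit
# scores the likelihood that the A-B-C link exists.
x = triplet.reshape(-1)
hidden = np.maximum(0.0, x @ W_hidden)
likelihood = float(sigmoid(hidden @ w_out))

# Single-label binary cross-entropy against a 0/1 link label.
label = 1.0
loss = -(label * np.log(likelihood) + (1.0 - label) * np.log(1.0 - likelihood))
print(f"likelihood={likelihood:.4f} loss={loss:.4f}")
```

The single-output formulation is what distinguishes this model from the base multi-label model above: each triplet is scored individually rather than predicting all B terms at once.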

Model Output

Each model produces various forms of output (e.g. standard console output or output files) depending on the task the user specifies. During model runtime, training metrics are reported to the user via standard console output; if the user instructs the model to perform evaluation, evaluation metrics are also included. Lastly, we recommend saving all models by providing a model_save_path. Depending on the experimental task, all reported model metrics will be saved to plain text files (e.g. model_metrics.txt) and plotted as PNG images. We recommend consulting the model details for more information regarding expected model output.
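As a rough illustration of the per-epoch reporting described above, the sketch below prints training metrics to the console and writes the same values to a model_metrics.txt file. The file layout, metric names, and values here are assumptions for illustration only, not the package's actual output format.

```python
from pathlib import Path

# Hypothetical per-epoch training metrics (values are made up for illustration).
history = [
    {"epoch": 1, "loss": 0.693, "accuracy": 0.52},
    {"epoch": 2, "loss": 0.541, "accuracy": 0.67},
    {"epoch": 3, "loss": 0.402, "accuracy": 0.81},
]

# Hypothetical stand-in for the configured model_save_path.
model_save_path = Path("./saved_model")
model_save_path.mkdir(parents=True, exist_ok=True)

# Report each epoch to the console and save the same metrics to a plain
# text file alongside the saved model.
with open(model_save_path / "model_metrics.txt", "w") as fh:
    fh.write("epoch\tloss\taccuracy\n")
    for record in history:
        line = f"{record['epoch']}\t{record['loss']:.3f}\t{record['accuracy']:.3f}"
        print(line)
        fh.write(line + "\n")
```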

FAQ

We provide answers to frequently asked questions here.

Reduplicating Our Published Work

In this section, we provide a guide to reproducing our published works. We list each published manuscript by title and provide further details.

Exploring a Neural Network Architecture for Closed Literature-based Discovery

  • This study focuses on deploying our Base Multi-Label Models to identify Hallmarks of Cancer over recent LBD discoveries as described here. Details to reduplicate our study can be found here.

  • NOTE: This manuscript is currently under review.