Structure Prediction Baselines Using AllenNLP

Implements baselines for tasks like POS tagging, NER and SRL.

Setup for Development

Clone the repo
Run the environment setup bash script

/bin/bash setup_env.sh

From allennlp v2.5, additional steps are required:

pip install --upgrade allennlp==2.5.0

Install wandb_allennlp to log experiments to wandb (if not already installed):

pip install wandb_allennlp

Running the models

Download datasets
```
/bin/bash download_datasets.sh
```

Export environment variables

export CUDA_DEVICE=0 # 0 for GPU, -1 for CPU
export TEST=1 # for a dry run without uploading results to wandb
export WANDB_IGNORE_GLOBS=*\*\*\*.th,*\*\*\*.tar.gz,*\*\*.th,*\*\*.tar.gz,*\*.th,*\*.tar.gz,*.tar.gz,*.th
export DATA_DIR="./data/"

Training single models

Using slurm (on gypsum)

Open single_run.sh, modify it as needed, then submit the job using sbatch single_run.sh. Do not push local changes to this file to the repo.

On your local machine

Without sending output to wandb

export TEST=1
export CUDA_DEVICE=-1
allennlp train <path_to_config> -s <path to serialization dir> --include-package structured_prediction_baselines

With output to wandb (see "creating an account and logging in to wandb" for details on getting started with wandb)

export TEST=0
export CUDA_DEVICE=-1
allennlp train-with-wandb --config_file=model_configs/<path_to_config_file> --include-package=structured_prediction_baselines --wandb_run_name=<some_informative_name_for_run>  --wandb_project structured_prediction_baselines --wandb_entity <your wandb account name or team name> -- some hyperparameters to add

Running hyperparameter sweeps
1. Create a sweep using a sweep config file. See the sweep_configs directory for examples, and refer to the wandb sweeps documentation.
```
wandb sweep -e <your wandb account name or team name> -p baselines sweep_configs/<path/to/config.yaml>

< you will see an alphanumeric sweep_id as output here. Copy it. >
```
NOTE: Some sweep config files use an old allennlp command (e.g. 'allennlp train_with_wandb' or 'wandb_allennlp --subcommand=train'). Please make sure your sweep config file is up to date.
1. Start the search agents on slurm using the following command. (The script submits the jobs to sbatch internally, so you can run it on the head node even though it is a python script, because it exits within seconds.)
```
export TEST=0
python slurm_wandb_agent.py <sweep_id> -p baselines -e <your wandb account name or team name> --num-jobs 5 -f --edit-sbatch --edit-srun
```
You can use squeue to see the running agents on nodes. You can rerun this command to start more agents.

Directory Structure

Modules

The following diagram gives the input/output and compositional structure of the base model.

Model(x, labels=None)
|-Sampler(x, labels, ScoreNN, CostFunction) -> (y_hat, probability)
|-ScoreNN(x,y) -> score
|-OracleValueFunction(y,labels) -> oracle cost
|-Loss(oracle costs, scores of labels, scores of y_hat, sample probabilities)

The model owns score_nn and oracle_value_function but during model construction, the references to these objects are passed to sampler and loss as shown below:

                              Model Construction



┌─────────────────────────────────────┐ ┌─────────────────────────────────────┐ ┌─────────────────────────────────────┐
│                                     │ │                                     │ │                                     │
│                                     │ │                                     │ │                                     │
│ inference_module:Sampler            │ │     sampler: Sampler                │ │        loss: Loss                   │
├─────────────┬───────────────────────┤ ├─────────────┬───────────────────────┤ ├─────────────┬───────────────────────┤
│Ref: score_nn│Ref:oracle_value_func. │ │Ref: score_nn│Ref:oracle_value_func. │ │Ref: score_nn│Ref:oracle_value_func. │
└──────▲──────┴──────────▲────────────┘ └──────▲──────┴──────────▲────────────┘ └────▲────────┴────────▲──────────────┘
       ▲                 ▲                     │                 │                   │                 │
       │                 │                     │                 └───────────────────┼──────┐          │
       │                 │                     │                                     │      │          │
       └─────────────────┼────────────────┐    │                                     │      │        ┌─┘
                         │                │    └────────────────────┐    ┌───────────┘      │        │
                         │                │                         │    │                  │        │
                         └────────────────┼─────────────────────────┼────┼───────────┐      │        │
                                          │                         │    │           │      │        │
                                        ┌─┴─────────────────────────┴────┴────┐  ┌───┴──────┴────────┴────────────────┐
                                        │                                     │  │                                    │
                                        │    score_nn : ScoreNN               │  │ oracle_value_function :            │
                                        │                                     │  │          OracleValueFunction       │
                                        │                                     │  │                                    │
                                        └─────────────────────────────────────┘  └────────────────────────────────────┘
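The reference sharing in the diagram can be sketched in plain Python as follows. This is only an illustration under assumptions: the classes are empty stand-ins with no parameters, and the constructor arguments are hypothetical rather than the repository's actual signatures. The point it shows is that the sampler, the inference module, and the loss all hold the same `score_nn` and `oracle_value_function` objects that the model owns.

```python
from dataclasses import dataclass


class ScoreNN:
    """Empty stand-in for the scoring network (hypothetical)."""


class OracleValueFunction:
    """Empty stand-in for the oracle cost function (hypothetical)."""


@dataclass
class Sampler:
    score_nn: ScoreNN
    oracle_value_function: OracleValueFunction


@dataclass
class Loss:
    score_nn: ScoreNN
    oracle_value_function: OracleValueFunction


class Model:
    def __init__(self) -> None:
        # The model owns the two shared components.
        self.score_nn = ScoreNN()
        self.oracle_value_function = OracleValueFunction()
        # The same objects (not copies) are passed to every module.
        self.sampler = Sampler(self.score_nn, self.oracle_value_function)
        self.inference_module = Sampler(self.score_nn, self.oracle_value_function)
        self.loss = Loss(self.score_nn, self.oracle_value_function)


model = Model()
# Identity check: updating score_nn is visible to the sampler and the loss.
print(model.sampler.score_nn is model.loss.score_nn)
```

Because the modules hold references rather than copies, a parameter update to `score_nn` during training is seen by every module that uses it.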

The flow of data and computation in the model is as follows:

┌─────────────────────────────────────────────────────────┐  ┌───────────────────────────────────────────────────────────────────────────────┐
│         Training Flow                                   │  │    Validation/evaluation flow                                                 │
│                                                         │  │                                                                               │
│                                                         │  │                                                                               │
│                                                         │  │                                                                               │
│                   scalar Tensor                         │  │                scalar Tensor                           metrics                │
│                      ▲                                  │  │                   ▲                                       ▲                   │
│                      │                                  │  │                   │                                       │                   │
│     ┌────────────────┴────────────────────┐             │  │  ┌────────────────┴────────────────────┐    ┌─────────────┴────────────────┐  │
│     │                                     │             │  │  │                                     │    │                              │  │
│     │                                     │             │  │  │                                     │    │   metrics: List[Metric]      │  │
│     │        loss: Loss                   │             │  │  │        loss: Loss                   │    │                              │  │
│     ├─────────────┬───────────────────────┤             │  │  ├─────────────┬───────────────────────┤    └─────────────▲────────────────┘  │
│     │Ref: score_nn│Ref:oracle_value_func. │             │  │  │Ref: score_nn│Ref:oracle_value_func. │                  │                   │
│     └─────────────┴───────────────────────┘             │  │  └─────────────┴───────────────────────┘                  │                   │
│                      ▲                                  │  │                   ▲                                       │                   │
│                      │                                  │  │                   │                                       │                   │
│                      │                                  │  │                   └───────────────────────────────────────┘                   │
│         (y_hat: Tensor(batch,num_samples,...),          │  │   (y_pred:   Tensor(batch,num_samples,...),                                   │
│           y_probs: Optional[Tensor(batch,num_samples)]) │  │        y_probs: Optional[Tensor(batch,num_samples)])                          │
│                                                         │  │                                                                               │
│                      ▲                                  │  │                   ▲                                                           │
│                      │                                  │  │                   │                                                           │
│                      │                                  │  │                   │                                                           │
│      ┌───────────────┴─────────────────────┐            │  │                                                                               │
│      │                                     │            │  │   ┌─────────────────────────────────────┐                                     │
│      │                                     │            │  │   │                                     │                                     │
│      │     sampler: Sampler                │            │  │   │                                     │                                     │
│      ├─────────────┬───────────────────────┤            │  │   │ inference_module:Sampler            │                                     │
│      │Ref: score_nn│Ref:oracle_value_func. │            │  │   ├─────────────┬───────────────────────┤                                     │
│      └─────────────┴───────────────────────┘            │  │   │Ref: score_nn│Ref:oracle_value_func. │                                     │
│                                                         │  │   └─────────────┴───────────────────────┘                                     │
│                     ▲                                   │  │                  ▲                                                            │
│                     │                                   │  │                  │                                                            │
│                     │                                   │  │                  │                                                            │
│                     │                                   │  │                  │                                                            │
│                     │                                   │  │                  │                                                            │
│              (x: Any, y: Tensor(batch, ...) )           │  │           (x: Any, y: None)                                                   │
│                                                         │  │                                                                               │
└─────────────────────────────────────────────────────────┘  └───────────────────────────────────────────────────────────────────────────────┘
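The two flows above can be sketched as one dispatch function. This is a minimal sketch, not the repository's `forward` method: the function name, the keyword arguments, and the dictionary keys are assumptions, and the modules are passed in as plain callables. It also assumes that the loss and metrics are only computed at evaluation time when labels are available.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# (x, labels) -> (samples, optional sample probabilities)
SamplerFn = Callable[[Any, Optional[Any]], Tuple[Any, Optional[Any]]]
# (x, labels, samples, probabilities) -> scalar loss
LossFn = Callable[[Any, Any, Any, Optional[Any]], float]
# (predictions, labels) -> None, accumulates internally
MetricFn = Callable[[Any, Any], None]


def forward(
    x: Any,
    labels: Optional[Any],
    *,
    training: bool,
    sampler: SamplerFn,
    inference_module: SamplerFn,
    loss: LossFn,
    metrics: List[MetricFn],
) -> Dict[str, Any]:
    if training:
        # Training flow: the sampler sees the labels.
        y_hat, y_probs = sampler(x, labels)
        return {"loss": loss(x, labels, y_hat, y_probs)}
    # Validation/evaluation flow: the inference module gets y=None.
    y_pred, y_probs = inference_module(x, None)
    output: Dict[str, Any] = {"y_pred": y_pred}
    if labels is not None:
        output["loss"] = loss(x, labels, y_pred, y_probs)
        for metric in metrics:
            metric(y_pred, labels)
    return output
```

Keeping `sampler` and `inference_module` as separate arguments mirrors the diagram, where training and evaluation may use different procedures to produce outputs.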

Sampler:

Given input x, returns samples of shape (batch, num_samples or 1, ...) and, optionally, their corresponding probabilities of shape (batch, num_samples). The sampler can do and return different things during training and testing. The probabilities are needed specifically in the MRT setting (Minimum Risk Training for Neural Machine Translation). The cases that the sampler will cover include:

1. Inference network or `TaskNN`, where we just take the input x and produce either a relaxed output of shape `(batch, 1, ...)` or samples of shape `(batch, num_samples, ...)`. Note that when the sampler uses `inference_net: TaskNN`, its parameters must also be updated inside the sampler, so the sampler also needs an instance of `Optimizer`.

2. Cost-augmented inference module that uses `ScoreNN` and `OracleValueFunction` to produce a single relaxed output or samples.

3. Adversarial sampler, which again uses `ScoreNN` and `OracleValueFunction` to produce adversarial samples. (I see no difference between this and cost-augmented inference.)

4. Random samples biased towards `labels`.

5. In the case of MRT style training, it can be beam search.

6. In the case of a vanilla feedforward model, one can just return the logits with shape `(batch, 1, ...)`.
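As one illustration of the sampler contract, case 4 (random samples biased towards `labels`) could look like the sketch below. It is written under assumptions: nested Python lists stand in for tensors, the task is a sequence tagging problem, and the function name and the `keep_prob` proposal are hypothetical rather than taken from the repository. It returns samples of shape (batch, num_samples, seq_len) together with probabilities of shape (batch, num_samples).

```python
import random
from typing import List, Tuple


def sample_near_labels(
    labels: List[List[int]],
    num_labels: int,
    num_samples: int,
    keep_prob: float = 0.8,
    seed: int = 0,
) -> Tuple[List[List[List[int]]], List[List[float]]]:
    """Return samples of shape (batch, num_samples, seq_len) and their
    proposal probabilities of shape (batch, num_samples)."""
    rng = random.Random(seed)
    uniform = (1.0 - keep_prob) / num_labels
    samples, probs = [], []
    for gold in labels:
        row_samples, row_probs = [], []
        for _ in range(num_samples):
            y, p = [], 1.0
            for g in gold:
                # Keep the gold tag with probability keep_prob, else draw uniformly.
                tag = g if rng.random() < keep_prob else rng.randrange(num_labels)
                # Probability of emitting `tag` at this position under the proposal.
                p *= uniform + (keep_prob if tag == g else 0.0)
                y.append(tag)
            row_samples.append(y)
            row_probs.append(p)
        samples.append(row_samples)
        probs.append(row_probs)
    return samples, probs
```

The other cases differ in how the samples are produced, but would return values in the same two shapes.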

ScoreNN:

This is the parameterized value network, or negative energy function, that takes in (x, y) and produces a value, score, or negative energy (higher is better). The shape of y will be (batch, num_samples or 1, ...) and the shape of the output score will be (batch, num_samples or 1).
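The shape contract of the score function can be illustrated with a toy example. This is a sketch under assumptions, not the repository's `ScoreNN`: it has no learned parameters, `x` is assumed to already hold per-position label scores of shape (batch, seq_len, num_labels), and the score of a discrete tag sequence is simply the sum of its local scores.

```python
from typing import List


def unary_score(
    x: List[List[List[float]]],  # (batch, seq_len, num_labels) local scores
    y: List[List[List[int]]],    # (batch, num_samples, seq_len) tag sequences
) -> List[List[float]]:
    """Return scores of shape (batch, num_samples); higher is better."""
    return [
        [sum(local[t][tag] for t, tag in enumerate(sample)) for sample in samples]
        for local, samples in zip(x, y)
    ]


x = [[[0.1, 0.9], [0.8, 0.2]]]  # batch=1, seq_len=2, num_labels=2
y = [[[1, 0], [0, 1]]]          # two candidate sequences for the one input
scores = unary_score(x, y)      # one score per candidate
```

The trailing dimensions of `y` are reduced away, so each (input, candidate) pair ends up with a single scalar.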

OracleValueFunction:

Either a differentiable (w.r.t. y) or non-differentiable function that takes in the true labels and a set of arbitrary y's (either discrete, in the case of a non-differentiable cost, or continuous relaxations) and returns the oracle cost of each y. The shape of input y will be (batch, num_samples or 1, ...).
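For the discrete, non-differentiable case, a normalized Hamming cost is one possible oracle. The sketch below assumes a sequence tagging task with nested lists standing in for tensors; the function name is hypothetical and the repository may use a different cost.

```python
from typing import List


def hamming_cost(
    y: List[List[List[int]]],  # (batch, num_samples, seq_len)
    labels: List[List[int]],   # (batch, seq_len)
) -> List[List[float]]:
    """Return the fraction of wrong positions, shape (batch, num_samples)."""
    return [
        [
            sum(pred != gold_tag for pred, gold_tag in zip(sample, gold)) / len(gold)
            for sample in samples
        ]
        for samples, gold in zip(y, labels)
    ]


costs = hamming_cost([[[0, 1, 2], [0, 0, 0]]], [[0, 1, 2]])
# The first sample matches the labels exactly, so its cost is 0.0.
```

A differentiable variant would instead accept relaxed (continuous) y and return a cost that can be backpropagated through.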

Loss:

Takes in x, the output of the sampler, the true labels, and references to ScoreNN and OracleValueFunction to produce a loss to backpropagate through.
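To show why sample probabilities are passed to the loss, here is a simplified MRT-style expected risk. It is a sketch under assumptions rather than the repository's loss: it takes precomputed oracle costs and sample probabilities as nested lists, renormalizes the probabilities over the sample set (with a hypothetical sharpness parameter `alpha`), and averages the expected cost over the batch.

```python
from typing import List


def mrt_risk(
    costs: List[List[float]],  # (batch, num_samples) oracle costs
    probs: List[List[float]],  # (batch, num_samples) sample probabilities
    alpha: float = 1.0,
) -> float:
    """Return the batch mean of the expected cost under renormalized probs."""
    total = 0.0
    for sample_costs, sample_probs in zip(costs, probs):
        # Renormalize over the sample set: q_s is proportional to p_s ** alpha.
        weights = [p ** alpha for p in sample_probs]
        z = sum(weights)
        total += sum(w / z * c for w, c in zip(weights, sample_costs))
    return total / len(costs)


risk = mrt_risk(costs=[[0.0, 1.0]], probs=[[0.75, 0.25]])
```

Losses that do not need probabilities, such as margin-based ones, would instead combine the oracle costs with the scores of the labels and of y_hat.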
