GitHub - chrishokamp/dynamic-transformer-ensembles: Dynamic ensemble decoding with transformer-based models

DynE: Dynamic Ensemble Decoding for Multi-Document Summarization

This repo contains the code for DynE: Dynamic Ensemble Decoding for Multi-Document Summarization.

This code base can be used to add dynamic ensembling capability to models from the Huggingface transformers library.
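The core idea can be illustrated with a toy decoding step. This is a minimal sketch, not the code in transformer_decoding: it assumes the ensemble reduction is a plain average of next-token log-probabilities across the input documents, and the function name `ensemble_step` and the three-token vocabulary are hypothetical. The reduction actually used in this repo may differ.

```python
import math


def ensemble_step(per_doc_logprobs):
    """Average next-token log-probabilities across documents and pick the argmax.

    per_doc_logprobs: one row per input document, each row holding
    log-probabilities over the same (toy) vocabulary.
    NOTE: plain averaging is an assumption made for illustration.
    """
    num_docs = len(per_doc_logprobs)
    vocab_size = len(per_doc_logprobs[0])
    averaged = [
        sum(row[token] for row in per_doc_logprobs) / num_docs
        for token in range(vocab_size)
    ]
    # Greedy choice over the ensembled distribution; beam search would keep
    # the top-k candidates here instead of a single token.
    best_token = max(range(vocab_size), key=lambda t: averaged[t])
    return best_token, averaged


# Toy example: three documents from one cluster, vocabulary of three tokens.
doc_probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.6, 0.2],
]
doc_logprobs = [[math.log(p) for p in row] for row in doc_probs]
token, averaged = ensemble_step(doc_logprobs)
```

In this example the first document alone would prefer token 0, but the ensemble selects token 1 because the other two documents agree on it. All documents then continue decoding from the same shared prefix.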

Setup / Installation

# make a fresh environment
conda create -n dynamic-ensembles python=3.6
conda activate dynamic-ensembles

# Installation 
make dev

Multi-Document Summarization (MDS) Datasets

MDS datasets in the format required by the scripts in this repo:

WCEP (train, val, test)
MultiNews (train, val, test)
DUC2004 (test)

The original WCEP dataset used to generate the flat training data:

WCEP in .jsonl format

Model Checkpoints and Outputs

Model Checkpoints

We fine-tune the bart-large-cnn single-document summarization model from the transformers library.

The best fine-tuned model checkpoints for WCEP and MultiNews are here

Fine-tuned Model Outputs

Download the outputs of fine-tuned models on the test sets of WCEP and MultiNews here

Evaluation

Prediction and evaluation are done by the script transformer_decoding/evaluate.py. There is also a make task for evaluation, which simply calls this script.

For example, to predict using a model id from transformers, or with a fine-tuned model checkpoint, and evaluate with the Ghalandari et al. 2020 evaluation workflow:

MODEL_ID=model_checkpoints/wcep_fine-tune-bart-large/checkpointepoch\=1.ckpt \
RUN_FLAGS='--max-articles-in-cluster 5 --max-src-length 512 --max-tgt-length 64 --num-beams 5 --eval-prefix wcep_5_articles_' \
make evaluate

Pretrained model checkpoints can be downloaded from the links above.

For a quick test, use the --rows-to-eval argument, which will only predict the first N rows from the dataset:

MODEL_ID=model_checkpoints/wcep_fine-tune-bart-large/checkpointepoch\=1.ckpt \
RUN_FLAGS='--max-articles-in-cluster 5 --max-src-length 512 --max-tgt-length 64 --num-beams 5 --rows-to-eval 10 --eval-prefix wcep_5_articles_' \
make evaluate
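The effect of limiting evaluation to the first N rows can be sketched as follows. This is an illustration only, assuming the dataset is a .jsonl file with one JSON object per line; the helper name `read_first_rows` is hypothetical and is not taken from transformer_decoding/evaluate.py.

```python
import itertools
import json


def read_first_rows(path, num_rows=None):
    """Yield parsed JSON objects from a .jsonl file, stopping after num_rows.

    num_rows=None reads the whole file. Blank lines are skipped.
    """
    with open(path, encoding="utf-8") as handle:
        non_empty = (line for line in handle if line.strip())
        # islice stops reading early, so a large file is never fully loaded.
        for line in itertools.islice(non_empty, num_rows):
            yield json.loads(line)
```

For example, `list(read_first_rows("data/WCEP/test.jsonl", 10))` would return the first ten clusters without parsing the rest of the file.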

To run evaluation only, using previously generated predictions, supply the --predictions argument to transformer_decoding/evaluate.py:

EVALUATION_DATASET=data/WCEP/test.jsonl \
RUN_FLAGS='--predictions outputs/wcep/wcep_5_articles_eval_predicted_summaries.out' \
make evaluate

Scoring Gold Summaries by Forced Decoding

To score the gold summaries by forced decoding, pass the --force-decode-gold flag:

EVALUATION_DATASET=data/WCEP/test.jsonl \
RUN_FLAGS='--force-decode-gold --max-articles-in-cluster 5 --max-src-length 512 --max-tgt-length 512 --num-beams 1 --rows-to-eval 10 --eval-prefix wcep_5_articles_' \
make evaluate
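Forced decoding feeds the gold summary to the decoder one token at a time and records the probability the model assigns to each gold token. The sketch below shows only the scoring arithmetic on toy distributions; the function name `score_forced_decode` is hypothetical, and the way this repo aggregates or normalizes scores is not shown here and may differ.

```python
import math


def score_forced_decode(step_logprobs, gold_token_ids):
    """Return (total, per-token average) log-probability of the gold sequence.

    step_logprobs[i] holds the log-probabilities over the vocabulary at
    decoding step i, computed with the gold prefix as decoder input.
    """
    if len(step_logprobs) != len(gold_token_ids):
        raise ValueError("need exactly one distribution per gold token")
    if not gold_token_ids:
        raise ValueError("gold sequence must not be empty")
    total = sum(dist[tok] for dist, tok in zip(step_logprobs, gold_token_ids))
    return total, total / len(gold_token_ids)


# Toy example: two decoding steps over a two-token vocabulary.
steps = [
    [math.log(0.5), math.log(0.5)],
    [math.log(0.25), math.log(0.75)],
]
total, average = score_forced_decode(steps, gold_token_ids=[0, 1])
```

Here the gold sequence receives a total log-probability of log(0.5) + log(0.75). Because no search is involved, a single hypothesis suffices, which is consistent with the --num-beams 1 setting in the command above.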

Citing

If you use ideas or code from this project, please cite:

@article{DynamicEnsembles,
    title = {DynE: Dynamic Ensemble Decoding for Multi-Document Summarization},
    author = {Chris Hokamp and Demian Gholipour Ghalandari and Nghia The Pham
              and John Glover},
    journal={arXiv preprint arXiv:2006.08748},
    year = {2020},
}

