
abstractive-text-summarization

This repository and its notebook contain code for an in-progress implementation of, and experiments on, the paper Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond.

Requirements

  1. Create the conda environment

conda env create -f environment.yml       # gpu

conda env create -f environment-cpu.yml   # cpu

  2. Activate the environment

source activate abs-sum       # gpu

source activate abs-sum-cpu   # cpu

  3. Install dependencies (PyTorch, fastai, etc.) via:

pip install -r requirements.txt

  4. Download the spaCy English model

python -m spacy download en

Dataset

The dataset used is a subset of the Gigaword dataset and can be found here.

It contains 3,803,955 parallel source & target examples for training and 189,649 examples for validation.

After downloading, we created article-title pairs, saved them in a tabular dataset format (.csv), and extracted a sample subset (80,000 examples for training & 20,000 for validation). This data preparation can be found here.
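As a rough illustration, the pairing and sampling step might look like the following minimal pandas sketch (the file names and sampling seed are assumptions; the linked notebook is the actual reference):

```python
import pandas as pd

# Gigaword-style downloads ship parallel source/target text files,
# one example per line (the file names here are assumptions).
with open("train.article.txt") as src, open("train.title.txt") as tgt:
    pairs = pd.DataFrame({
        "source": src.read().splitlines(),
        "target": tgt.read().splitlines(),
    })

# Save the full set, then draw the 80,000-example working subset.
pairs.to_csv("train.csv", index=False)
pairs.sample(n=80_000, random_state=42).to_csv("train_sample.csv", index=False)
```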

An example article-title pair looks like this:

source: the algerian cabinet chaired by president abdelaziz bouteflika on sunday adopted the #### finance bill predicated on an oil price of ## dollars a barrel and a growth rate of #.# percent , it was announced here .

target: algeria adopts #### finance bill with oil put at ## dollars a barrel
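The # characters come from the dataset's digit masking. A minimal sketch of that masking, assuming the rule is simply "replace every digit with #":

```python
import re

def mask_digits(text: str) -> str:
    # Replace every digit with '#', reproducing the masking visible in the
    # example pair above (an assumption about the exact preprocessing rule).
    return re.sub(r"\d", "#", text)

print(mask_digits("an oil price of 50 dollars a barrel and a growth rate of 3.5 percent"))
# an oil price of ## dollars a barrel and a growth rate of #.# percent
```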

Experimenting on the complete dataset (~3.8M examples) would take a very long time and considerable compute cost, so in order to train and experiment faster we use our sample subset of 80,000 examples.

Current Features

  • model architecture supports LSTM & GRU (biLSTM-to-uniLSTM or biGRU-to-uniGRU)
  • implements batch data processing
  • implements attention mechanism (Bahdanau et al. & Luong et al. (global dot)); see the sketch after this list
  • implements scheduled sampling (teacher forcing); sketched below
  • implements tied embeddings (sketched below together with the pretrained-vector initialization)
  • initializes encoder-decoder with pretrained vectors (glove.6B.200d)
  • implements custom training callbacks (TensorBoard visualization for PyTorch, save best model & log checkpoint)
  • implements attention plots
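For readers new to these pieces, here are minimal hedged sketches; all names and sizes are illustrative, not the repo's actual modules. Luong-style global dot attention scores each encoder output against the current decoder state:

```python
import torch
import torch.nn.functional as F

def luong_dot_attention(dec_state, enc_outputs):
    # dec_state:   (batch, hidden)          current decoder hidden state
    # enc_outputs: (batch, src_len, hidden) encoder outputs
    scores = torch.bmm(enc_outputs, dec_state.unsqueeze(2)).squeeze(2)  # (batch, src_len)
    weights = F.softmax(scores, dim=1)                                  # attention distribution
    context = torch.bmm(weights.unsqueeze(1), enc_outputs).squeeze(1)   # (batch, hidden)
    return context, weights
```

Scheduled sampling decides, per decoding step, whether the decoder sees the ground-truth token or its own previous prediction:

```python
import random

def next_decoder_input(target_token, predicted_token, teacher_forcing_ratio=0.5):
    # With probability `teacher_forcing_ratio`, feed the ground-truth token
    # (teacher forcing); otherwise feed the model's own prediction.
    return target_token if random.random() < teacher_forcing_ratio else predicted_token
```

Tied embeddings with pretrained-vector initialization share one weight matrix between the embedding layer and the output projection (emb_dim=200 matches glove.6B.200d; the vocabulary size and the random stand-in matrix are assumptions):

```python
import torch
import torch.nn as nn

vocab_size, emb_dim = 30_000, 200                # 200 matches glove.6B.200d
glove_matrix = torch.randn(vocab_size, emb_dim)  # stand-in for the real GloVe vectors

embedding = nn.Embedding(vocab_size, emb_dim)
embedding.weight.data.copy_(glove_matrix)        # initialize from pretrained vectors

output_proj = nn.Linear(emb_dim, vocab_size, bias=False)
output_proj.weight = embedding.weight            # tying: output projection shares the embedding matrix
```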

To-Do

  • Implement additional linguistic-features embeddings
  • Implement generator-pointer switch and replace unknown words by selecting the source token with the highest attention score (see the sketch after this list)
  • Implement the large vocabulary trick
  • Implement sentence-level attention
  • Implement beam search during inference
  • Implement ROUGE evaluation
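The unknown-word replacement planned in the generator-pointer item can be sketched as a post-processing pass over one decoded example: wherever the model emits <unk>, copy the source token that received the most attention at that step. This is a hedged illustration of the idea, not the repo's implementation:

```python
def replace_unks(output_tokens, source_tokens, attn_weights, unk="<unk>"):
    # attn_weights[t][i]: attention on source position i at decoding step t.
    result = []
    for t, tok in enumerate(output_tokens):
        if tok == unk:
            # Copy the source token with the highest attention at this step.
            best = max(range(len(source_tokens)), key=lambda i: attn_weights[t][i])
            tok = source_tokens[best]
        result.append(tok)
    return result

print(replace_unks(["algeria", "adopts", "<unk>"],
                   ["the", "finance", "bill"],
                   [[0.1, 0.2, 0.7], [0.2, 0.6, 0.2], [0.1, 0.8, 0.1]]))
# -> ['algeria', 'adopts', 'finance']
```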

Baseline Training & Validation Loss

[Plot: baseline training & validation loss curves]
