
Commit 9e793dc

Committed Nov 8, 2019
renew README
1 parent 5839712 commit 9e793dc

File tree: 1 file changed

README.md (+20 −3 lines)
@@ -16,7 +16,22 @@ Overall, LAMAL outperforms previous methods by a considerable margin and is only
 2--3\% worse than multitasking, which is usually considered the LLL upper bound.
 
 ## Dataset
-We first ran the code in https://github.com/salesforce/decaNLP to get the dataset, and then converted them into Squad-like format.
+
+| Task | Dataset (Original Data Link) |
+| ---- | ------- |
+| Question Answering | [SQuAD version 1.1](https://rajpurkar.github.io/SQuAD-explorer/) |
+| Machine Translation | [IWSLT](https://wit3.fbk.eu/mt.php?release=2016-01) |
+| Summarization | [CNN/DM](https://cs.nyu.edu/~kcho/DMQA/) |
+| Natural Language Inference | [MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) |
+| Sentiment Analysis | [SST](https://nlp.stanford.edu/sentiment/treebank.html) |
+| Semantic Role Labeling | [QA‑SRL](https://dada.cs.washington.edu/qasrl/) |
+| Zero-Shot Relation Extraction | [QA‑ZRE](http://nlp.cs.washington.edu/zeroshot/) |
+| Goal-Oriented Dialogue | [WOZ](https://github.com/nmrksic/neural-belief-tracker/tree/master/data/woz) |
+| Semantic Parsing | [WikiSQL](https://github.com/salesforce/WikiSQL) |
+| Commonsense Reasoning | [MWSC](https://s3.amazonaws.com/research.metamind.io/decaNLP/data/schema.txt) |
+| Text Classification | [AGNews, Yelp, Amazon, DBPedia, Yahoo](http://goo.gl/JyCnZq) |
+
+To unify the format of all the datasets, we first ran the code in https://github.com/salesforce/decaNLP to get the first 10 transformed datasets and then converted them into a SQuAD-like format (a sketch of the format follows below). The last 5 datasets were converted directly. All converted datasets are available [here](https://drive.google.com/file/d/1rWcgnVcNpwxmBI3c5ovNx-E8XKOEL77S/view?usp=sharing).
 
 ## Dependencies
 - Ubuntu >= 16.04
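For reference, "SQuAD-like" here presumably means the public SQuAD v1.1 JSON schema, which nests examples as data → paragraphs → qas. Below is a minimal sketch of one such record with invented example values; whether the converted files match this layout field-for-field is an assumption, not something the README states.

```python
import json

# A minimal sketch of one record in the public SQuAD v1.1 schema
# (data -> paragraphs -> qas). Every value below is an invented
# illustration, not data from the actual release.
record = {
    "version": "1.1",
    "data": [{
        "title": "sst",  # hypothetical task name used as a title
        "paragraphs": [{
            "context": "the movie was a delight from start to finish .",
            "qas": [{
                "id": "sst-train-0",  # hypothetical id scheme
                "question": "Is this review positive or negative?",
                "answers": [{"text": "positive", "answer_start": 0}],
            }],
        }],
    }],
}

# SQuAD files are conventionally stored as one JSON object per file.
with open("sst_squad_like.json", "w") as f:
    json.dump(record, f, indent=2)
```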
@@ -31,7 +46,7 @@ We first ran the code in https://github.com/salesforce/decaNLP to get the datase
 1. Create the following two directories wherever you want (you can name them arbitrarily):
 - `data directory`: Where the dataset will be loaded by the model.
 - `model directory`: The place for the model to dump its outputs.
-2. Download the dataset: . After decompression, move all the files in the decompressed directory into `data directory`.
+2. Download the dataset [here](https://drive.google.com/file/d/1rWcgnVcNpwxmBI3c5ovNx-E8XKOEL77S/view?usp=sharing) and decompress it. Then move all the files in the decompressed directory into `data directory`.
 3. Make a copy of `env.example` and save it as `env`. In `env`, set the value of DATA_DIR to `data directory` and the value of MODEL_ROOT_DIR to `model directory`.
 4. Before training or testing, load the DATA_DIR and MODEL_ROOT_DIR variables into the shell environment with the following command:
 ```bash
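# The command body lies outside this diff hunk; as a hedged sketch only
# (not necessarily this repo's exact command), one common way to export
# the KEY=VALUE pairs from `env` into the current shell:
set -a           # auto-export every variable assigned while sourcing
. ./env          # e.g. DATA_DIR=/path/to/data_directory   (placeholder path)
set +a           #      MODEL_ROOT_DIR=/path/to/model_directory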
@@ -98,4 +113,6 @@ This example tests the model trained on sst, srl and woz.en by the finetune method.
 After running the testing program, the metrics file `metrics.json` will be dumped in the same directory as the training outputs (a quick inspection sketch follows below).
 
 ## Acknowledgements:
-
+- We use the language models offered by [transformers](https://github.com/huggingface/transformers), a state-of-the-art natural language processing library by Thomas Wolf et al.
+- Our implementation of MAS refers to [MAS-Memory-Aware-Synapses](https://github.com/rahafaljundi/MAS-Memory-Aware-Synapses), an implementation of the Memory Aware Synapses method by Aljundi R. et al.
+- Our data format conversion refers to [decaNLP](https://github.com/salesforce/decaNLP), the implementation of The Natural Language Decathlon: Multitask Learning as Question Answering by Bryan McCann et al.
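As referenced above, here is a quick way to inspect `metrics.json`. Only the file name comes from the README; the metric keys are not documented there, and the path below is a placeholder.

```python
import json
from pathlib import Path

# Placeholder location: point this at the training-output directory under
# your `model directory`; only the file name `metrics.json` is from the README.
metrics_path = Path("model_directory") / "metrics.json"

with metrics_path.open() as f:
    metrics = json.load(f)

# The README does not document the metric keys, so print whatever is present.
for name, value in metrics.items():
    print(f"{name}: {value}")
```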
