Commit 775da57

added link to 20 newsgroups data website
1 parent 4f94425 commit 775da57

1 file changed: +1 -1 lines changed

Readme.md

+1-1
@@ -24,7 +24,7 @@ where
 
 **Datasets**: A directory containing CSV files. There is expected to be 1 CSV file per set or collection, with separate sets for training, validation and test. The CSV files in the directory must be named accordingly: `training.csv`, `validation.csv`, `test.csv`. For this task, each CSV file (prior to preprocessing) consists of 2 string fields with a comma delimiter - the first is the label and the second is the document body.
 
-**Vocabulary files**: A plain text file, with 1 vocabulary token per line (note that this must be created in advance, we do not provide a script for creating vocabularies). We do provide the vocabulary file used in our 20 Newsgroups experiment in [`data/20newsgroups.vocab`](data/20newsgroups.vocab).
+**Vocabulary files**: A plain text file, with 1 vocabulary token per line (note that this must be created in advance, we do not provide a script for creating vocabularies). We do provide the vocabulary file used in our 20 Newsgroups experiment in [`data/20newsgroups.vocab`](data/20newsgroups.vocab). If you wish to play with the actual 20 Newsgroups data, it's available [here](http://qwone.com/~jason/20Newsgroups/).
 
 
 ## Training
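
The dataset and vocabulary formats described in the README text of this diff could be read roughly as follows. This is only a minimal sketch, not the repository's own loader: the function names `load_split`, `load_vocab` and `load_dataset_dir` are hypothetical, and it assumes UTF-8 files, no header row, and standard CSV quoting for document bodies that contain commas.

```python
import csv
from pathlib import Path

# The three file names the README requires inside the dataset directory.
SPLITS = ("training", "validation", "test")


def load_split(path):
    """Read one CSV split as a list of (label, document) pairs.

    Assumption: no header row, and each non-empty row has exactly 2 fields.
    """
    pairs = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            if not row:
                continue  # skip blank lines
            if len(row) != 2:
                raise ValueError(f"expected 2 fields, got {len(row)} in {path}")
            pairs.append((row[0], row[1]))
    return pairs


def load_dataset_dir(directory):
    """Load training.csv, validation.csv and test.csv from one directory."""
    directory = Path(directory)
    return {name: load_split(directory / f"{name}.csv") for name in SPLITS}


def load_vocab(path):
    """Map each token (one per line) to its zero-based line index."""
    with open(path, encoding="utf-8") as f:
        tokens = [line.rstrip("\n") for line in f if line.strip()]
    return {token: index for index, token in enumerate(tokens)}
```

A call such as `load_vocab("data/20newsgroups.vocab")` would then give a token-to-index mapping for the vocabulary file mentioned in the README; whether the project itself indexes tokens this way is not stated in the source.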
