This is the code for the SentencePiece Demo.
To run the code locally, follow the steps below to install the necessary libraries and download the test dataset we use to train our models. Alternatively, you can use this repo to run the code on FloydHub, where the dataset is already configured and ready to go.
pip3 install sentencepiece
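To confirm the install worked before opening the notebook, a small check like the following may help. This is only a sketch: the helper name `is_installed` is hypothetical and not part of this repo, and it only tests whether Python can locate the package, not whether it works correctly.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    # find_spec returns None when no top-level module of that name is found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if is_installed("sentencepiece"):
        print("sentencepiece is available")
    else:
        print("sentencepiece not found; run: pip3 install sentencepiece")
```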
Download the blog corpus dataset.
You can download it from within the notebook or from the command line.
Whichever you use, unzip the archive and note the directory you extracted it to, since you will need that path later.
wget http://www.cs.biu.ac.il/~koppel/blogs/blogs.zip
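If you prefer to unzip the archive from Python (for example inside the notebook), the standard library `zipfile` module can do it. The sketch below assumes the download was saved as `blogs.zip` in the current directory and extracts it into a folder named `data`. Both the folder name and the helper `extract_archive` are illustrative choices, not something this repo defines.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract a zip archive and return (absolute destination, member names)."""
    dest = Path(dest_dir)
    # Create the destination directory if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        names = archive.namelist()
    return dest.resolve(), names


if __name__ == "__main__":
    archive_path = Path("blogs.zip")  # assumed download location
    if archive_path.exists():
        data_dir, members = extract_archive(archive_path, "data")
        # Keep a note of this path; it is the directory to remember.
        print(f"Extracted {len(members)} entries to {data_dir}")
    else:
        print("blogs.zip not found; download it first")
```

The returned path is absolute, so it can be reused as the dataset directory regardless of where the code is run from.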