This program parses Wikipedia database dumps for consumption by semanticizest.
Make sure you have a Go compiler (1.2 or newer) and Git. On Debian/Ubuntu/Mint, that's:
sudo apt-get install git golang-go
On CentOS:
sudo yum -y install git golang
Set up a Go workspace, if you haven't already. For example:
mkdir /some/where/go
cd /some/where/go
export GOPATH=$(pwd)
Fetch and compile:
go get github.com/semanticize/st
go install github.com/semanticize/st/dumpparser
go install github.com/semanticize/st/semanticizest
You now have a working parser at ${GOPATH}/bin/dumpparser. Issue:
${GOPATH}/bin/dumpparser --help
to figure out how to generate a semanticizer model, then use this model from the REST API:
${GOPATH}/bin/semanticizest --http=:5002 your_model
curl http://localhost:5002/all -d 'Does the entity linking work?'
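The same request can be made from a Go program instead of curl. The following is a minimal sketch, not part of semanticizest itself: it assumes the server is listening on localhost:5002 as in the command above, and the helper name newLinkRequest is made up for illustration. The response format is not described here, so the sketch prints the response body unmodified.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
)

// newLinkRequest (hypothetical helper) builds a POST request that carries
// text as the raw request body, mirroring `curl <baseURL>/all -d '<text>'`.
func newLinkRequest(baseURL, text string) (*http.Request, error) {
	url := strings.TrimRight(baseURL, "/") + "/all"
	req, err := http.NewRequest("POST", url, strings.NewReader(text))
	if err != nil {
		return nil, err
	}
	// curl -d sends this content type by default.
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return req, nil
}

func main() {
	// Assumption: the server was started with --http=:5002.
	req, err := newLinkRequest("http://localhost:5002", "Does the entity linking work?")
	if err != nil {
		fmt.Fprintln(os.Stderr, "building request failed:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	defer resp.Body.Close()

	// Print the response body as-is; its structure is not assumed here.
	if _, err := io.Copy(os.Stdout, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, "reading response failed:", err)
	}
}
```

Run it with "go run" while semanticizest is serving in another terminal. If the server is not running, the program reports the connection error on standard error.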