A Rust implementation of TinySegmenter, a compact Japanese tokenizer.
Add the following to your project's Cargo.toml:
[dependencies]
tinysegmenter = "0.1"
and import the crate with extern crate:
extern crate tinysegmenter;

fn main() {
    let tokens = tinysegmenter::tokenize("私の名前は中野です");
    println!("{}", tokens.join("|")); // 私|の|名前|は|中野|です
}
Copyright (c) 2015 woxtu
Licensed under the MIT license.