New perl_parity:bool argument for MosesPunctNormalizer that fixes differences between the latest Perl implementation and sacremoses. In a future release this will probably become the default and only behaviour. #146
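A minimal sketch of how the new flag might be used, assuming sacremoses 0.1.0 or later is installed. `compare_normalization` is a hypothetical helper written for this note, not part of the library; it only relies on the `MosesPunctNormalizer` constructor and its `normalize()` method. No expected output is shown, since the exact differences depend on the input text.

```python
def compare_normalization(text, lang="en"):
    """Normalize `text` with and without perl_parity (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without sacremoses installed.
    from sacremoses import MosesPunctNormalizer

    default = MosesPunctNormalizer(lang=lang)
    # perl_parity is the argument announced in this release (sacremoses >= 0.1.0).
    parity = MosesPunctNormalizer(lang=lang, perl_parity=True)

    return {
        "default": default.normalize(text),
        "perl_parity": parity.normalize(text),
    }
```

Calling `compare_normalization("«Hello», she said…")` returns a dict with both variants, which makes it easy to check whether switching the flag would change output for your own data before it becomes the default.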
MosesTokenizer speed-up thanks to precompiled regular expressions #133, #139. Same for MosesDetokenizer #143.
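For readers unfamiliar with the technique, here is a generic sketch of precompiling regular expressions using only the standard library. The rules and the `PrecompiledRewriter` class are made up for illustration and are not the actual sacremoses patterns or code. Python's `re` module already caches recently used patterns, so the gain comes from skipping that lookup on every call and will vary by workload.

```python
import re

# Hypothetical substitution rules, NOT the real sacremoses tables.
RULES = [
    (r"([,;:!?])", r" \1 "),  # pad selected punctuation with spaces
    (r"\s+", " "),            # collapse runs of whitespace
]


class PrecompiledRewriter:
    """Applies substitution rules that are compiled once at construction."""

    def __init__(self, rules):
        # Compile each pattern a single time and keep the compiled object.
        self._compiled = [(re.compile(pattern), repl) for pattern, repl in rules]

    def apply(self, text):
        for regex, repl in self._compiled:
            text = regex.sub(repl, text)
        return text.strip()


def apply_uncompiled(text, rules=RULES):
    """Baseline: hands pattern strings to re.sub on every call."""
    for pattern, repl in rules:
        text = re.sub(pattern, repl, text)
    return text.strip()


rewriter = PrecompiledRewriter(RULES)
print(rewriter.apply("Hello,   world!"))  # prints "Hello , world !"
```

Both variants produce the same output; only the point at which the patterns are compiled differs.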
A couple of bugfixes: The order of the protected_patterns list passed to MosesTokenizer.tokenize() is no longer significant. Also, use_known now works as expected in MosesTruecaser.truecase(). #121. Since this changes the output, I've decided to bump the version to 0.1.0 to signal a possibly breaking change.
Finally, long gone but never released: No more Python 2 support code (bye six 👋)
This is the first release of sacremoses under HPLT stewardship 🎉