Incorrect detection

Hi guys!
I use py3langid==0.2.2 and found that in some cases Chinese gets a higher probability than it should. For example:

```
from py3langid.langid import LanguageIdentifier, MODEL_FILE

identifier = LanguageIdentifier.from_pickled_model(MODEL_FILE, norm_probs=True)
identifier.rank("Al furjan")
```
outputs:
[('zh', 0.24405981600284576), ('fi', 0.16715779900550842), ('mt', 0.1392195224761963), ('et', 0.10675894469022751), ('sl', 0.07787516713142395), ('en', 0.05285739526152611)......]

I understand that the text is quite short and that it may return languages other than English, but Chinese?

Incorrect detection #17