Replies: 1 comment
-
Hi @puopg, thank you for using my library and for reaching out. Statistical approaches to language detection are never 100% correct. Based on the training data I've used, the ngrams in the word "hello" are more characteristic of Italian than of English. If you compare some ordinary Italian and English texts, I'm sure you will find the ngrams 'ell' and 'llo' to be much more frequent in the Italian text. That's where the probabilities come from.
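To make the reasoning above concrete, here is a minimal toy sketch of trigram-based scoring. It is not the library's actual model or API: the two tiny corpora are made up for illustration, and the function names (`trigrams`, `train`, `log_score`, `relative_confidence`) as well as the add-one smoothing are assumptions of this sketch.

```python
import math
from collections import Counter


def trigrams(text):
    """Return all overlapping 3-character ngrams of a lowercased string."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def train(corpus):
    """Count trigrams word by word in a whitespace-separated toy corpus."""
    counts = Counter()
    for word in corpus.split():
        counts.update(trigrams(word))
    return counts


def log_score(word, counts, vocab_size):
    """Sum of log probabilities of the word's trigrams (add-one smoothing)."""
    total = sum(counts.values())
    return sum(
        math.log((counts[g] + 1) / (total + vocab_size))
        for g in trigrams(word)
    )


def relative_confidence(word, models):
    """Normalize per-language scores so that they sum to 1."""
    vocab = set().union(*models.values())  # all trigrams seen in any language
    scores = {
        lang: log_score(word, counts, len(vocab))
        for lang, counts in models.items()
    }
    top = max(scores.values())
    exp = {lang: math.exp(s - top) for lang, s in scores.items()}
    norm = sum(exp.values())
    return {lang: e / norm for lang, e in exp.items()}


# Made-up toy corpora (assumption): real training data is far larger.
models = {
    "italian": train("bello quello cappello fratello sorella ombrello castello"),
    "english": train("the quick brown fox jumps over a lazy dog while we tell them well"),
}

conf = relative_confidence("hello", models)
for lang, value in sorted(conf.items(), key=lambda kv: -kv[1]):
    print(f"{lang}: {value:.2f}")
```

With these toy corpora, 'ell' and 'llo' occur far more often in the Italian sample, so Italian receives the higher share even though "hello" is an English word. The exact percentages depend entirely on the training data.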
-
I configured the language detector with only English and Italian and, perhaps naively, I was surprised that it resulted in a 68% vs. 31% match for Italian and English, respectively.
I'm a bit curious as to why that is.