Is it expected behaviour that "What's up?" is classified as TSWANA? #59

Paethon · 2022-08-17T10:25:37Z

Hi

I guess the title says it all. When I classify the sentence "What's up?" without restricting the set of languages to consider, it is classified as Tswana instead of English. I am of course aware that something like this can happen when statistical models are applied to very short texts.

I just want to check if this is expected behaviour or a bug after all.

Cheers, Sebastian

pemistahl · 2022-08-18T09:41:05Z

Maintainer

Hi Sebastian, thanks for your question.

No, this is not a bug. For the text "What's up?", the sum of the ngram probabilities for Tswana is simply greater than the sum of the ngram probabilities for English. If you take the text "What is up?" instead, the detector will return English.
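
To make the effect concrete, here is a minimal toy sketch of this kind of n-gram scoring. It is not the library's implementation: the trigram tables, the `FLOOR` fallback for unseen trigrams, the use of summed log-probabilities, and the names `lang_a` and `lang_b` are all assumptions invented for illustration, and none of the numbers come from a real language model. It only shows how a few trigrams that one model has hardly seen (here the ones around the apostrophe) can dominate the total for a very short text.

```python
import math

# Hypothetical fallback probability for trigrams a model has never seen.
FLOOR = 1e-6

# Invented trigram probabilities for two hypothetical languages.
# These values are made up and NOT taken from any real model.
TOY_MODELS = {
    "lang_a": {
        "wha": 0.02, "hat": 0.03, "at ": 0.02, "t i": 0.01,
        " is": 0.04, "is ": 0.04, "s u": 0.005, " up": 0.01,
    },
    "lang_b": {
        "wha": 0.001, "hat": 0.002, "at'": 0.004, "t's": 0.004,
        "'s ": 0.004, "at ": 0.001, "t i": 0.001, " is": 0.001,
        "is ": 0.001, "s u": 0.001, " up": 0.001,
    },
}


def trigrams(text):
    """Lowercase the text, drop trailing '?', and return its character trigrams."""
    cleaned = text.lower().rstrip("?")
    return [cleaned[i:i + 3] for i in range(len(cleaned) - 2)]


def score(text, model):
    """Sum the log-probabilities of all trigrams under one toy model."""
    return sum(math.log(model.get(gram, FLOOR)) for gram in trigrams(text))


def classify(text):
    """Return the name of the toy model with the highest total score."""
    return max(TOY_MODELS, key=lambda name: score(text, TOY_MODELS[name]))


print(classify("What's up?"))   # lang_b
print(classify("What is up?"))  # lang_a
```

In this toy setup, `lang_a` scores well on every trigram of "What is up?", but three of the seven trigrams in "What's up?" fall back to `FLOOR`, which is enough to hand the short text to `lang_b`.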

Paethon Sep 9, 2022
Author

Thanks. That is what I expected. Just wanted to make sure 😁
