How to handle "e.g.", "i.e.", "viz.", etc. #13795
Unanswered
fi11222 asked this question in Help: Other Questions
The standard English models (I currently use en_core_web_trf) do not seem to handle these common abbreviations properly.
Currently, they are tagged as foreign words (pos_ = X, tag_ = FW), and the final "." is split off from "viz." and treated as a separate token.
Is there something I can do to avoid that?
I did not find anything about this issue, either here or elsewhere on the web.
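
For reference, here is a minimal sketch of the tokenization side of the problem and one possible workaround, assuming spaCy v3 is installed. It uses `spacy.blank("en")` rather than en_core_web_trf so that no model download is needed, which means it only exercises the rule-based tokenizer and says nothing about the pos_/tag_ values. The helper name `keep_abbreviation_whole` and the sample sentence are made up for illustration; `Tokenizer.add_special_case` is the existing spaCy call being tried.

```python
import spacy
from spacy.symbols import ORTH

# Assumption: a blank English pipeline stands in for en_core_web_trf here.
# It has the English tokenizer rules but no tagger, so only token
# boundaries can be inspected, not pos_ or tag_.
nlp = spacy.blank("en")


def keep_abbreviation_whole(nlp, abbreviation):
    """Hypothetical helper: register `abbreviation` as a tokenizer special
    case so that it is emitted as a single token, trailing period included."""
    nlp.tokenizer.add_special_case(abbreviation, [{ORTH: abbreviation}])


# Made-up sample sentence containing the abbreviation in question.
text = "Two models were tested, viz. the small one and the large one."

# Token texts with the default tokenizer rules.
before = [token.text for token in nlp(text)]

# Register the special case, then tokenize the same text again.
keep_abbreviation_whole(nlp, "viz.")
after = [token.text for token in nlp(text)]

print(before)
print(after)
```

The same call should also work on `nlp.tokenizer` of a loaded pipeline such as en_core_web_trf, since the special case is a tokenizer rule rather than something learned by the model. Whether the tagger then assigns something other than X / FW to the merged token is a separate question that this sketch does not settle.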