ramSeraph changed the title from "Replacement of semicolon by visarga for normalization in Telugu creates invalid words" to "Replacement of semicolon by visarga for normalization in Telugu creates invalid/misformed words" on Sep 17, 2024
The normalization rule that unconditionally replaces ':' with 'ః' (visarga) creates Unicode sequences that are not valid.
Example: "హైదరాబాద్:" becomes "హైదరాబాద్ః", which is not a valid, well-formed Telugu Unicode sequence: the visarga ends up directly after the virama (్).
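To make the failure concrete, here is a minimal Python sketch that contrasts the unconditional replacement with one possible guard. It is not the library's code: the function names `naive_normalize` and `guarded_normalize` are hypothetical, and the guard (only convert a colon that directly follows a Telugu vowel, consonant, or dependent vowel sign) is an assumed heuristic, not a confirmed fix.

```python
import re

VISARGA = "\u0C03"  # TELUGU SIGN VISARGA
VIRAMA = "\u0C4D"   # TELUGU SIGN VIRAMA

# Assumed heuristic: a colon is treated as a visarga only when it directly
# follows an independent vowel or consonant (U+0C05..U+0C39) or a dependent
# vowel sign (U+0C3E..U+0C4C). A colon after a virama, a digit, whitespace,
# or non-Telugu text is left alone.
_COLON_AFTER_TELUGU_LETTER = re.compile(r"(?<=[\u0C05-\u0C39\u0C3E-\u0C4C]):")


def naive_normalize(text: str) -> str:
    """Hypothetical stand-in for the unconditional rule described above."""
    return text.replace(":", VISARGA)


def guarded_normalize(text: str) -> str:
    """Replace ':' with a visarga only where the guard above allows it."""
    return _COLON_AFTER_TELUGU_LETTER.sub(VISARGA, text)


if __name__ == "__main__":
    city = "హైదరాబాద్:"   # ends in virama + ':'
    word = "దు:ఖం"        # ':' typed in place of a visarga inside a word
    for sample in (city, word):
        print(repr(sample), repr(naive_normalize(sample)), repr(guarded_normalize(sample)))
```

With this guard the word-final colon in "హైదరాబాద్:" stays a colon, while a colon typed inside a word after a vowel sign is still converted. It remains a heuristic: a punctuation colon directly after a word that ends in a vowel would still be rewritten, so a stricter rule may be needed.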
This is the offending rule in the TeluguNormalizer: