
Update the strip_tashkeel and strip_diactricts to remove the alef after tanween al fateh #70

Open
Mansari opened this issue Apr 9, 2023 · 2 comments

Comments

@Mansari

Mansari commented Apr 9, 2023

The strip_tashkeel and strip_diactricts functions are very helpful when preprocessing text that will be used for search. With them, one can search for a word such as رحيم without tashkeel. One challenge, however, is that this will not match a word that ends with tanween al fateh (fathatan): after the tashkeel is removed, the word still carries the trailing alef (رحيما), so its spelling differs from رحيم.

I suggest adding an optional flag, off by default so that existing behavior is preserved, that also removes the alef when it comes after tanween al fateh. See https://en.rattibha.com/thread/1266046390439903234 for details.
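
To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is plain Python with `re` and does not use the library's code; the function name `strip_tashkeel_sketch`, the flag name `strip_tanween_alef`, and the choice of U+064B to U+0652 as "tashkeel" are all my own assumptions for illustration, not the existing API.

```python
import re

# Assumption: "tashkeel" here means the marks U+064B..U+0652
# (tanween, harakat, shadda, sukun). The library may define it differently.
TASHKEEL = re.compile("[\u064b-\u0652]")

# Fathatan (U+064B) written either before or after a word-final alef (U+0627).
# The lookahead requires that no Arabic letter or mark follows, so the
# alef must be the end of the word.
TANWEEN_ALEF = re.compile("(?:\u064b\u0627|\u0627\u064b)(?![\u0621-\u0652])")


def strip_tashkeel_sketch(text, strip_tanween_alef=False):
    """Remove tashkeel; optionally also drop the alef that carries fathatan.

    `strip_tanween_alef` is a hypothetical flag name. It defaults to False
    so that the output matches the current behavior unless requested.
    """
    if strip_tanween_alef:
        # Must run before the marks are stripped, while the fathatan
        # is still there to identify which alef to remove.
        text = TANWEEN_ALEF.sub("", text)
    return TASHKEEL.sub("", text)


word = "\u0631\u062d\u064a\u0645\u064b\u0627"  # raheem + fathatan + alef
print(strip_tashkeel_sketch(word))                           # alef kept
print(strip_tashkeel_sketch(word, strip_tanween_alef=True))  # alef removed
```

This is only a heuristic: it removes any word-final alef next to a fathatan, so words whose final alef belongs to the stem would also be shortened. That seems acceptable for search normalization, but it is one more reason to keep the flag optional.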

Thank you for the amazing library!

@linuxscout
Owner

Thank you, brother Mohamed.
It's a great suggestion; we can add it.
Thanks

@Mansari
Author

Mansari commented Apr 16, 2023

Amazing - anything I can help with insha Allah?
