-
Notifications
You must be signed in to change notification settings - Fork 156
Open
Description
This is relevant specifically to grc. Because modern books of Ancient Greek often has to mark out uncertain letters in ancient sources, letters with dot below are a common occurrence but are at present not recognised by tesseract.
A fairly complete list of letters with dot below (except for the lunate sigma ϲ̣) can be found here: https://titus.uni-frankfurt.de/unicode/unicsel/grkkadd.htm
I wonder if recognising dot below shouldn’t be a feature behind a flag to be manually turned on because it might also pick up stains in older books (which however tend not to have such dots & so don’t require this feature). But this could make it difficult to deploy the feature in downstream projects like Internet Archive.
Metadata
Metadata
Assignees
Labels
No labels