[models] Handle watermarks #405
charlesmindee
started this conversation in
Ideas
Replies: 2 comments
-
More samples below:
-
Moving this to a discussion, since it is a deep, very open-ended, long-term issue.
-
Handle watermarks: we don't want the main information of the document to be polluted by watermarks, so we need an OCR that can distinguish watermarks and filter them out.
We could work on word occurrences and look for areas covered by highly repeated words; we could also work on relative contrast.
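
To make the word-occurrence idea concrete, here is a minimal sketch in plain Python. It is an illustration under assumptions, not an existing implementation: the `Word` type, the `split_watermark_words` helper, and the `min_count` / `min_spread` thresholds are all hypothetical, and boxes are assumed to be `(xmin, ymin, xmax, ymax)` in page-relative coordinates. A word is flagged when it repeats often and its occurrences span a large share of the page.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Word:
    """Hypothetical OCR output: a text and a box in relative page coordinates."""
    text: str
    box: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), each in [0, 1]


def split_watermark_words(
    words: List[Word], min_count: int = 4, min_spread: float = 0.5
) -> Tuple[List[Word], List[Word]]:
    """Return (kept, suspected_watermark).

    A text is suspected when it occurs at least `min_count` times and the
    rectangle enclosing all its occurrences covers at least `min_spread`
    of the page area. Both thresholds are illustrative guesses.
    """
    # Group occurrences by normalized text.
    groups = defaultdict(list)
    for word in words:
        groups[word.text.strip().lower()].append(word)

    flagged = set()
    for key, group in groups.items():
        if len(group) < min_count:
            continue
        # Rectangle enclosing every occurrence of this text.
        xmin = min(w.box[0] for w in group)
        ymin = min(w.box[1] for w in group)
        xmax = max(w.box[2] for w in group)
        ymax = max(w.box[3] for w in group)
        if (xmax - xmin) * (ymax - ymin) >= min_spread:
            flagged.add(key)

    kept = [w for w in words if w.text.strip().lower() not in flagged]
    suspected = [w for w in words if w.text.strip().lower() in flagged]
    return kept, suspected


# Usage: "DRAFT" is tiled over the page, the other words appear once.
sample = [
    Word("DRAFT", (0.1, 0.10, 0.3, 0.15)),
    Word("DRAFT", (0.7, 0.10, 0.9, 0.15)),
    Word("DRAFT", (0.4, 0.45, 0.6, 0.50)),
    Word("DRAFT", (0.1, 0.80, 0.3, 0.85)),
    Word("DRAFT", (0.7, 0.80, 0.9, 0.85)),
    Word("Invoice", (0.1, 0.20, 0.3, 0.25)),
    Word("Total", (0.1, 0.60, 0.2, 0.65)),
]
kept, suspected = split_watermark_words(sample)
print([w.text for w in kept])
```

On its own this heuristic would also flag frequent ordinary words (articles, prepositions) on a dense page, so it would likely need to be combined with another signal, such as the relative contrast mentioned above, before being used as a filter.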