Add ICDAR 2023 Datasets #1332
-
Hi @HamzaGbada, thanks a lot for the datasets list :) I will check this tonight or maybe tomorrow morning 👍🏼
-
1. DocVQA2020-23 - Looks interesting (contains Azure Document AI (OCR) annotations), ~13K samples
Additional: I bolded the interesting ones :)
-
Summary:
I would like to propose the addition of several datasets published at ICDAR 2023 to the doctr repository. These datasets are valuable resources for the document analysis and recognition community. Additionally, I'd like to suggest the inclusion of a new dataset named WildReceipt (not published at ICDAR 2023). This will enhance the repository's usefulness for researchers and practitioners working in this field.
Datasets to be Added:
Why These Datasets?
ICDAR 2023 is a reputable conference in the field of document analysis and recognition. The datasets listed above are newly published and are crucial for advancing research and development in this area. They cover various aspects of document analysis, including text recognition, scene text detection, document layout analysis, and more.
Additional Information:
For more information about these datasets, check this link
What do you think, @felixdittrich92?