add coco-text as a test/train set #1131
Comments
Hey @Thomas-MMJ 👋, thanks for the request. Would you like to add it yourself?
Hi @felixdittrich92, has anybody worked on this? I'd love to hop into the project and contribute to this issue. :)
Hey @dvando 👋, no, it's still open.
Hi @felixdittrich92, my apologies, it took me a while to actually work on this; I've been dealing with some issues at work. I've got some questions about the download URLs: COCO-Text has two separate URLs, the first for the images and the second for the labels. I also checked the other datasets (funsd, cord, synthtext, etc.), and all of them initialize the […]. Sorry, and thanks in advance. :)
Hi @dvando 😄 Option 1: You could take a look at https://github.com/mindee/doctr/blob/main/doctr/datasets/imgur5k.py (here the user needs to provide the paths to the data and we provide only the loader).
So with option 1, users download the images and the labels themselves? That sounds okay. Both sound fine to me, which one do you prefer @felixdittrich92? :)
Option 1 👍
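For illustration, a loader in the style of option 1 (the user supplies the image folder and the label file, the library only parses them) might look roughly like the sketch below. This is not docTR's implementation: the function name `load_cocotext_annotations` is hypothetical, and the JSON layout used here (`imgs` and `anns` dictionaries, `bbox` stored as `[x, y, width, height]`, `utf8_string`, `legibility`, `set`) is an assumption about the COCO-Text annotation file that should be checked against the downloaded file.

```python
from pathlib import Path
from typing import Any, Dict, List, Tuple


def load_cocotext_annotations(
    annotations: Dict[str, Any], img_folder: str, split: str = "train"
) -> List[Tuple[str, List[Dict[str, Any]]]]:
    """Group legible text boxes by image for one split.

    Assumed layout (verify against the real file): `imgs` maps image ids to
    records with `id`, `file_name` and `set`; `anns` maps annotation ids to
    records with `image_id`, `bbox` as [x, y, width, height], `utf8_string`
    and `legibility`.
    """
    samples: Dict[Any, Tuple[str, List[Dict[str, Any]]]] = {}
    # Keep only the images that belong to the requested split.
    for img in annotations["imgs"].values():
        if img["set"] == split:
            samples[img["id"]] = (str(Path(img_folder) / img["file_name"]), [])
    # Attach legible annotations, converting boxes to (xmin, ymin, xmax, ymax).
    for ann in annotations["anns"].values():
        if ann["image_id"] not in samples or ann.get("legibility") != "legible":
            continue
        x, y, w, h = ann["bbox"]
        samples[ann["image_id"]][1].append(
            {"box": (x, y, x + w, y + h), "text": ann.get("utf8_string", "")}
        )
    return list(samples.values())


# Tiny in-memory stand-in for the parsed annotation file (made-up values).
mock_annotations = {
    "imgs": {
        "1": {"id": 1, "file_name": "a.jpg", "set": "train"},
        "2": {"id": 2, "file_name": "b.jpg", "set": "val"},
    },
    "anns": {
        "10": {"image_id": 1, "bbox": [1.0, 2.0, 3.0, 4.0],
               "utf8_string": "STOP", "legibility": "legible"},
        "11": {"image_id": 1, "bbox": [0.0, 0.0, 1.0, 1.0],
               "legibility": "illegible"},
    },
}
train_samples = load_cocotext_annotations(mock_annotations, "images", split="train")
```

With a real download, the dictionary would come from `json.load` on the label file, and the returned paths and boxes could then be wrapped in whatever dataset class the project uses.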
Reference PR: #1359 :)
@felixdittrich92, thanks in advance.
Sure @sarjil77 :)
First download the dataset: annotations from https://bgshih.github.io/cocotext/ and images from http://images.cocodataset.org/zips/train2014.zip
The following PR can be used as a reference: https://github.com/mindee/doctr/pull/1359/files
In […] (line 675 in ebfc9f3)
This fixture is used in doctr/tests/pytorch/test_datasets_pt.py (line 741 in ebfc9f3)
and doctr/tests/tensorflow/test_datasets_tf.py (line 714 in ebfc9f3).
You can place the coco_text tests directly after the wildreceipt test cases :)
As the last step, we add the documentation entries. If you need any further information, feel free to ask 👍
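As a rough illustration of the mock-dataset step, the helper below writes a small fake image folder and label file that a test could point a loader at. It is a sketch under assumptions, not the fixture from the reference PR: the name `make_mock_cocotext`, the file name `cocotext.json`, and the annotation fields are hypothetical, and the image files are empty placeholders, so a real fixture would need to write decodable images instead.

```python
import json
import tempfile
from pathlib import Path
from typing import Tuple


def make_mock_cocotext(root: Path, num_images: int = 2) -> Tuple[Path, Path]:
    """Write placeholder images and a minimal annotation file under `root`.

    All names and fields are illustrative assumptions, not the real
    COCO-Text files or the docTR test fixture.
    """
    img_dir = root / "train2014"
    img_dir.mkdir(parents=True, exist_ok=True)
    imgs, anns = {}, {}
    for idx in range(num_images):
        name = f"mock_image_{idx}.jpg"
        # Empty placeholder: enough to test path handling, not image decoding.
        (img_dir / name).write_bytes(b"")
        imgs[str(idx)] = {"id": idx, "file_name": name, "set": "train"}
        anns[str(idx)] = {
            "image_id": idx,
            "bbox": [10.0, 10.0, 20.0, 8.0],  # [x, y, width, height]
            "utf8_string": "text",
            "legibility": "legible",
        }
    label_path = root / "cocotext.json"
    label_path.write_text(json.dumps({"imgs": imgs, "anns": anns}))
    return img_dir, label_path


# Example: build the mock data in a temporary directory and inspect it.
with tempfile.TemporaryDirectory() as tmp:
    mock_img_dir, mock_label_path = make_mock_cocotext(Path(tmp))
    mock_labels = json.loads(mock_label_path.read_text())
    mock_image_names = sorted(p.name for p in mock_img_dir.iterdir())
```

In a pytest suite, the same logic would typically live in a fixture that receives a temporary directory and returns the two paths.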
Thanks @felixdittrich92, I will look into it further.
🚀 The feature
You might consider adding COCO-Text as one of the supported datasets:
https://vision.cornell.edu/se3/coco-text-2/#download
Motivation, pitch
It is another high-quality dataset, with text on objects at various angles (sides of vehicles, signs, etc.).
Alternatives
No response
Additional context
No response