Cannot use builtin datasets for detection training #1830
Comments
Hi @KenjiTakahashi 👋,
Yeah, you are right: currently only the validation scripts provide direct support for the built-in datasets, so your "quick fix" is valid. Support for the built-in datasets is something we could add; especially for pretraining on SynthText it would be useful.
The issue here is …
Have you tried any of the other built-in datasets? With every other one it should work (only SVHN is a bit "special").
Best regards
I am working on this; any reference or further info regarding this issue would be helpful. Thanks :)
Hi @sarjil77 👋,
So the idea is that people can use the built-in datasets we provide (only the ones which are directly available via download, where we don't need to provide the path to an images/labels file/folder) for the detection and recognition training. I would suggest starting with recognition, that should be easier. Currently we have the train options:
A third option then would be something like:
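A hedged sketch of what such an option could look like (the flag name --train_datasets, the nargs handling and the choices list are illustrative assumptions, not existing doctr arguments):

import argparse

# Illustrative only: a possible CLI option for selecting built-in datasets by name
# instead of passing a path. Flag name and choices are assumptions, not doctr API.
parser = argparse.ArgumentParser(description="doctr recognition training (sketch)")
parser.add_argument(
    "--train_datasets",
    type=str,
    nargs="+",
    default=None,
    choices=["CORD", "FUNSD", "IC03", "IIIT5K", "SVHN", "SVT", "SynthText"],
    help="one or more built-in datasets to train on, e.g. --train_datasets CORD FUNSD",
)

Using nargs="+" keeps the door open for combining several built-in datasets in a single run, which is what the extension logic further below relies on.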
Same for the … Then modify doctr/references/recognition/train_pytorch.py, line 214 (commit 3e21320)
and doctr/references/recognition/train_pytorch.py, line 287 (commit 3e21320).
If multiple built-in datasets are passed, we init the first one and extend its data with the data from the other ones, like here: …
For detection it looks mostly similar, but it requires a little trick to transform the data format, so let's do 2 PRs, starting with the recognition one. Does this help? :)
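For illustration, a minimal sketch of the kind of format conversion that could be needed (untested; it assumes the built-in dataset yields targets like {"boxes": ..., "labels": ...}, while the detection side expects a mapping from class name to a box array, as DetectionDataset produces):

import numpy as np

# Hypothetical helper (not part of doctr): collapse a built-in dataset's target
# into a single-class mapping for detection training.
def to_detection_target(target: dict, class_name: str = "words") -> dict:
    # "words" is assumed here as the single text class name; adjust if needed
    boxes = np.asarray(target["boxes"], dtype=np.float32)
    return {class_name: boxes}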
Yeah, this helps, I have already started working on this. Thanks :)
Hey @felixdittrich92, I did some changes and was able to load multiple built-in datasets, but I am facing some issues, using: …
So should I pad here, or resize, or is there any other way you would suggest?
Yeah, at a minimum we need to apply the Resize transform, like here: doctr/references/recognition/train_pytorch.py, line 232 (commit 3e21320) :)
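A minimal sketch of that transform (assuming doctr's Resize exposes preserve_aspect_ratio and symmetric_pad; the target shape is illustrative):

from doctr import transforms as T

# Resizing every crop to one fixed shape lets the default collate function
# stack samples that originally had different sizes.
resize = T.Resize(
    (32, 128),                   # (height, width) target shape, illustrative values
    preserve_aspect_ratio=True,  # keep the original aspect ratio ...
    symmetric_pad=True,          # ... and pad symmetrically instead of stretching
)

With preserve_aspect_ratio plus symmetric_pad this should effectively act as "resize and pad", so a separate padding step should not be needed.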
Addition: Pseudo code (untested):

# Imports (assumed; adjust to match the reference script)
from torchvision.transforms import Compose

from doctr import datasets
from doctr import transforms as T

train_datasets = args.train_datasets

# Build the first built-in dataset with resizing and augmentations
train_set = datasets.__dict__[train_datasets[0]](
    train=True,
    download=True,
    recognition_task=True,
    use_polygons=True,
    img_transforms=Compose([
        T.Resize((args.input_size, 4 * args.input_size), preserve_aspect_ratio=True),
        # Augmentations
        T.RandomApply(T.ColorInversion(), 0.1),
    ]),
)

# Extend with the samples from any additional built-in datasets
if len(train_datasets) > 1:
    for dataset_name in train_datasets[1:]:
        _ds = datasets.__dict__[dataset_name](
            train=True,
            download=True,
            recognition_task=True,
            use_polygons=True,
        )
        train_set.data.extend((np_img, target) for np_img, target in _ds.data)

And for …
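For the validation set, a possible counterpart (again an untested sketch, assuming an analogous --val_datasets argument) would mirror the snippet above with train=False and without the augmentations:

# Untested sketch, assuming an analogous --val_datasets argument exists
val_datasets = args.val_datasets
val_set = datasets.__dict__[val_datasets[0]](
    train=False,
    download=True,
    recognition_task=True,
    use_polygons=True,
    img_transforms=T.Resize((args.input_size, 4 * args.input_size), preserve_aspect_ratio=True),
)
for dataset_name in val_datasets[1:]:
    _ds = datasets.__dict__[dataset_name](
        train=False,
        download=True,
        recognition_task=True,
        use_polygons=True,
    )
    val_set.data.extend((np_img, target) for np_img, target in _ds.data)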
Bug description
I've modified the reference (PyTorch) training code to use the SVHN dataset. That did not work (see traceback).
By looking at what the DetectionDataset class is doing, and with some trial and error, I managed to get it working (or at least running) by restructuring the targets from the dataset as follows: …
I don't know if this is a "valid" fix, though.
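For illustration, one way such a restructuring could look. This is an untested sketch, not the exact snippet from the gist; it assumes SVHN stores (image_name, {"boxes": ..., "labels": ...}) pairs in its data attribute and that the detection trainer expects targets shaped like those of DetectionDataset, i.e. a dict mapping a class name to the box array:

# Untested illustration of the restructuring described above (not the gist's exact code):
# map every target from {"boxes": ..., "labels": ...} to {"words": boxes}.
train_set.data = [
    (img_name, {"words": target["boxes"]})  # "words" assumed as the single text class
    for img_name, target in train_set.data
]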
Code snippet to reproduce the bug
See https://gist.github.com/KenjiTakahashi/9bb22093d584bb2b203eb003a2bbb414.
As mentioned, this is mostly the same code as
https://github.com/mindee/doctr/blob/e6bf82d6a74a52cedac17108e596b9265c4e43c5/references/detection/train_pytorch.py
with slight modifications to work with the SVHN class instead of DetectionDataset.
Error traceback
Environment
This script does not seem to work well on (my) macOS; it mostly returns N/A's. Anyway, I currently run it on macOS 14 with the latest doctr (tried both the 0.10 release and master at e6bf82d) and PyTorch.
Deep Learning backend
I'm using PyTorch, but the same problem happens on TF as well.