Evaluation of STL10 dataset #59
-
Ohh I see... so the 1,000 samples are actually for training and not validation. Thanks for clarifying this for me; I also didn't know torchvision already has the data, otherwise I would have used it, silly me 😁 Let me think it through so I can guide you in the best possible way; a few tricks usually help here.
-
That's crazy indeed :) I've checked the performance of different algorithms on STL-10 here, and it seems like the bar is pretty high. I am not sure, though, which of these algorithms evaluate using the 10-fold testing protocol.
Probably not the semi-supervised ones, because for semi-supervised learning you would need some labeled data. I ran my training again with the correct 1,000 training samples per fold, and the average test accuracy is around 74%.
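To make the protocol concrete, here is a minimal sketch of the loop I re-ran, using torchvision and a plain logistic regression on raw pixels as a stand-in for the actual frozen encoder plus classifier head (the `ROOT` path and the stand-in classifier are placeholders, not the real setup):

```python
import numpy as np
import torchvision
from sklearn.linear_model import LogisticRegression

ROOT = "/stl10/"  # placeholder path

# STL-10 protocol: train on each of the 10 predefined 1,000-image folds,
# evaluate every run on the full 8,000-image test split, then average.
test_set = torchvision.datasets.STL10(root=ROOT, split="test", download=True)
X_test = test_set.data.reshape(len(test_set), -1).astype(np.float32) / 255.0
y_test = np.asarray(test_set.labels)

accuracies = []
for fold in range(10):
    train_set = torchvision.datasets.STL10(root=ROOT, split="train", folds=fold, download=True)
    assert len(train_set) == 1000 and len(test_set) == 8000

    X_train = train_set.data.reshape(len(train_set), -1).astype(np.float32) / 255.0
    y_train = np.asarray(train_set.labels)

    # Stand-in classifier on raw pixels; in the real run this is the frozen
    # self-supervised encoder with a classification head trained on the fold.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    accuracies.append(clf.score(X_test, y_test))

print(f"mean test accuracy over 10 folds: {np.mean(accuracies):.3f}")
```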
-
Hi Kerem,
I went through the notebook on the STL10 dataset, and I feel there is a misunderstanding about how the evaluation should be reported for the downstream task.
My understanding is that for each fold, 1000 images are used for training. Evaluation is then performed on the test set of 8000 images.
I can also confirm this by obtaining the data from torchvision, as in
```python
torchvision.datasets.STL10(root='/stl10/', split='train', transform=None, folds=0)
```
The shape of the data in this case is (1000, 3, 96, 96).

I've been working on testing FixMatch (semi-supervised) on the same dataset, but without much success so far. The convergence seems very slow, and in the paper they reported using a crazy 1 million iterations.
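For reference, the part of FixMatch I've been trying to reproduce is its unlabeled consistency loss; roughly the following, based on my own reading of the paper (not their code), with the 0.95 confidence threshold taken from the paper's defaults:

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, weak_imgs, strong_imgs, threshold=0.95):
    # Pseudo-label the weakly augmented images, keep only confident predictions,
    # and enforce those labels on the strongly augmented views of the same images.
    with torch.no_grad():
        probs = F.softmax(model(weak_imgs), dim=-1)
        max_probs, pseudo_labels = probs.max(dim=-1)
        mask = (max_probs >= threshold).float()

    logits_strong = model(strong_imgs)
    per_sample_ce = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    return (per_sample_ce * mask).mean()
```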
I'm also impressed with the self-supervised score on imagewang.
I'll try to use your augmentation pipeline to speed things up. Any advice on tuning the augmentation parameters?
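For context, what I've been using so far is a rough SimCLR-style set of torchvision transforms; every magnitude below (crop scale, jitter strengths, blur, probabilities) is a guess on my part rather than a value from your notebook, which is exactly where I could use advice:

```python
from torchvision import transforms

# Rough SimCLR-style pipeline for the 96x96 STL-10 images; all magnitudes
# and probabilities here are guesses to be tuned, not settings from the repo.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(96, scale=(0.2, 1.0)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.GaussianBlur(kernel_size=9, sigma=(0.1, 2.0)),
    transforms.ToTensor(),
])
```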