Evaluation of STL10 dataset #59
-
Ohh I see... so the 1,000 samples are actually for training and not validation. Thanks for clarifying this for me; I also didn't know torchvision already has the data, otherwise I would have used it, silly me 😁 Let me think it through so I can guide you in the best possible way; a few tricks usually help here.
-
That's crazy indeed :) I've checked the performance of different algorithms on STL-10 here, and it seems like the bar is pretty high. I am not sure, though, which of these algorithms evaluate using the 10-fold testing protocol.
Probably not the semi-supervised ones, because for semi-supervised learning you would need some labeled data. I ran my training again with the correct 1,000 training samples per fold, and the average test accuracy is around 74%.
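To make the protocol concrete, here is a minimal sketch of the loop I re-ran, using torchvision and a plain logistic regression on raw pixels as a stand-in for the actual frozen encoder plus classifier head (the `ROOT` path and the stand-in classifier are placeholders, not the real setup):

```python
import numpy as np
import torchvision
from sklearn.linear_model import LogisticRegression

ROOT = "/stl10/"  # placeholder path

# STL-10 protocol: train on each of the 10 predefined 1,000-image folds,
# evaluate every run on the full 8,000-image test split, then average.
test_set = torchvision.datasets.STL10(root=ROOT, split="test", download=True)
X_test = test_set.data.reshape(len(test_set), -1).astype(np.float32) / 255.0
y_test = np.asarray(test_set.labels)

accuracies = []
for fold in range(10):
    train_set = torchvision.datasets.STL10(root=ROOT, split="train", folds=fold, download=True)
    assert len(train_set) == 1000 and len(test_set) == 8000

    X_train = train_set.data.reshape(len(train_set), -1).astype(np.float32) / 255.0
    y_train = np.asarray(train_set.labels)

    # Stand-in classifier on raw pixels; in the real run this is the frozen
    # self-supervised encoder with a classification head trained on the fold.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    accuracies.append(clf.score(X_test, y_test))

print(f"mean test accuracy over 10 folds: {np.mean(accuracies):.3f}")
```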
-
Hi Kerem,
I went through the notebook on the STL10 dataset, and I feel there is a misunderstanding about how the evaluation should be reported for the downstream task.
My understanding is that for each fold, 1000 images are used for training. Evaluation is then performed on the test set of 8000 images.
I can also confirm this by obtaining the data from torchvision, as in
```python
torchvision.datasets.STL10(root='/stl10/', split='train', transform=None, folds=0)
```
The shape of the data in this case is (1000, 3, 96, 96).

I've been working on testing FixMatch (semi-supervised) on the same dataset, but without much success so far. The convergence seems very slow, and in the paper they reported using a crazy 1 million iterations.
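For reference, the part of FixMatch I've been trying to reproduce is its unlabeled consistency loss; roughly the following, based on my own reading of the paper (not their code), with the 0.95 confidence threshold taken from the paper's defaults:

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, weak_imgs, strong_imgs, threshold=0.95):
    # Pseudo-label the weakly augmented images, keep only confident predictions,
    # and enforce those labels on the strongly augmented views of the same images.
    with torch.no_grad():
        probs = F.softmax(model(weak_imgs), dim=-1)
        max_probs, pseudo_labels = probs.max(dim=-1)
        mask = (max_probs >= threshold).float()

    logits_strong = model(strong_imgs)
    per_sample_ce = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    return (per_sample_ce * mask).mean()
```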
I'm also impressed with the self-supervised score on imagewang.
I'll try to use your augmentation pipeline to speed things up. Any advice on tuning the augmentation parameters?
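For context, what I've been using so far is a rough SimCLR-style set of torchvision transforms; every magnitude below (crop scale, jitter strengths, blur, probabilities) is a guess on my part rather than a value from your notebook, which is exactly where I could use advice:

```python
from torchvision import transforms

# Rough SimCLR-style pipeline for the 96x96 STL-10 images; all magnitudes
# and probabilities here are guesses to be tuned, not settings from the repo.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(96, scale=(0.2, 1.0)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.GaussianBlur(kernel_size=9, sigma=(0.1, 2.0)),
    transforms.ToTensor(),
])
```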