Thanks for the amazing collection of ViT implementations!
As far as I can see from the CvT paper, it also allows for a class token (as in ViT, though the authors of the SimpleViT paper removed it in SimpleViT). However, the CvT code in this repository does not seem to implement it. Is this a deliberate design choice, or do you think it might be worth adding an option for a classification token? If it's the former, maybe that could be clarified in the README? I'm not sure whether the literature includes benchmarks with and without a classification token.
For reference, I was looking at the CvT code that came with the paper (this line), and there seems to be an option called `with_cls_token`.
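To make the request concrete, here is a minimal sketch of what such an option could look like, assuming PyTorch. The class name `ClsTokenHead`, its arguments, and the use of a stock `nn.TransformerEncoderLayer` as a stand-in for the last CvT stage are all hypothetical; none of this is taken from this repository or from the official CvT code.

```python
import torch
from torch import nn


class ClsTokenHead(nn.Module):
    """Hypothetical final stage + classifier with an optional class token."""

    def __init__(self, dim, num_classes, with_cls_token=True, heads=4):
        super().__init__()
        # Learnable class token, created only when the option is enabled.
        self.cls_token = (
            nn.Parameter(torch.zeros(1, 1, dim)) if with_cls_token else None
        )
        # Stand-in for the last-stage transformer blocks (assumption: a plain
        # encoder layer, not CvT's convolutional projections).
        self.blocks = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.norm = nn.LayerNorm(dim)
        self.fc = nn.Linear(dim, num_classes)

    def forward(self, feature_map):
        # feature_map: (batch, dim, height, width) from the previous stage
        tokens = feature_map.flatten(2).transpose(1, 2)  # (batch, h*w, dim)

        if self.cls_token is not None:
            # Prepend one class token per sample.
            cls = self.cls_token.expand(tokens.shape[0], -1, -1)
            tokens = torch.cat((cls, tokens), dim=1)

        tokens = self.blocks(tokens)

        if self.cls_token is not None:
            pooled = tokens[:, 0]        # read out the class token
        else:
            pooled = tokens.mean(dim=1)  # fall back to average pooling

        return self.fc(self.norm(pooled))


if __name__ == "__main__":
    x = torch.randn(2, 32, 4, 4)
    for flag in (True, False):
        head = ClsTokenHead(dim=32, num_classes=5, with_cls_token=flag)
        print(flag, head(x).shape)
```

In an actual CvT implementation, I would expect the class token to need splitting off before each convolutional projection and re-attaching afterwards, since it has no spatial position in the feature map; the sketch above sidesteps this by using a plain encoder layer.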