Add more options to ViT #184

Draft
wants to merge 53 commits into
base: master

Conversation

theabhirath
Member

This modifies the current ViT API to add more options. Notably, there is now an optional prenorm/postnorm toggle. There is also an option to drop class tokens entirely, and another to allow class tokens to be before the positional embedding, as in DeIT-III. It makes some other cleanup changes as well. The API is more congested for now, but I thought I'd get this in before I start working on the other ViTs; there may be some potential for extracting common pieces there.
Needs #174 to land before this makes sense. Documentation is also pending.
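
For readers unfamiliar with the options described above, here is a minimal, framework-free sketch of what such toggles usually mean. It is plain Python, not this package's API: the names `residual_block`, `embed`, `prenorm` and `class_token_before_pos` are hypothetical and chosen only for illustration, and the sublayer stands in for attention or an MLP.

```python
from typing import Callable, List, Optional

Vector = List[float]
Tokens = List[Vector]


def layer_norm(v: Vector, eps: float = 1e-5) -> Vector:
    """Normalise one token vector to zero mean and (roughly) unit variance."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / (var + eps) ** 0.5 for x in v]


def add(a: Vector, b: Vector) -> Vector:
    """Element-wise sum of two equally sized vectors."""
    return [x + y for x, y in zip(a, b)]


def residual_block(tokens: Tokens,
                   sublayer: Callable[[Tokens], Tokens],
                   prenorm: bool) -> Tokens:
    """Hypothetical toggle between the two common norm placements."""
    if prenorm:
        # Pre-norm: normalise the input of the sublayer, then add the residual.
        update = sublayer([layer_norm(t) for t in tokens])
        return [add(t, u) for t, u in zip(tokens, update)]
    # Post-norm: add the residual first, then normalise the sum.
    update = sublayer(tokens)
    return [layer_norm(add(t, u)) for t, u in zip(tokens, update)]


def add_positions(tokens: Tokens, pos_embedding: Tokens) -> Tokens:
    """Add one positional vector to each token; lengths must match."""
    if len(tokens) != len(pos_embedding):
        raise ValueError("need exactly one positional vector per token")
    return [add(t, p) for t, p in zip(tokens, pos_embedding)]


def embed(patches: Tokens,
          pos_embedding: Tokens,
          class_token: Optional[Vector] = None,
          class_token_before_pos: bool = True) -> Tokens:
    """Assumed semantics of the class-token options (illustration only)."""
    if class_token is None:
        # No class token at all: only the patches receive positions.
        return add_positions(patches, pos_embedding)
    if class_token_before_pos:
        # Class token is prepended first, so it also receives a position.
        return add_positions([class_token] + patches, pos_embedding)
    # Class token is prepended after positions were added to the patches.
    return [class_token] + add_positions(patches, pos_embedding)
```

Note that with `class_token_before_pos=True` the positional embedding needs one more entry than there are patches, which is one reason the ordering tends to show up in the constructor API rather than being an internal detail.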

theabhirath and others added 30 commits June 27, 2022 06:38
1. Some docs
2. Basic tests for ResNet and ResNeXt now pass
1. Less keywords for the user to worry about
2. Delete `ResNeXt` just for now
`downsample_args` is actually redundant
Also add tests. A lot of tests
Also

1. Tweaks - II : Formatting + some docs
2. Groundwork for abstracting out the classifier
1. Reorganise layer imports for easy access
2. Get pooling to work
So much GC, might as well have a function for it
Co-authored-by: Brian Chen <[email protected]>
Neither does formatting, unfortunately. Also refactor `classifier` to separate out FC-layer creation and pooling
It really does never stop

Co-Authored-By: Kyle Daruwalla <[email protected]>
And make `downsample_opts` a smidge easier to work with. Also, a wee bit o' formatting and cleanup.
theabhirath and others added 23 commits July 22, 2022 06:30
Closures is the name of the game
Simplify `conv_bn` to `conv_norm` and use it
And some more formatting
Also
1. Make class tokens optional
2. Allow class tokens to be before positional embedding as in DeIT-III
@theabhirath theabhirath marked this pull request as draft July 26, 2022 14:27
@zsz00

zsz00 commented Jan 8, 2023

What's the status now?

@theabhirath
Member Author

I've had less time to work on this recently, but I'm going to try to push some of these refactors through in the next few months. However, there's also work happening on the Attention implementations around the Flux ecosystem, and I suspect any reforms to the ViT will wait on that work to land.
