This repository includes ViTs. Vision Transformers (ViTs) are a type of neural network architecture designed primarily for processing images.


Vision_Transformer_Transfer_Learning

Vision Transformers (ViTs) have become a leading choice for many computer vision tasks due to their state-of-the-art performance. Among them, several models stand out for transfer learning in different scenarios.

ViT (Vanilla Vision Transformer)

Description:

Vision Transformers (ViTs) are a type of neural network architecture designed primarily for processing images. Unlike traditional convolutional neural networks (CNNs), which process images hierarchically, ViTs apply self-attention mechanisms to capture global dependencies between image patches. This allows them to achieve strong performance on various computer vision tasks without relying on convolutional layers. ViTs have gained popularity for their ability to handle long-range dependencies effectively, making them suitable for tasks like image classification, object detection, and segmentation. The original Vision Transformer, developed by Google, divides an image into fixed-size patches, processes them as a sequence of tokens, and applies standard transformer layers.
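The patch-tokenization step described above can be sketched in plain NumPy (a minimal illustration, not the repository's code; the 224×224 image and 16×16 patch size match the common ViT-Base setup but are otherwise arbitrary):

```python
import numpy as np

def image_to_patch_tokens(image, patch_size=16):
    """Split an (H, W, C) image into flattened patch tokens, as ViT does."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    # Cut the image into a grid of non-overlapping patches.
    patches = image.reshape(
        h // patch_size, patch_size, w // patch_size, patch_size, c
    ).transpose(0, 2, 1, 3, 4)
    # Flatten each patch into one token vector of length patch_size^2 * C.
    return patches.reshape(-1, patch_size * patch_size * c)

# A 224x224 RGB image with 16x16 patches yields 196 tokens of dimension 768.
tokens = image_to_patch_tokens(np.zeros((224, 224, 3)))
```

In the real model these flattened patches are then linearly projected to the hidden dimension and fed, together with a class token and position embeddings, into the transformer layers.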

Best For:

General-purpose vision tasks when large-scale pretraining is available.

Pre-trained Weights:

Available on datasets like ImageNet-21k and ImageNet-1k.

Transfer Learning Strength:

Performs well for classification, particularly with fine-tuning on smaller datasets.
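Fine-tuning in practice means loading pre-trained weights and attaching a new classification head. A minimal sketch with the Hugging Face `transformers` API (a tiny, randomly initialized config is used here so the snippet runs without downloading weights; for real transfer learning you would call `ViTForImageClassification.from_pretrained(...)` with an ImageNet-21k checkpoint instead):

```python
import torch
from transformers import ViTConfig, ViTForImageClassification

# Tiny randomly initialized ViT; swap in from_pretrained(...) for transfer learning.
config = ViTConfig(
    image_size=32, patch_size=8, num_channels=3,
    hidden_size=64, num_hidden_layers=2,
    num_attention_heads=2, intermediate_size=128,
    num_labels=3,  # e.g. three target classes in the downstream dataset
)
model = ViTForImageClassification(config)

pixel_values = torch.randn(1, 3, 32, 32)  # one dummy RGB image
logits = model(pixel_values=pixel_values).logits  # shape: (batch, num_labels)
```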

Dataset

This dataset contains a diverse range of images featuring various types, styles, and designs of eyeglasses. It is intended as a resource for training and evaluating machine learning models that categorize and classify the styles and attributes of the glasses depicted in each image.

Notebooks

  1. vision-transformer-trainer-and-pytorch-lightning

    Fine-tune Vision Transformer (ViT) models with PyTorch Lightning, leveraging its flexible and scalable framework for streamlined model training and experimentation.

  2. vision-transforme-with-hugging-face-transformer-and-keras

    This notebook includes tools for fine-tuning Vision Transformer (ViT) models using Keras, offering a simple and intuitive interface for building, training, and evaluating models.

  3. vision-transforme-with-pytorch-trainer

    This notebook uses the Lightning Trainer to simplify the training workflow, enabling efficient fine-tuning of Vision Transformer (ViT) models with automatic checkpointing, logging, and distributed training.
