autodistill/autodistill-paligemma


Autodistill PaLiGemma Module

This repository contains the code supporting the PaLiGemma base model for use with Autodistill.

PaLiGemma, developed by Google, is a vision-language model trained on pairs of images and text. You can label data with PaLiGemma models and use that data to train smaller, fine-tuned models with Autodistill.

Read the full Autodistill documentation.

Installation

To use PaLiGemma with Autodistill, install the following package:

pip3 install autodistill-paligemma

Quickstart

Auto-label with an existing model

from autodistill.detection import CaptionOntology
from autodistill_paligemma import PaliGemma

# define an ontology to map class names to our PaliGemma prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = PaliGemma(
    ontology=CaptionOntology(
        {
            "person": "person",
            "a forklift": "forklift"
        }
    )
)

# label a single image
result = base_model.predict("test.jpeg")
print(result)

# label a folder of images
base_model.label("./context_images", extension=".jpeg")
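The {caption: class} mapping described in the comments above can be illustrated with a plain dictionary. This is a minimal sketch that does not import Autodistill; the helper names `prompts_for` and `class_for_caption` are hypothetical and exist only to show which side of the mapping is sent to the model and which side ends up in the annotations.

```python
# Plain-Python illustration of the {caption: class} ontology format.
# These helpers are hypothetical and are NOT part of autodistill.

ontology = {
    "person": "person",        # caption and class are identical
    "a forklift": "forklift",  # caption is more descriptive than the class
}


def prompts_for(mapping):
    """Return the captions, i.e. the text that would be sent to the base model."""
    return list(mapping.keys())


def class_for_caption(mapping, caption):
    """Return the label that would be saved for a given caption."""
    return mapping[caption]


if __name__ == "__main__":
    for caption in prompts_for(ontology):
        print(f"{caption!r} -> {class_for_caption(ontology, caption)!r}")
```

Keeping the caption separate from the class lets you reword a prompt (for example, "a forklift" instead of "forklift") without changing the label names in the generated dataset.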

Model fine-tuning

You can fine-tune PaLiGemma models with LoRA for deployment with Roboflow Inference.

To train a model, use this code:

from autodistill_paligemma import PaLiGemmaTrainer

target_model = PaLiGemmaTrainer()

# train a model
target_model.train("./data/")
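This README does not state the layout expected inside `./data/`. Before starting a long training run, it can help to confirm that the path exists and contains the files you expect. The sketch below is independent of Autodistill: `summarize_folder` is a hypothetical helper that counts files by extension and makes no assumption about the dataset format.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder):
    """Count files under `folder` (recursively), grouped by lowercase extension.

    Hypothetical pre-flight helper; not part of autodistill.
    """
    root = Path(folder)
    if not root.is_dir():
        raise FileNotFoundError(f"{folder} is not a directory")
    return Counter(p.suffix.lower() for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    try:
        print(summarize_folder("./data/"))
    except FileNotFoundError as err:
        print(err)
```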

License

The model weights for PaLiGemma are licensed under a custom Google license. To learn more, refer to the Google Gemma Terms of Use.

๐Ÿ† Contributing

We love your input! Please see the core Autodistill contributing guide to get started. Thank you ๐Ÿ™ to all our contributors!
