Rust implementation of Google PaliGemma with Candle
cargo run --example inference --release -- -i ./data/gangsters.png -p "cap en" --sample-length 100
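To run on a GPU with CUDA, pass --features cuda to cargo, before the -- separator:
cargo run --example inference --release --features cuda -- -i ./data/gangsters.png -p "cap en" --sample-length 100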
Usage: paligemma -i <image-path> -p <text-prompt> [--sample-length <sample-length>]
Generate a description of an image using Google Paligemma
Options:
  -i, --image-path      path to an input image
  -p, --text-prompt     prompt to ask the model
  --sample-length       the length of the generated text
  --help, help          display usage information
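For readers who want to see how the options above map onto a data structure, here is a minimal sketch using only the standard library. It is not the project's actual parser. The names `Args` and `parse_args` are hypothetical, and `--sample-length` is kept optional because its real default is not stated here.

```rust
/// Hypothetical container for the command-line options listed above.
#[derive(Debug, PartialEq)]
struct Args {
    image_path: String,
    text_prompt: String,
    /// Optional: the real default value is an unknown, so none is assumed.
    sample_length: Option<usize>,
}

/// Parse the options from an argument iterator (program name already removed).
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Args, String> {
    let mut image_path = None;
    let mut text_prompt = None;
    let mut sample_length = None;

    let mut it = args.into_iter();
    while let Some(flag) = it.next() {
        match flag.as_str() {
            "-i" | "--image-path" => {
                image_path = Some(it.next().ok_or("missing value for --image-path")?);
            }
            "-p" | "--text-prompt" => {
                text_prompt = Some(it.next().ok_or("missing value for --text-prompt")?);
            }
            "--sample-length" => {
                let raw = it.next().ok_or("missing value for --sample-length")?;
                let n = raw
                    .parse::<usize>()
                    .map_err(|e| format!("invalid --sample-length: {e}"))?;
                sample_length = Some(n);
            }
            other => return Err(format!("unknown option: {other}")),
        }
    }

    // Both -i and -p are required according to the usage line.
    Ok(Args {
        image_path: image_path.ok_or("-i/--image-path is required")?,
        text_prompt: text_prompt.ok_or("-p/--text-prompt is required")?,
        sample_length,
    })
}
```

Calling `parse_args(std::env::args().skip(1))` from `main` would parse the real command line. Returning a `Result` rather than exiting keeps the function easy to test.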
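Example output (the prompt "cap en" is printed first, immediately followed by the generated caption):
cap enTwo men are sitting under an umbrella, the left man is wearing sunglasses.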
16 tokens generated (26.15 token/s)
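The throughput figure in the last line is the token count divided by the elapsed generation time. At 26.15 token/s, 16 tokens correspond to roughly 0.61 s. The sketch below shows one way such a line could be produced. The function names `tokens_per_second` and `summary` are hypothetical, and the example may measure time differently.

```rust
use std::time::Duration;

/// Tokens per second for `tokens` generated over `elapsed`.
/// A zero `elapsed` yields infinity (or NaN for zero tokens) rather than panicking.
fn tokens_per_second(tokens: usize, elapsed: Duration) -> f64 {
    tokens as f64 / elapsed.as_secs_f64()
}

/// Format a summary line in the same shape as the output above.
fn summary(tokens: usize, elapsed: Duration) -> String {
    format!(
        "{} tokens generated ({:.2} token/s)",
        tokens,
        tokens_per_second(tokens, elapsed)
    )
}
```

In practice the elapsed time would come from `std::time::Instant::now()` taken before generation and `elapsed()` called after the last token.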