LoRACaptioner

Image Captioning: Automatically generate detailed and structured captions for your LoRA dataset.
Prompt Optimization: Enhance prompts during inference to achieve high-quality outputs.

Installation

Prerequisites

- Python 3.10 or higher
- Together AI account and API key
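
Both prerequisites can be checked before running anything. The sketch below is a minimal preflight check, not part of this repository. It assumes the key is exported as `TOGETHER_API_KEY`, the variable the Together Python client reads by default; this README does not state which variable the scripts read, so adjust the name if yours differs. The `preflight` helper is hypothetical.

```python
import os
import sys

MIN_PYTHON = (3, 10)
# Assumption: the key lives in TOGETHER_API_KEY (not confirmed by this README).
API_KEY_VAR = "TOGETHER_API_KEY"


def preflight(version=None, environ=None):
    """Return a list of problems; an empty list means the prerequisites look fine."""
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    environ = os.environ if environ is None else environ
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, found {version[0]}.{version[1]}"
        )
    # Treat an empty or whitespace-only value the same as a missing key.
    if not environ.get(API_KEY_VAR, "").strip():
        problems.append(f"{API_KEY_VAR} is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(f"preflight: {problem}")
```

This only confirms that a key is present. It does not verify that the key is valid or that the account has funds.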

Setup

Create the virtual environment:

```
python -m venv venv
source venv/bin/activate
python -m pip install -r requirements.txt
```

Run inference on one set of images:
```
python main.py --input examples/ --output output/
```
Arguments
- --input (str): Directory containing images to caption.
- --output (str): Directory to save images and captions (defaults to input directory).
- --partial_captions (str): JSON file containing partial captions for images that will be used to assist in generating full captions.
- --reference_image (str): Reference image for outfit consistency. The outfit from this image will be used in all captions.
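
To caption several image sets in one go, the command above can be driven from a short script. This is a sketch that uses only the flags documented in this section. The helper names `build_command` and `caption_all` are hypothetical and not part of the repository, and the script assumes it is run from the repository root, where `main.py` lives.

```python
import subprocess
import sys


def build_command(input_dir, output_dir=None, partial_captions=None, reference_image=None):
    """Build the main.py argument list from the documented flags."""
    cmd = [sys.executable, "main.py", "--input", str(input_dir)]
    if output_dir is not None:
        cmd += ["--output", str(output_dir)]
    if partial_captions is not None:
        cmd += ["--partial_captions", str(partial_captions)]
    if reference_image is not None:
        cmd += ["--reference_image", str(reference_image)]
    return cmd


def caption_all(input_dirs, reference_image=None):
    """Run main.py once per directory; --output is omitted, so captions
    are saved next to the images (the documented default)."""
    for input_dir in input_dirs:
        cmd = build_command(input_dir, reference_image=reference_image)
        # check=True stops at the first failing run instead of continuing.
        subprocess.run(cmd, check=True)
```

For example, `caption_all(["sets/character_a", "sets/character_b"])` would run the captioner twice, once per directory; the directory names here are placeholders.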

Gradio Demo

Launch a user-friendly web interface for captioning and prompt optimization:

```
python demo.py
```

Notes

- All images are processed individually for consistent results.
- Each caption is saved as a .txt file with the same name as the image.
- Use the reference image feature to maintain outfit consistency across captions.
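
Because each caption shares its image's name, it is easy to check after a run that nothing was skipped. The sketch below is a hypothetical helper, not part of the repository. It assumes "same name" means the image extension is replaced by `.txt` (for example `a.png` becomes `a.txt`), and it treats file extensions as case-insensitive.

```python
from pathlib import Path

# Extensions listed as supported in this README.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def caption_path(image_path):
    """Assumed caption location: same directory and stem, .txt extension."""
    return Path(image_path).with_suffix(".txt")


def images_missing_captions(directory):
    """Return the images in `directory` that have no caption file yet."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.suffix.lower() in IMAGE_SUFFIXES and not caption_path(path).exists()
    )
```

Calling `images_missing_captions("output/")` after a run returns an empty list when every image has a caption.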

Troubleshooting

- API errors: Ensure your Together API key is set and has funds.
- Image formats: Only .png, .jpg, .jpeg, and .webp files are supported.
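
If a run seems to ignore some files, listing them by extension can help. The following is a small sketch with a hypothetical helper name. Matching extensions case-insensitively is an assumption; this README does not say how the scripts treat names such as `photo.JPG`.

```python
from pathlib import Path

SUPPORTED_SUFFIXES = (".png", ".jpg", ".jpeg", ".webp")


def split_by_support(paths):
    """Partition file names into (supported, skipped) by extension."""
    supported, skipped = [], []
    for path in map(Path, paths):
        # Assumption: the extension comparison ignores case.
        if path.suffix.lower() in SUPPORTED_SUFFIXES:
            supported.append(path.name)
        else:
            skipped.append(path.name)
    return supported, skipped
```

Files in the second list, such as `.gif` or `.bmp` images, would need to be converted to one of the supported formats first.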

Manual Captioning with ChatGPT

Follow the instructions in my blog post and use system_prompt.txt as the system prompt.

Examples

Sukuna from Jujutsu Kaisen

User Prompt:
holding a bow and arrow in a dense forest

Optimized Prompt:
tr1gg3r anime-style, pink spiky hair and black markings on face, shirtless with dark arm bands, holding bow and arrow, focused expression, dense forest, soft dappled lighting, three-quarter view

User Prompt:
drinking coffee in a san francisco cafe, white cloak, side view

Optimized Prompt:
tr1gg3r anime-style, spiky pink hair and facial markings, white cloak, sitting with cup in hand, neutral expression, cafe interior with san francisco view, soft natural lighting, side profile

User Prompt:
playing pick-up basketball on a sunny day

Optimized Prompt:
tr1gg3r photorealistic, athletic build, sleeveless basketball jersey and shorts, jumping with ball, focused expression, outdoor basketball court with spectators, bright sunlight, low-angle view

A character generated by Flux.1-dev

User Prompt:
riding a horse on a prairie during sunset

Optimized Prompt:
tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, riding a horse, neutral expression, prairie during sunset, warm directional lighting, three-quarter view

User Prompt:
painting on a canvas in an art studio, side-view

Optimized Prompt:
tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing at an angle with brush in hand, neutral expression, art studio with canvas and paints, soft natural lighting, right side profile

User Prompt:
standing on a skyscraper in a dense city, dramatic stormy lighting, rear view

Optimized Prompt:
tr1gg3r photorealistic, curly shoulder-length hair, floral button-up shirt, standing upright, neutral expression, skyscraper rooftop in dense city, dramatic stormy lighting, back view
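
The optimized prompts above appear to share one layout: trigger word and style, then appearance, outfit, pose, expression, setting, lighting, and camera view, separated by commas. The sketch below assembles a prompt in that order. The field names and the `PromptParts` class are inferred from the examples for illustration only; they are not taken from prompt.py or system_prompt.txt.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Hypothetical container for the fields seen in the example prompts."""
    style: str
    appearance: str
    outfit: str
    pose: str
    expression: str
    setting: str
    lighting: str
    view: str
    trigger: str = "tr1gg3r"  # trigger word used in the examples

    def render(self):
        fields = [
            self.appearance, self.outfit, self.pose, self.expression,
            self.setting, self.lighting, self.view,
        ]
        # The trigger word and style form the first comma-separated item;
        # empty fields are dropped so no stray commas appear.
        head = f"{self.trigger} {self.style}"
        return ", ".join([head, *(field for field in fields if field)])


parts = PromptParts(
    style="photorealistic",
    appearance="curly shoulder-length hair",
    outfit="floral button-up shirt",
    pose="standing upright",
    expression="neutral expression",
    setting="skyscraper rooftop in dense city",
    lighting="dramatic stormy lighting",
    view="back view",
)
print(parts.render())
```

With these values, `render()` reproduces the last optimized prompt above word for word.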
