InpaintTranslate

What is this?

TL;DR: A Python script to translate text in images with inpainting by your favorite generative AI models (Stable Diffusion, Midjourney, DALL·E).

Before and after (Hungarian translation): [before image] [after image]

Under the hood

InpaintTranslate runs text detection on your image, masks the text boxes, and in-paints the masked regions until your image is text-free.
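The masking step can be pictured with a small, dependency-free sketch. It is not the project's actual implementation: the `TextBox` class, the `build_mask` helper, and the `pad` parameter are hypothetical names introduced here, and a real pipeline would most likely build the mask as an image rather than a nested list.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextBox:
    """Hypothetical axis-aligned box around detected text, in pixels."""
    x: int  # left edge
    y: int  # top edge
    w: int  # width
    h: int  # height


def build_mask(width: int, height: int, boxes: List[TextBox], pad: int = 0) -> List[List[int]]:
    """Return a height x width grid: 255 where text should be in-painted, 0 elsewhere."""
    mask = [[0] * width for _ in range(height)]
    for box in boxes:
        # Clamp the (optionally padded) box to the image bounds.
        left = max(box.x - pad, 0)
        top = max(box.y - pad, 0)
        right = min(box.x + box.w + pad, width)
        bottom = min(box.y + box.h + pad, height)
        for row in range(top, bottom):
            for col in range(left, right):
                mask[row][col] = 255
    return mask


# One detected box on a tiny 8x4 "image".
mask = build_mask(8, 4, [TextBox(x=2, y=1, w=3, h=2)])
for row in mask:
    print("".join("#" if value else "." for value in row))
# prints:
# ........
# ..###...
# ..###...
# ........
```

The in-painter then only regenerates the pixels marked in the mask, which is why a little padding around each box can help remove text edges cleanly.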

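InpaintTranslate can run entirely on your local machine (using Tesseract or PaddleOCR for text detection and Stable Diffusion for in-painting), or it can call existing APIs (Azure for text detection, Replicate or OpenAI for in-painting).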

Usage

You can translate text from your image in just a few lines:

from deep_translator import MyMemoryTranslator  # translator class from the deep-translator package
from inpaint_translate.text_detector import PaddleTextDetector
from inpaint_translate.inpainter import LocalSDInpainter
from inpaint_translate.inpaint_translator import InpaintTranslator

text_detector = PaddleTextDetector()
inpainter = LocalSDInpainter()
translator = MyMemoryTranslator(source="en-US", target="hu-HU")

inpaint_translator = InpaintTranslator(text_detector, inpainter, translator)
inpaint_translator.inpaint_translate("/my/input/image/path.png", "/my/output/image/path.png")

or through the handy run.py script:

python run.py "/my/input/image/path.png" -o "/my/output/image/path.png"

Verbose mode writes intermediate images to a "debug" folder. Enable it with the -v flag of run.py, or by setting the logger of Python's logging library to debug level.
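When you use the library from your own code rather than through run.py, debug logging could be switched on roughly as follows. This is a minimal sketch that assumes InpaintTranslate's loggers inherit their level from the root logger; if the project configures a named logger of its own, set the level on that logger instead.

```python
import logging

# force=True (Python 3.8+) replaces any handlers that were configured earlier,
# so this call takes effect even if logging was already set up elsewhere.
logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(name)s %(levelname)s: %(message)s",
    force=True,
)

logging.getLogger(__name__).debug("debug logging is enabled")
```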

We provide multiple implementations for text detection and in-painting (both local and API-based), and you are also free to add your own.

Text Detectors

  1. TesseractTextDetector (based on Tesseract) runs locally. Follow this guide to install the tesseract library locally. On Ubuntu:
sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

To find the path where it was installed (and pass it to the TesseractTextDetector constructor):

whereis tesseract
  2. AzureTextDetector calls a computer vision API from Microsoft Azure. You will first need to create a Computer Vision resource via the Azure portal. Once created, take note of the endpoint and the key.
AZURE_CV_ENDPOINT = "https://your-endpoint.cognitiveservices.azure.com"
AZURE_CV_KEY = "your-azure-key"
text_detector = AzureTextDetector(AZURE_CV_ENDPOINT, AZURE_CV_KEY)

Our evaluation shows that the two text detectors above (Tesseract and Azure) produce comparable results.

  3. PaddleTextDetector (based on PaddleOCR) runs locally. Follow this guide to install the paddlepaddle library locally, or simply run:
pip install -r requirements_paddleocr.txt
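For the Tesseract detector, the binary's path can also be looked up from Python instead of running whereis by hand. The sketch below uses only the standard library; `find_executable` is a hypothetical helper name, and the commented-out constructor call assumes the path is the constructor's first argument.

```python
import shutil
from typing import Optional


def find_executable(name: str = "tesseract") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


tesseract_path = find_executable()
if tesseract_path is None:
    print("tesseract was not found on PATH; install it first")
else:
    print(f"tesseract found at {tesseract_path}")
    # Assumed usage, based on the note about the constructor above:
    # text_detector = TesseractTextDetector(tesseract_path)
```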

In-painters

  1. LocalSDInpainter (implemented via Hugging Face's diffusers library) runs locally and requires a GPU. Defaults to Stable Diffusion v2 for in-painting.
  2. ReplicateSDInpainter calls the Replicate API. Defaults to Stable Diffusion v2 for in-painting (and requires an API key).
  3. DalleInpainter calls the DALL·E 2 API from OpenAI (and requires an API key).
# You only need to instantiate one of the following:
local_inpainter = LocalSDInpainter()
replicate_inpainter = ReplicateSDInpainter("your-replicate-key")
dalle_inpainter = DalleInpainter("your-openai-key")
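Rather than hard-coding API keys, one option is to read them from environment variables and fall back to the local in-painter when none is set. The sketch below only decides which backend to use; the variable names `REPLICATE_API_KEY` and `OPENAI_API_KEY` and the helper `pick_inpainter_name` are assumptions made for illustration, not something the project defines.

```python
import os
from typing import Mapping


def pick_inpainter_name(env: Mapping[str, str] = os.environ) -> str:
    """Choose a backend name based on which (assumed) key variables are set."""
    if env.get("REPLICATE_API_KEY"):
        return "replicate"
    if env.get("OPENAI_API_KEY"):
        return "dalle"
    # No API key available: fall back to the local Stable Diffusion in-painter.
    return "local"


print(pick_inpainter_name({}))                          # prints "local"
print(pick_inpainter_name({"OPENAI_API_KEY": "sk-x"}))  # prints "dalle"
```

The returned name can then be mapped to LocalSDInpainter(), ReplicateSDInpainter(key), or DalleInpainter(key) as shown above.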

Translator

Translations are provided by the deep-translator library.

Keep in mind that each translation provider supports a different set of languages and has its own usage restrictions.

Authors

This project is based on detextify by Mihail Eric and Julia Turc.

Created by Dolers for his own amusement.
