Question: Building a persistent service container based on OCRmyPDF

Hi OCRmyPDF team,

First, thanks for this excellent tool! I'm looking to build a persistent Docker service based on OCRmyPDF and have some questions.

My Use Case
I want to create a persistent Docker service (not ephemeral) that:

Runs as a long-lived container with a REST API
Handles multiple concurrent OCR requests with queuing
Will be published as open-source on Docker Hub and GitHub
Is built with Quarkus (Java) and calls OCRmyPDF as the OCR engine

My Questions

Fork vs. Extension: Should I:

Fork the repository and modify it?
Build on top of your Docker image (FROM jbarlow83/ocrmypdf-alpine)?
Build a separate service that calls OCRmyPDF via Docker/CLI?


Production web service: Your documentation mentions that the included webservice.py is for demo/dev purposes only. Are there specific concerns or recommendations for building a production-grade service?

Attribution: What's the appropriate way to credit OCRmyPDF in my derivative work?
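
To make the third option under "Fork vs. Extension" concrete, here is a minimal sketch of calling the OCRmyPDF CLI from Java. It assumes the `ocrmypdf` executable is on the PATH inside the container; the class and method names (`OcrCli`, `buildCommand`, `run`) are hypothetical and only illustrate the idea.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: wraps the basic `ocrmypdf <input> <output>` invocation.
final class OcrCli {

    // Builds the command line; assumes `ocrmypdf` is resolvable via PATH.
    static List<String> buildCommand(Path input, Path output) {
        return List.of("ocrmypdf", input.toString(), output.toString());
    }

    // Starts the process, forwards its output to the parent's stdout/stderr,
    // and blocks until it exits. An exit code of 0 indicates success.
    static int run(Path input, Path output) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(input, output))
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

With this shape the wrapper stays a separate program that only talks to OCRmyPDF through its command line, so no OCRmyPDF source would need to be modified.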

My Plan
Build a production-ready Quarkus-based service that adds:

RESTful HTTP API with production-grade server
Job queue for concurrent request handling
Health checks and monitoring
Proper error handling and logging
Calls to OCRmyPDF for the actual OCR processing
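
The job-queue item in the list above could look roughly like the following sketch, which uses a plain `java.util.concurrent` thread pool rather than any Quarkus-specific API. The class name `OcrJobQueue` and the `maxConcurrentJobs` parameter are hypothetical; each job is assumed to return the exit code of one OCR run.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical job queue: at most `maxConcurrentJobs` OCR jobs run at once;
// further submissions wait in the pool's internal (unbounded) queue.
final class OcrJobQueue implements AutoCloseable {
    private final ExecutorService workers;

    OcrJobQueue(int maxConcurrentJobs) {
        this.workers = Executors.newFixedThreadPool(maxConcurrentJobs);
    }

    // Enqueues a job and returns a Future that completes with its exit code.
    Future<Integer> submit(Callable<Integer> job) {
        return workers.submit(job);
    }

    // Stops accepting new jobs; jobs already submitted still run to completion.
    @Override
    public void close() {
        workers.shutdown();
    }
}
```

A job could then be submitted as `queue.submit(() -> new ProcessBuilder("ocrmypdf", "in.pdf", "out.pdf").inheritIO().start().waitFor())`, where the file names are placeholders. Bounding the pool size keeps several simultaneous requests from starting an unbounded number of OCR processes.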

The wrapper code would be fully open-source under a compatible license (MPL-2.0 or AGPL-3.0).
Is this approach acceptable? Any guidance would be appreciated!
Thanks!

Question: Building a persistent service container based on OCRmyPDF #1581

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Question: Building a persistent service container based on OCRmyPDF #1581

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions