RapidOCR ignores custom rec_keys.txt due to wrong parameter name in Docling #2698

@Samkal2612

Bug

When using RapidOCR through Docling, specifying a custom Latin rec_keys.txt file has no effect: OCR on French documents still returns Chinese characters.

I tested this using the Torch engine with the PP-OCRv3 latin .pth files and also reproduced it with other engines, so the problem is not engine-specific.

After investigation, the issue comes from a mismatch between the parameter name Docling sends to RapidOCR and what RapidOCR actually expects.

Docling passes:

"Rec.keys_path": rec_keys_path

But RapidOCR expects:

"Rec.rec_keys_path"

(See RapidOCR code in rapidocr/ch_ppocr_rec/main.py, line 51.)

Because of this mismatch, RapidOCR ignores the provided latin keys file and instead falls back to the default dictionary inside the package:

ppocr_keys_v1.txt

This default file contains Chinese characters, which explains why Chinese characters appear in the OCR output even when using latin models.

I confirmed that manually overriding the parameter to "Rec.rec_keys_path" fixes the issue and yields correct latin OCR results.
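
To make the failure mode concrete, here is a minimal toy model of dotted-key overrides. It is only an illustration under stated assumptions, not RapidOCR's or Docling's actual code: the `DEFAULT_CONFIG` dict, the `resolve_keys_path` helper, and the `latin_keys.txt` path are all hypothetical. It shows how an override sent under a key the consumer never reads is silently dropped in favour of the default.

```python
# Toy model only: NOT RapidOCR's real configuration code.
# Assumption: the recognizer reads exactly one key, "Rec.rec_keys_path",
# and the packaged default points at ppocr_keys_v1.txt (as named in this report).
DEFAULT_CONFIG = {"Rec.rec_keys_path": "ppocr_keys_v1.txt"}


def resolve_keys_path(overrides: dict) -> str:
    """Merge overrides onto the defaults and return the keys file that a
    consumer reading only 'Rec.rec_keys_path' would end up using."""
    merged = {**DEFAULT_CONFIG, **overrides}
    return merged["Rec.rec_keys_path"]


custom = "latin_keys.txt"  # hypothetical path to a custom Latin keys file

# Key name Docling sends, per this report: the override is never read.
wrong = resolve_keys_path({"Rec.keys_path": custom})

# Key name RapidOCR expects, per this report: the override takes effect.
right = resolve_keys_path({"Rec.rec_keys_path": custom})

print("with Rec.keys_path     ->", wrong)
print("with Rec.rec_keys_path ->", right)
```

In this sketch the first call resolves to the default dictionary and the second to the custom file, which mirrors the behaviour described above.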


Steps to reproduce

  1. Use Docling with RapidOCR enabled.
  2. Use the Torch engine with PP-OCRv3 latin .pth recognition models
    (also reproducible with other engines).
  3. Provide a latin recognition keys file (rec_keys.txt or equivalent).
  4. Run OCR on a French document.
  5. Observe that the output contains Chinese characters.
  6. Override RapidOCR parameters with {"Rec.rec_keys_path": "<path_to_keys>"}.
  7. Run OCR again → latin characters now appear correctly.
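
Steps 5 and 7 can be checked automatically rather than by eye. The sketch below is a small helper of my own (the name `contains_cjk` is hypothetical and not part of Docling or RapidOCR) that flags text containing characters from the CJK Unified Ideographs block. It assumes the OCR output is already available as a plain Python string.

```python
def contains_cjk(text: str) -> bool:
    """Return True if any character lies in the CJK Unified Ideographs
    block (U+4E00 to U+9FFF). Other CJK blocks are not checked."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


# Sample strings for illustration; real input would be the OCR output text.
broken_output = "Bonjour 你好"
fixed_output = "Élève à l'école"

print(contains_cjk(broken_output))
print(contains_cjk(fixed_output))
```

Running the same check before and after the override in step 6 gives a simple pass/fail signal for a French document that should contain no Chinese characters.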

Docling version

2.63.0

Python version

3.12
