--psm 10 produces accurate multi-character output despite being single-character mode #4414

tomyjany · 2025-04-21T10:29:19Z

Current Behavior

Tested on this image:

Tesseract --psm 10 Returns Multiple Characters

Although --psm 10 is documented as treating the image as a single character, it returns the full text from the image, and the output is surprisingly accurate. I discovered this while automatically testing all combinations of PSM and OEM settings, where --psm 10 unexpectedly produced some of the best results overall. This behavior contradicts the documented purpose of the mode.

Commands and Outputs

tesseract example_image_cropped.jpg stdout --psm 10 --oem 1 -l ces
# Output: Šel jsem domů ze školy.

tesseract example_image_cropped.jpg stdout --psm 10 --oem 2 -l cesLEGACY
# Output: Šel jsem domů ze školy,

tesseract example_image_cropped.jpg stdout --psm 10 --oem 3 -l cesLEGACY
# Output: Šel jsem domů ze školy,

tesseract example_image_cropped.jpg stdout --psm 10 --oem 3 -l ces
# Output: Šel jsem domů ze školy.
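For anyone who wants to reproduce the sweep over PSM and OEM settings mentioned above, here is a minimal sketch. It is not the script used for this report: the helper names `build_commands` and `run_sweep` are hypothetical, and it assumes `tesseract` is on `PATH` and that PSM values 0 to 13 and OEM values 0 to 3 are the ranges of interest. Combinations that Tesseract rejects (for example, an OEM the installed traineddata does not support) are recorded as `None`.

```python
import itertools
import subprocess


def build_commands(image, langs, psms=range(14), oems=range(4)):
    """Return one tesseract argv list per (lang, psm, oem) combination."""
    return [
        ["tesseract", image, "stdout",
         "--psm", str(psm), "--oem", str(oem), "-l", lang]
        for lang, psm, oem in itertools.product(langs, psms, oems)
    ]


def run_sweep(commands):
    """Run each command; map its argv tuple to stripped stdout, or None on failure."""
    results = {}
    for argv in commands:
        proc = subprocess.run(
            argv, capture_output=True, text=True, encoding="utf-8"
        )
        # A non-zero exit code means Tesseract refused this combination.
        results[tuple(argv)] = proc.stdout.strip() if proc.returncode == 0 else None
    return results
```

Calling `run_sweep(build_commands("example_image_cropped.jpg", ["ces"], psms=[10]))` would run the `-l ces` commands shown above for every OEM value.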

Expected Behavior

Since --psm 10 is intended for single-character recognition, an image containing more than one character cannot be recognized correctly in this mode. Even so, the output should still be limited to a single predicted character, not a full sentence.
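The expectation above can be expressed as a small check on the text Tesseract prints. This is only a sketch: `is_single_character` is a hypothetical helper, and it assumes that trailing newlines and the form-feed page separator should be ignored and that a base letter followed by a combining mark counts as one character after NFC normalization.

```python
import unicodedata


def is_single_character(ocr_output: str) -> bool:
    """True if the OCR text is exactly one character after cleanup."""
    # strip() drops surrounding whitespace, including "\n" and "\f".
    text = unicodedata.normalize("NFC", ocr_output.strip())
    return len(text) == 1
```

With the outputs reported in this issue, the check returns `False` for every command, which is the mismatch being described.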

tesseract -v

tesseract 5.3.4
 leptonica-1.84.1
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.1.4) : libpng 1.6.40 : libtiff 4.6.0 : zlib 1.3.0.zlib-ng : libwebp 1.5.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libcurl/8.6.0 OpenSSL/3.2.4 zlib/1.3.1.zlib-ng brotli/1.1.0 libidn2/2.3.8 libpsl/0.21.5 libssh/0.10.6/openssl/zlib nghttp2/1.59.0 OpenLDAP/2.6.8

Operating System

Linux Fedora

Other Information

Apologies if this issue is not formatted or worded correctly — I'm not very experienced with writing bug reports. If there's a better way to present this or improve clarity, I’d appreciate any feedback so I can improve.


amitdo · 2025-04-23T11:08:55Z

You are right that the output in this case does not match the documentation, but I actually think it's good that Tesseract does not output a single letter when the input image contains multiple words.

I believe psm 10 was designed for cases like this:

input: 'n'

Without the --psm 10 hint, Tesseract might wrongly split it into 'rn' in some cases.

It's possible that the above use case does not work. This would make the bug more interesting.
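One way to check the single-glyph use case would be to run the same image with and without the hint and compare the results. The sketch below is an assumption-laden illustration, not an existing test: `ocr` and `compare_psm_hint` are hypothetical helpers, the image is expected to contain a single glyph, and the `run` parameter exists only so the subprocess call can be replaced when Tesseract is not installed.

```python
import subprocess


def ocr(image, lang="eng", psm=None, run=subprocess.run):
    """Return Tesseract's stdout for image, optionally with a --psm hint."""
    argv = ["tesseract", image, "stdout", "-l", lang]
    if psm is not None:
        argv += ["--psm", str(psm)]
    proc = run(argv, capture_output=True, text=True, encoding="utf-8")
    return proc.stdout.strip()


def compare_psm_hint(image, lang="eng", run=subprocess.run):
    """Recognize the same image with the default PSM and with --psm 10."""
    return {
        "default": ocr(image, lang, run=run),
        "psm10": ocr(image, lang, psm=10, run=run),
    }
```

If the two entries differ for a single-glyph image, the hint is having an effect; if they are always identical, that would support the idea that the mode is not applied as documented.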

tomyjany changed the title ~~--psm 10 produces accurate multi-character output despite being single-character mo~~ --psm 10 produces accurate multi-character output despite being single-character mode Apr 21, 2025
