OCR Fails to Detect and Classify Checkbox States in Structured Forms #217

Closed
@likhithkumar98

Description

🚀 The feature, motivation and pitch

Description:

The current OCR pipeline is unable to accurately detect and classify the states of checkboxes in scanned or photographed document forms. This affects data extraction quality where form checkboxes are used to capture user selections.

Expected Functionality:

OCR should:

  • Detect all checkbox elements in the form
  • Classify checkbox states:
      • Empty: Rectangular border with empty interior
      • Checked: Contains a ✓ or “v” shape
      • Crossed: Contains an “x” or diagonal line(s)
      • Filled: Darkened or shaded interior
      • Partial: Unclear or incomplete mark
  • Associate checkbox rows/columns with their corresponding labels
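
To make the requested state taxonomy concrete, here is a minimal sketch of how the five states could be told apart once a checkbox has been located and cropped. It is not part of olmocr: `CheckboxState`, `classify_checkbox`, and every threshold are hypothetical, and the heuristic (ink fill ratio plus diagonal coverage) assumes a roughly square, tightly cropped grayscale box with dark ink on light paper.

```python
from enum import Enum
from typing import List


class CheckboxState(Enum):
    """The five states listed in this issue (names are illustrative)."""
    EMPTY = "empty"
    CHECKED = "checked"
    CROSSED = "crossed"
    FILLED = "filled"
    PARTIAL = "partial"


def _diagonal_coverage(ink: List[List[bool]], tol: int = 2) -> float:
    """Fraction of rows with ink within `tol` pixels of the main diagonal."""
    n = min(len(ink), len(ink[0]))
    hits = sum(
        1 for i in range(n)
        if any(ink[i][max(0, i - tol):min(n, i + tol + 1)])
    )
    return hits / n


def classify_checkbox(crop: List[List[int]],
                      ink_threshold: int = 128,
                      border_frac: float = 0.15) -> CheckboxState:
    """Classify one cropped checkbox given as rows of 0-255 gray values.

    All thresholds below are untuned assumptions, not measured values.
    """
    h, w = len(crop), len(crop[0])
    # Skip the printed border so only the interior is measured.
    by = max(1, round(h * border_frac))
    bx = max(1, round(w * border_frac))
    ink = [[px < ink_threshold for px in row[bx:w - bx]]
           for row in crop[by:h - by]]
    if not ink or not ink[0]:
        raise ValueError("crop is too small to have an interior")

    fill = sum(sum(row) for row in ink) / (len(ink) * len(ink[0]))
    if fill < 0.01:
        return CheckboxState.EMPTY
    if fill > 0.60:
        return CheckboxState.FILLED

    # An "x" or a single diagonal stroke covers at least one full diagonal.
    main = _diagonal_coverage(ink)
    anti = _diagonal_coverage([row[::-1] for row in ink])
    if max(main, anti) > 0.80:
        return CheckboxState.CROSSED

    # Enough ink that is not a diagonal is treated as a tick; less is unclear.
    return CheckboxState.CHECKED if fill > 0.04 else CheckboxState.PARTIAL
```

A heuristic like this would still need a separate step to find the boxes and to link each one to its label, and real scans (skew, noise, faint pencil marks) would likely need a learned classifier rather than fixed thresholds.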

Steps taken:

  • Tried prompt engineering around layout inference and checkbox keyword detection, without success.
  • OCR returns text only, ignoring graphic elements entirely.

Sample file tried:

[Image]

Alternatives

No response

Additional context

No response

Activity

aman-17 (Member) commented on Jul 10, 2025

Hey @likhithkumar98, thanks for sharing this. We’ll work on all these issues and release a better model soon.

OCR Fails to Detect and Classify Checkbox States in Structured Forms · Issue #217 · allenai/olmocr