Description
🚀 The feature
Currently, `builder.py` has a `paragraph_break` parameter for merging `sub_lines` that are relatively close to each other.
I would appreciate a similar parameter for merging stacked lines that are vertically close enough.
Motivation, pitch

Currently, when I run docTR on the above image and images with similar lower thirds, I get the following from `result.render()`, with `\n\n` marking the separation between blocks. I would like to be able to direct the builder to merge lines that are this close into one block containing two lines, rather than getting two blocks that contain one line each.
REP. PAUL LEONARD\n\nD-DAYTON
Here is the document object:
Document(
  (pages): [Page(
    dimensions=(360, 480)
    (blocks): [
      Block(
        (lines): [Line(
          (words): [
            Word(value='REP.', confidence=0.99),
            Word(value='PAUL', confidence=1.0),
            Word(value='LEONARD', confidence=1.0),
          ]
        )]
        (artefacts): []
      ),
      Block(
        (lines): [Line(
          (words): [Word(value='D-DAYTON', confidence=0.99)]
        )]
        (artefacts): []
      ),
    ]
  )]
)
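To make the request concrete, here is a rough sketch of the kind of grouping I have in mind. It is standalone Python and does not use docTR's builder. The function name `group_stacked_lines` and the parameter `vertical_break` are hypothetical and not part of docTR. The box coordinates are made up for illustration and are not taken from the image above. Lines are grouped when they overlap horizontally and the vertical gap between them is at most `vertical_break` times the height of the shorter line.

```python
from typing import List, Tuple

# (xmin, ymin, xmax, ymax) in relative page coordinates, y growing downwards
Box = Tuple[float, float, float, float]


def group_stacked_lines(boxes: List[Box], vertical_break: float = 0.5) -> List[List[int]]:
    """Group indices of lines that are stacked closely on top of each other.

    `vertical_break` is a hypothetical parameter (not a docTR option): the
    maximum vertical gap between two lines, expressed as a fraction of the
    shorter line's height, for them to land in the same block.
    """
    if not boxes:
        return []

    # Visit lines from top to bottom
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][1])
    groups: List[List[int]] = [[order[0]]]

    for idx in order[1:]:
        prev = boxes[groups[-1][-1]]
        cur = boxes[idx]
        # Distance from the bottom of the previous line to the top of this one
        gap = cur[1] - prev[3]
        ref_height = min(prev[3] - prev[1], cur[3] - cur[1])
        # Only lines that share some horizontal extent count as stacked
        overlaps = min(prev[2], cur[2]) > max(prev[0], cur[0])
        if overlaps and gap <= vertical_break * ref_height:
            groups[-1].append(idx)
        else:
            groups.append([idx])
    return groups


# Illustrative boxes: a two-line lower third plus an unrelated line near the top
example_boxes = [
    (0.10, 0.80, 0.60, 0.86),  # e.g. "REP. PAUL LEONARD"
    (0.10, 0.87, 0.35, 0.92),  # e.g. "D-DAYTON"
    (0.10, 0.10, 0.50, 0.15),  # some other line far above
]
print(group_stacked_lines(example_boxes))  # [[2], [0, 1]]
```

With something like this, the two lower-third lines would end up in one block with two lines, and `result.render()` would separate them with a single `\n` instead of `\n\n`.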
Alternatives
No response
Additional context
No response