Missing last line in a page but it's due to the segmentation model  #1213

@lfoppiano


In this document, two lines are missing from the output. Both are located at the end of a page and both start with a digit.

The problem, which at first appeared to be a pdfalto issue, is actually a segmentation model issue:

image

where the full line is tagged as <page> (see training data output):

image
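
To find other occurrences of this pattern, one could scan the segmentation training output for `<page>` spans that are too long to be plain page numbers. The following is only a rough sketch: the function name `suspicious_page_spans`, the `max_words` threshold, and the assumption that `<page>` elements are not nested are all hypothetical and not part of GROBID.

```python
import re

# Assumption: <page> elements in the training output are never nested.
PAGE_RE = re.compile(r"<page>(.*?)</page>", re.DOTALL)
TAG_RE = re.compile(r"<[^>]+>")


def suspicious_page_spans(tei_text, max_words=3):
    """Return the text of <page> spans that look too long to be page numbers."""
    hits = []
    for match in PAGE_RE.finditer(tei_text):
        # Drop inner markup such as <lb/> and normalise whitespace.
        content = " ".join(TAG_RE.sub(" ", match.group(1)).split())
        # A real page number is usually one or two tokens; the threshold
        # is an arbitrary guess and may need tuning.
        if len(content.split()) > max_words:
            hits.append(content)
    return hits
```

Running it over the text of a generated training file would list the candidate lines that were probably mislabelled and need manual correction before retraining.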
