In [this](https://github.com/user-attachments/files/18246576/pub.1158465915.pdf) document, two lines are missing; both are at the end of the page and start with a digit. What at first appeared to be a pdfalto issue is actually a segmentation model issue: the full line is tagged as `<page>` (see training data output):