Handle per-object file identifiers for encryption #42
Comments
The issue occurred on both HEAD (87ca4db 2023/10/02 18:27) and v1.1.1.
OK, so this is an encrypted PDF generated by what looks like an old Mac OS 9 version of Acrobat. The object that isn't loading is a secondary xref stream, which is odd because the primary stream loaded just fine... Investigating...
Thank you very much for the investigation. I would be very happy if this file could be read.
It looks like there is a broken object reference. I need to do a little digging, but I might need to allow for this and only throw an error when you actually try to load the broken reference.
Looking back, the first error is the "Unable to decompress" error, caused by a bogus xref stream in object 451.
and this object has a different file key than the rest of the file...
Deferring this to "future" since it will require a re-implementation of the crypto handler and I have never seen a PDF file containing two different file IDs.
The current code has an issue because it tries to decrypt the object dictionary while the dictionary is being loaded; I need to split the code that decrypts string values out from the code that loads the object dictionary.
OK, so for this file it actually looks like the per-object ID is the same as the main file ID, but the object itself is damaged. Xpdf never tries to load it, so maybe it is an object that doesn't need to be loaded to use the file? Will be looking at that tomorrow...
Describe the bug
I got
Unable to decompress stream data: Data error.
from inflate. The return code of inflate is Z_DATA_ERROR.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Text is extracted successfully.
System Information:
Additional context
st->predictor is 12 = _PDFIO_PREDICTOR_PNG_UP.
The error seems to occur with PDFs that contain images.
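For reference, here is a minimal sketch of what the PNG "Up" predictor (value 12) does to stream data once it has been inflated. It is not PDFio's code: the function name `png_up_decode` and its parameters are hypothetical, and it assumes every row starts with a filter-type byte of 2 (Up), as the PNG predictor layout in Flate-encoded PDF streams prescribes. Because the predictor is undone on the output of inflate, a Z_DATA_ERROR is reported before this step is reached.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of PDFio): undo the PNG "Up" predictor.
 *
 * in        - inflated data, each row prefixed by one filter-type byte
 * in_len    - total length of `in` in bytes
 * row_bytes - bytes per row, excluding the filter-type byte
 * out       - destination buffer, at least (in_len / (row_bytes + 1)) * row_bytes bytes
 *
 * Returns 0 on success, -1 if the input is not a whole number of rows
 * or a row uses a filter type other than 2 (Up).
 */
int png_up_decode(const unsigned char *in, size_t in_len,
                  size_t row_bytes, unsigned char *out)
{
  size_t stride = row_bytes + 1;   /* filter byte + row data */
  size_t rows, r, i;

  if (row_bytes == 0 || in_len % stride != 0)
    return -1;

  rows = in_len / stride;

  for (r = 0; r < rows; r++)
  {
    const unsigned char *src  = in + r * stride;
    unsigned char       *dst  = out + r * row_bytes;
    /* The row above the first row is treated as all zeros. */
    const unsigned char *prev = r > 0 ? dst - row_bytes : NULL;

    if (src[0] != 2)               /* this sketch only handles Up */
      return -1;

    /* Each byte is stored as the difference from the byte directly above. */
    for (i = 0; i < row_bytes; i++)
      dst[i] = (unsigned char)(src[1 + i] + (prev ? prev[i] : 0));
  }

  return 0;
}
```

Feeding two 3-byte rows `{2, 1,2,3}` and `{2, 1,1,1}` through this function gives `1,2,3` followed by `2,3,4`, since the second row is added byte-wise to the first.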