Make ppt_record_parser.IterStream.readinto() always return desired length #715
Found this bug while using `oleobj.py` on a PowerPoint file: the extracted embedded file did not match the hash of the real embedded file, so I traced back the code starting from the warning message here:
oletools/oletools/oleobj.py
Lines 645 to 646 in a7d1050
The problem is that `olefile.py` expects `read()` to return all requested bytes (except for the last sector): https://github.com/decalage2/olefile/blob/cc0bdc07194fb7dc21e75a95c9e771e5240952b2/olefile/olefile.py#L666-L676

`ppt_record_parser.IterStream` is derived from `io.RawIOBase`, which unfortunately does not guarantee that `read()` returns the requested number of bytes.

Since the `IterStream` implementation was already buffered, I simply changed `readinto()` to always return the desired length whenever possible; you might want to change that to `io.BufferedIOBase`.
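
To illustrate the behaviour, here is a minimal sketch of the short-read problem and of the kind of fix described above. `ChunkedRaw` and its `fill` flag are hypothetical stand-ins written for this example; this is not the actual `IterStream` code or the patch itself. It relies only on the fact that the default `io.RawIOBase.read(size)` calls `readinto()` once and returns whatever that call produced.

```python
import io


class ChunkedRaw(io.RawIOBase):
    """Hypothetical iterator-backed raw stream (not the oletools class)."""

    def __init__(self, chunks, fill):
        super().__init__()
        self._iter = iter(chunks)   # source of non-empty byte chunks
        self._leftover = b""        # unread tail of the current chunk
        self._fill = fill           # True: keep going until target is full

    def readable(self):
        return True

    def _next_piece(self, limit):
        # Fetch a new chunk only when the previous one is used up.
        if not self._leftover:
            self._leftover = next(self._iter, b"")
        piece = self._leftover[:limit]
        self._leftover = self._leftover[limit:]
        return piece

    def readinto(self, target):
        total = 0
        while total < len(target):
            piece = self._next_piece(len(target) - total)
            if not piece:           # iterator exhausted: end of stream
                break
            target[total:total + len(piece)] = piece
            total += len(piece)
            if not self._fill:      # old behaviour: stop after one chunk
                break
        return total


chunks = [b"abc", b"def", b"gh"]

# One chunk per readinto(): read(5) comes back short.
short = ChunkedRaw(chunks, fill=False).read(5)

# Looping readinto(): read(5) returns 5 bytes until the data runs out.
stream = ChunkedRaw(chunks, fill=True)
first, rest, end = stream.read(5), stream.read(5), stream.read(5)

# Alternative: leave the raw stream as it is and wrap it in a buffer.
buffered = io.BufferedReader(ChunkedRaw(chunks, fill=False)).read(5)
```

With `fill=False` the caller gets only the first chunk, which is the kind of short read that a caller expecting full sectors cannot handle. With `fill=True`, or with the `io.BufferedReader` wrapper, a short result only occurs at the end of the stream.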