Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Stream.ReadAtLeast when loading ZipArchives #114256

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

edwardneal
Copy link
Contributor

Fixes #114026. Swaps references to Stream.Read out with Stream.ReadAtLeast, adds a test to prove the correct behaviour.

In every case except for ZipArchive, we use a pattern of calling Stream.ReadAtLeast(Span, Span.Length, throwOnEndOfStream: false) rather than Stream.ReadExactly(Span). The surrounding code checks the number of bytes read and returns false, and I didn't want to change that pattern.

The ZipArchive case takes place when reading the central directory, and it makes sure to always read enough bytes for the constant component of a ZipCentralDirectoryFileHeader.

/cc @carlossanlop

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 4, 2025
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

Copy link
Member

@carlossanlop carlossanlop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this change, @edwardneal. I left some comments/questions for you to consider.

Stream clamped = new ClampedReadStream(stream, readSizeLimit: 1);

IsZipSameAsDir(clamped, zfolder(zipFolder), ZipArchiveMode.Read, requireExplicit: true, checkTimes: true);
Assert.False(clamped.CanRead, "Clamped stream should be closed at this point"); //check that it was closed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain this check? Is CanRead supposed to change in the BaseStream of ClampedReadStream if the stream is properly disposed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TestPartialReads test aligned with the TestStreamingRead test above it, but I agree - making sure that the stream is closed isn't really a focus of the test. I've removed it, so now it's equivalent to the ReadNormal test.

@@ -557,8 +557,10 @@ public static bool TrySkipBlock(Stream stream)
{
Span<byte> blockBytes = stackalloc byte[4];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need the assert below:

Suggested change
Span<byte> blockBytes = stackalloc byte[4];
Span<byte> blockBytes = stackalloc byte[FieldLengths.Signature];


Debug.Assert(blockBytes.Length == FieldLengths.Signature);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just make the blockBytes length the expected size:

Suggested change
Debug.Assert(blockBytes.Length == FieldLengths.Signature);

@@ -572,7 +574,8 @@ public static bool TrySkipBlock(Stream stream)
// Already read the signature, so make the filename length field location relative to that
stream.Seek(FieldLocations.FilenameLength - FieldLengths.Signature, SeekOrigin.Current);

bytesRead = stream.Read(blockBytes);
Debug.Assert(blockBytes.Length == FieldLengths.FilenameLength + FieldLengths.ExtraFieldLength);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this debug. Wasn't blockBytes.Length fixed to 4 bytes? The debug above (which I suggested to remove) was checking a different length, but I don't see that blockBytes is getting replaced with an array of different size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added them because blockBytes is used twice: once to store the four signature bytes, and once to hold the filename length & extra field length (which also both fill four bytes.) The asserts were to make it clearer that this was happening, but I've added a comment too. I can remove the assertions and just use the comment if you prefer

@@ -61,7 +61,6 @@ public static async Task TestPartialReads(string zipFile, string zipFolder)
Stream clamped = new ClampedReadStream(stream, readSizeLimit: 1);

IsZipSameAsDir(clamped, zfolder(zipFolder), ZipArchiveMode.Read, requireExplicit: true, checkTimes: true);
Assert.False(clamped.CanRead, "Clamped stream should be closed at this point"); //check that it was closed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about this and since you're using your own custom clamped stream, You could override Dispose so that it sets the value of a public property called IsDisposed (besides also doing the usual disposing tasks) and then in this test you can check the value of that property to see if we disposed it. But it's ok as it is currently. Just a note for the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-System.IO.Compression community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Replace ZipArchive calls of stream.Read with stream.ReadExactly or stream.ReadAtLeast
2 participants