-
Notifications
You must be signed in to change notification settings - Fork 173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spanify various methods #406
Conversation
using var inflaterStream = new DeflateStream(new MemoryStream(compressedProfile), CompressionMode.Decompress); | ||
iccDirectory = new IccReader().Extract(new IndexedCapturingReader(inflaterStream)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now reads the entire compressed profile upfront, which may result in an additional allocation. This should be offset with the reduced indirection / direct access to the underlying buffer.
@@ -36,17 +36,17 @@ public static bool IsValid(byte[] bplist) | |||
return true; | |||
} | |||
|
|||
public static PropertyListResults Parse(byte[] bplist) | |||
public static PropertyListResults Parse(ReadOnlySpan<byte> bplist) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General question: what do you think about passing ReadOnlySpan<byte>
as an in
parameter to avoid copies?
var iccBytes = profileSize <= 256 | ||
? stackalloc byte[profileSize] | ||
: new byte[profileSize]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to wrap this pattern up somehow, to:
- Have the constant in a single place
- Make it easy to use a pooled array when over the threshold
I haven't looked at this yet. We can file an issue for later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will tackle in this PR. Will look something like:
byte[]? rentedBuffer = null;
var iccBytes = profileSize <= 256
? stackalloc byte[profileSize]
: (rentedBuffer = ArrayPool.Shared.Rent(profileSize)).Slice(0, profileSize);
...
if (rentedBuffer != null)
{
ArrayPool.Shared.Return(rentedBuffer);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we can make the call site look like this:
using rental = BufferUtil.Rent(minimumSize: 1234);
var reader = new BufferReader(rental.Buffer);
I'm not familiar enough with ref struct
/scoped
here to know if this is possible but I think it is. I think we can also return a struct
disposable and the compiler won't box it to invoke Dispose
through the interface virtually.
If we had a pattern like that, we could look at making the stack allocation smarter. I read something somewhere about being able to test for stack space, which might mean the threshold could be dynamic. Even a threshold of 256 could blow the stack. However I don't know if such a capability exists, or if I just read a proposal for that.
public IccDirectory Extract(IndexedReader reader) | ||
public IccDirectory Extract(ReadOnlySpan<byte> bytes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly less flexible than before, as it requires a contiguous chunk of memory while the previous could work on a stream. Maybe that's ok. Just calling it out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Left a bunch of comments that are mostly for future stuff. If you want to tackle any of them here, but all means do, otherwise I'll file follow-up items. Let me know when you think this is ready and I'll take another look (the description is WIP so I assume you're still playing with it).
using var iccBuffer = profileSize <= 256 | ||
? new BufferScope(stackalloc byte[profileSize]) | ||
: new BufferScope(profileSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running this over the test suite shows a StackOverflowException
here. Were you able to get the regression suite working locally? Let me know if you need a hand with that.
The issue is that profileSize
is negative.
Perhaps something like:
using var iccBuffer = profileSize <= 256 | |
? new BufferScope(stackalloc byte[profileSize]) | |
: new BufferScope(profileSize); | |
using var iccBuffer = (uint)profileSize <= 256 | |
? new BufferScope(stackalloc byte[profileSize]) | |
: new BufferScope(profileSize); |
That's a clever and performant way to validate the value is also greater than zero, but it's a less common pattern and doesn't read very naturally.
Wish there was a better way to wrap this pattern up, but I can't see it, at least not for all target frameworks we have. For .NET 8 I think we could use inline arrays with something like this:
internal readonly ref struct BufferScope
{
private const int StackBufferSize = 256;
private readonly Buffer _buffer;
private readonly byte[]? _array;
private readonly Span<byte> _span;
public BufferScope(int size)
{
if (size < 0)
{
throw new ArgumentOutOfRangeException(nameof(size), size, "Cannot be negative.");
}
else if (size < StackBufferSize)
{
_span = MemoryMarshal.CreateSpan(ref Unsafe.As<Buffer, byte>(ref _buffer), size);
}
else
{
_array = ArrayPool<byte>.Shared.Rent(size);
_span = _array.AsSpan(0, size);
}
}
public readonly Span<byte> Span => _span;
public readonly void Dispose()
{
if (_array is null) return;
ArrayPool<byte>.Shared.Return(_array);
}
[InlineArray(StackBufferSize)]
private struct Buffer
{
[SuppressMessage("CodeQuality", "IDE0051:Remove unused private members")]
private byte _element0;
}
}
MetadataExtractor/IO/BufferScope.cs
Outdated
|
||
namespace MetadataExtractor.IO; | ||
|
||
internal ref struct BufferScope |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could add implicit conversion to a Span<byte>
with:
public static implicit operator Span<byte>(BufferScope bufferScope)
{
return bufferScope._span;
}
That could clean up the usages a little. I'd also consider naming this ScopedBuffer
as I think it's more of a buffer than a scope.
I'm seeing a few regressions in the image suite. One I called out above (stack overflow on negative size), and another seems to be in ICC data, as quite a few files now give:
Let me know if you need a hand getting the regression suite set up on your machine. If you're not touching the Java version, you only need to set up the .NET runner (and can comment out the call to run the Java lib too). |
@iamcarbon let me know if you'd like me to pick this up. No real rush, but I'd like to get 2.9 out in the next couple of weeks. |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
@drewnoakes OK, I got the regression suite working. Couple regressions I still need to figure out. I'm going to break this PR apart few simpler PRs, and may back out of on the ICC span changes since this changes some error messages. I'll get that done this week. |
Sounds good. I suggest also running the suite without your changes, just to baseline. It's possible that different environmental factors create a diff. I try to minimise that, but it's hard. |
Closing this in favor of smaller incremental PRs. |
@drewnoakes Ready for review.