FSE_compress returns 0 even when maxDstSize is still equal to or slightly larger than final compression size #90
Comments
This is expected behavior, even if badly documented.
The amount of additional buffer space is not specified, but the "+8" rule is about right.
Yeah, it's our internal machinery, which is absolutely unintuitive for library users. I think this issue may be considered a call for doc improvement. You could say that, to ensure compression succeeds, the user should allocate exactly the amount of memory computed by maxbound, and that the compressor can write beyond the size of the final compressed data for the sake of speed, "up to 8 bytes in the x.x version, but it may change in the future". PS: I forgot to say that this may apply to [your] other compression libraries as well.
I guess that this is not that big of a deal anyway: the chance of guessing the compressed size within 8 bytes is low, and if the caller decides to under-allocate (compared to what You'd only do this either because you are under tight memory constraints, or are dealing with data with a very predictable compression ratio (but again, less than 8 bytes of headroom is tight!). I'll add a warning in my code about that, and update the test to round up the allocated buffer. Should I close this issue, or do you want to keep it around in case you decide to update the doc at a later time?
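For anyone landing here with the same problem, rounding up the allocation could look roughly like the sketch below. The helper name `padded_dst_capacity` and the `DST_HEADROOM` constant are made up for illustration; the value 8 comes only from the observation in this thread, not from any documented guarantee. The result is capped at the worst-case bound, since allocating more than that is never needed.

```c
#include <stddef.h>

/* Assumption: 8 bytes of slack, taken from the "+8" observation in this
 * thread. The library does not document this value and it may change. */
#define DST_HEADROOM 8u

/* Hypothetical helper: capacity to allocate when the compressed size is
 * already known (or estimated). Never exceeds the worst-case bound. */
static size_t padded_dst_capacity(size_t expectedSize, size_t bound)
{
    /* Estimate at or within DST_HEADROOM of the bound: just use the bound. */
    if (expectedSize >= bound || bound - expectedSize <= DST_HEADROOM)
        return bound;
    return expectedSize + DST_HEADROOM;
}
```

With the numbers reported in this issue (compressed size 2,821, bound 4,833), this yields a capacity of 2,829 bytes.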
While adding more unit tests, I found a corner case when trying to compress a document with a `maxDstSize` which is exactly the expected compressed size (found by a previous compression attempt): in this test, `FSE_compress(..)` returns 0 (uncompressible data), while previously it was able to compress the same input (given a larger destination buffer).

I'm doing a two-step process:

1. Allocate a destination buffer of the maximum size (given by `FSE_compressBound`), compress the source by calling `FSE_compress(..., maxDstSize: FSE_compressBound(ORIGINAL_SIZE))`, and measure its final compressed size `COMPRESSED_SIZE`.
2. Compress the same source again by calling `FSE_compress(..., maxDstSize: COMPRESSED_SIZE)`. I get a result code of 0, which means that the data is uncompressible.

I tried probing the minimum size that will allow compressing the buffer (which is known to be compressible), and each time I need to call `FSE_compress(..)` with at least `COMPRESSED_SIZE + 8`. At first I thought it could be a pointer alignment issue, but it is always + 8 bytes whatever the compressed size is (checked by changing the source a bit).

In my test, the raw original size is 4,288 bytes, `FSE_compressBound` returns 4,833 bytes, and the compressed size is 2,821 bytes. I need to pass at least 2,821 + 8 = 2,829 bytes for `FSE_compress` to succeed (return value > 0).

Is this expected behavior? I'm not sure if the "+8 rule" is true, or if this is random chance with the inputs I'm passing in.
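The probing described above can be sketched as follows. This is an illustration under stated assumptions, not the actual test code: `compress_fn` simply mirrors the argument order of `FSE_compress` (dst, maxDstSize, src, srcSize), and `fake_compress` is a stand-in that imitates the observed behavior (returns 0 unless 8 spare bytes are available) so the sketch runs without the library. `min_working_capacity` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Same argument order as FSE_compress(dst, maxDstSize, src, srcSize). */
typedef size_t (*compress_fn)(void* dst, size_t maxDstSize,
                              const void* src, size_t srcSize);

/* Stand-in compressor (NOT FSE): pretends every input shrinks to half its
 * size, and gives up (returns 0) unless dst has 8 spare bytes. */
static size_t fake_compress(void* dst, size_t maxDstSize,
                            const void* src, size_t srcSize)
{
    size_t const out = srcSize / 2;
    if (maxDstSize < out + 8) return 0;
    memcpy(dst, src, out);
    return out;
}

/* Step 1: compress with the full bound to learn the compressed size.
 * Step 2: retry with growing capacities, starting at that size.
 * Returns the smallest capacity that succeeds again, or 0 if none does. */
static size_t min_working_capacity(compress_fn compress,
                                   void* dst, size_t bound,
                                   const void* src, size_t srcSize)
{
    size_t const compressedSize = compress(dst, bound, src, srcSize);
    size_t cap;
    /* 0 means "not compressible"; a value above the bound is treated as an
     * error code. With the real library, also check FSE_isError(). */
    if (compressedSize == 0 || compressedSize > bound) return 0;
    for (cap = compressedSize; cap <= bound; cap++) {
        if (compress(dst, cap, src, srcSize) == compressedSize) return cap;
    }
    return 0;
}
```

To run the same probe against the library, pass `FSE_compress` as the callback and a buffer of `FSE_compressBound(srcSize)` bytes as `dst`.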