-
Notifications
You must be signed in to change notification settings - Fork 4
Make chunk size configurable #13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR makes the chunk size configurable by introducing a new parameter and replacing the constant DEFAULT_CHUNK_SIZE with an instance variable in the WIM archive streaming logic.
- Updates the initializer to accept a configurable chunk size.
- Replaces all usages of DEFAULT_CHUNK_SIZE in chunk calculation with the new instance variable.
- Adjusts calculations in the _read method accordingly.
Comments suppressed due to low confidence (1)
dissect/archive/wim.py:444
- The attribute 'chuck_size' appears to be a typo. Consider renaming it to 'chunk_size' for clarity and consistency with the parameter name.
self.chuck_size = chunk_size
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #13 +/- ##
==========================================
+ Coverage 68.89% 68.92% +0.02%
==========================================
Files 9 9
Lines 1196 1197 +1
==========================================
+ Hits 824 825 +1
Misses 372 372
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Can move it to dissect.util. |
Will move and create a PR after this one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a unit test with an alternative cluster size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR makes the chunk size configurable in the WIM decompression logic and updates tests to support different chunk size configurations in preparation for upcoming POC contributions.
- Modified tests to parameterize different chunk sizes for verifying the header compression size.
- Added new fixtures in conftest.py to supply WIM files with various chunk sizes.
- Updated the WIM class to accept a configurable chunk_size and use it in chunk calculations.
Reviewed Changes
Copilot reviewed 3 out of 6 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
tests/test_wim.py | Parameterized test now asserts the header’s CompressionSize against different chunk sizes. |
tests/conftest.py | Added fixtures for test files representing different chunk sizes. |
dissect/archive/wim.py | Updated WIM class to compute chunk-related values using the configurable chunk_size. |
Files not reviewed (3)
- tests/data/test16k.wim.gz: Language not supported
- tests/data/test4k.wim.gz: Language not supported
- tests/data/test8k.wim.gz: Language not supported
Comments suppressed due to low confidence (1)
dissect/archive/wim.py:444
- The header’s CompressionSize field is expected to match the configured chunk_size (as asserted in tests), but there is no visible update to the header. Ensure that the header is updated to reflect self.chunk_size if that is the intended behavior.
self.chunk_size = chunk_size
This PR makes the
chunk_size
variable configurable in preparation for contributing POC code mentioned in fox-it/dissect.ntfs#41. Decompression code largely overlapped with the code here.We might want to consider moving this streaming class to
dissect.ntfs
(or somewhere else better suited) in the future.