Optimise when we get access to b-tree by providing lazier view of datasets, access to b-tree location, and new p5dump #138
…ould we have a v3 layout in the future.
…or file or group attributes yet, or phony dimensions.
… implementation of p5dump functionality (#134). Unit tests are failing due to a desire to get closer to (but not exactly) what ncdump will do.
…on unchunked data and some tests to keep V happy.
I have not added any documentation yet. I figured I'd do that once we had agreed on the p5dump API and functionality, and on the b-tree range.
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #138      +/-   ##
==========================================
+ Coverage   74.66%   76.10%   +1.43%
==========================================
  Files          12       14       +2
  Lines        2712     2862     +150
  Branches      407      450      +43
==========================================
+ Hits         2025     2178     +153
+ Misses        576      561      -15
- Partials      111      123      +12
This could serve to validate using external tools, although I think I'm more in favor of checking with hardcoded values here, just to avoid a false positive if the external tool is wrong.

def test_get_chunk_info_chunked():
    # start lazy, then go real
    with pyfive.File(DATASET_CHUNKED_HDF5_FILE) as hfile, \
         h5py.File(DATASET_CHUNKED_HDF5_FILE) as h5f, \
         open(DATASET_CHUNKED_HDF5_FILE, "rb") as f:
        ds = hfile.get_lazy_view('dataset1')
        assert ds.id._DatasetID__index_built == False
        si = StoreInfo((0, 0), 0, 4016, 16)
        info = ds.id.get_chunk_info(0)
        assert info == si
        assert ds.id.get_num_chunks() == 88
        assert h5f["dataset1"].id.get_num_chunks() == 88
        assert h5f["dataset1"].id.get_chunk_info(0) == si
        assert ds.id.btree_range == (1072, 8680)
        f.seek(1072)
        assert f.read(4) == b"TREE"  # only v1 btrees
        f.seek(8680)
        assert f.read(4) == b"TREE"  # only v1 btrees
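If more tests end up repeating the seek-and-read check, it could be pulled into a small helper. The following is only a sketch: the name `offsets_hold_v1_btree_nodes` is hypothetical and not part of pyfive, and it assumes, as the test above does, that each reported offset points at the start of a version 1 B-tree node (signature `TREE`). It runs against a synthetic in-memory buffer rather than a real HDF5 file.

```python
import io

# Version 1 B-tree nodes in HDF5 begin with this 4-byte ASCII signature.
V1_BTREE_SIGNATURE = b"TREE"


def offsets_hold_v1_btree_nodes(fileobj, offsets):
    """Return True if every offset starts with the v1 B-tree node signature.

    Hypothetical helper for illustration; `fileobj` is any seekable binary
    file-like object, `offsets` an iterable of absolute byte offsets.
    """
    for offset in offsets:
        fileobj.seek(offset)
        if fileobj.read(len(V1_BTREE_SIGNATURE)) != V1_BTREE_SIGNATURE:
            return False
    return True


# Synthetic stand-in for a file: two fake node signatures at offsets 8 and 40.
buf = bytearray(64)
buf[8:12] = V1_BTREE_SIGNATURE
buf[40:44] = V1_BTREE_SIGNATURE
fake_file = io.BytesIO(bytes(buf))

assert offsets_hold_v1_btree_nodes(fake_file, (8, 40))
assert not offsets_hold_v1_btree_nodes(fake_file, (8, 41))
```

In the test above, the two explicit `seek`/`read` pairs would then reduce to a single call with `ds.id.btree_range` as the offsets, while still keeping the hardcoded `(1072, 8680)` assertion as the independent check.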
@zequihg50 Thanks. That's an excellent addition and makes me far happier. I'll push that up in a minute!
Description
This pull request addresses three specific issues:
@zequihg50 Could you please have a look at this one too? Especially the b-tree range stuff and the related tests.
Checklist