current versions of pyfive (0.5-0.9) assume a v1 btree for chunked data and never check

In [DatasetID](https://github.com/NCAS-CMS/pyfive/blob/29b2b599cd526ce1b63df3ac7da92a9870077143/pyfive/h5d.py#L16) where we build the chunk index for chunked data:

https://github.com/NCAS-CMS/pyfive/blob/29b2b599cd526ce1b63df3ac7da92a9870077143/pyfive/h5d.py#L308-L309

We just assume a V1 btree, and do nothing to check it. This is probably a regression that I introduced because we had no data with any other kind of chunk layout.  

I've labelled this as a bug even though we don't yet have an exemplar of this failing in anger, because I can't believe we won't get one soon.

As a bare minimum we need to check what kind of chunk index is in place and do something sensible. The check may need to be on a per-variable basis: even if the b-tree type is fixed for a file with V1 and V2 layouts, it is not for V3. Open questions: can we use V2 b-trees for data chunks? Can we raise an error for anything else right now, or should we just fail over to metadata only?
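As a rough sketch of the "check and raise" option (not pyfive code): the function names, the exception class, and the string labels below are all hypothetical. The mapping of index-type codes assumes my reading of the HDF5 file format specification, where a version 4 data layout message carries a chunk indexing type field and earlier layout message versions imply a v1 b-tree; that should be verified against the spec before anything like this goes in.

```python
# Hypothetical sketch: none of these names exist in pyfive.
# Assumption (to be checked against the HDF5 format spec): layout message
# versions 1-3 imply a v1 b-tree; version 4 stores an explicit index type.
V4_INDEX_TYPES = {
    1: "single_chunk",
    2: "implicit",
    3: "fixed_array",
    4: "extensible_array",
    5: "btree_v2",
}


class UnsupportedChunkIndexError(NotImplementedError):
    """Raised when a dataset uses a chunk index we cannot read yet."""


def identify_chunk_index(layout_version, index_type_code=None):
    """Return a label for the chunk index implied by the layout message."""
    if layout_version in (1, 2, 3):
        return "btree_v1"
    if layout_version == 4 and index_type_code in V4_INDEX_TYPES:
        return V4_INDEX_TYPES[index_type_code]
    raise ValueError(
        f"unrecognised layout version {layout_version!r} "
        f"or chunk index type {index_type_code!r}"
    )


def check_chunk_index_supported(layout_version, index_type_code=None,
                                supported=("btree_v1",)):
    """Fail loudly instead of silently assuming a v1 b-tree."""
    kind = identify_chunk_index(layout_version, index_type_code)
    if kind not in supported:
        raise UnsupportedChunkIndexError(
            f"chunk index type {kind!r} is not supported yet"
        )
    return kind
```

A caller that prefers the "metadata only" fallback could catch `UnsupportedChunkIndexError` at the point where the chunk index is built and defer the failure until data is actually requested.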

We need a test file with other kinds of chunk layout ... (and there are quite a few, as appendix C suggests). A priority (for me and many of us) would be anything that NetCDF could create, even if it is rare.


(Note that the actual way we build the chunk index will change in a pull request due in the next couple of days, which addresses #134 and #135, but this issue will still persist after that change.)

current versions of pyfive (0.5-0.9) assume a v1 btree for chunked data and never check #137

	chunk_btree = BTreeV1RawDataChunks(
	dataobject.fh, dataobject._chunk_address, dataobject._chunk_dims)
