Exposing b-tree layout information #130

@bnlawrence


For CMIP7, we need a tool which can tell us whether or not the b-tree for a particular variable is contiguous. This can be done trivially (for any variable) with a very minor modification to pyfive that exposes information which is already being gathered for other purposes.

We propose to add a property method to the DatasetID which exposes a small dictionary of b-tree statistics:

    {'contiguous': bool, 'start_address': int, 'last_address': int}

The addresses could be trivially obtained in the existing build_index method, which already loops over all the addresses. Whether or not they are contiguous can be calculated by a simple comparison of the range of addresses and the number of entries.
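As a rough illustration of that comparison, here is a minimal standalone sketch. It is not pyfive code: the function name `btree_layout_stats` and the `node_size` parameter are hypothetical, and it assumes that every entry occupies the same known number of bytes and that entries never overlap, so a gap-free layout means the byte range covered equals the number of entries times that size.

```python
def btree_layout_stats(node_addresses, node_size):
    """Summarise where a set of fixed-size b-tree entries sits in a file.

    node_addresses: iterable of integer byte offsets (hypothetical input,
                    e.g. collected while an index is being built).
    node_size:      assumed size in bytes of every entry.
    """
    # Sort and de-duplicate so the first and last items bound the range.
    addresses = sorted(set(node_addresses))
    if not addresses:
        raise ValueError("no b-tree addresses supplied")

    start, last = addresses[0], addresses[-1]

    # Bytes spanned from the start of the first entry to the end of the last.
    span = last + node_size - start

    # With equal-sized, non-overlapping entries, there are no gaps exactly
    # when the span equals the total number of bytes the entries occupy.
    contiguous = span == len(addresses) * node_size

    return {
        'contiguous': contiguous,
        'start_address': start,
        'last_address': last,
    }
```

For example, under these assumptions `btree_layout_stats([100, 164, 228], 64)` reports a contiguous layout, while `btree_layout_stats([100, 164, 300], 64)` does not, because there is a gap before the last entry.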

This will be useful for anyone considering whether or not a given HDF5 file will play nicely on an object store (or behind an HTTP server that supports range GET requests).
