Description
I'm trying to see how far I can take my ~50 GB HDF5 datasets through my processing pipeline before explicitly creating an ndarray. My pipeline uses a framework (Neuropype) that puts the ndarray in a container along with some metadata and makes extensive use of ndarray methods that return views. I think I could get a lot further in this framework with my HDF5 datasets if a wrapper class like DatasetView around an h5py Dataset reimplemented some of those ndarray methods that return views.
Are there any downsides to renaming lazy_transpose to transpose?
Do you foresee any problems with a lazy implementation of reshape?
I'm also considering a custom implementation of squeeze.
NumPy users expect flatten() to return a copy, so probably not that one.
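To make the transpose/squeeze question concrete, here is a minimal sketch of how a metadata-only transpose could work. `LazyView`, its `axis_order` attribute, and the slices-only `__getitem__` are all hypothetical names and simplifications for illustration, not part of any existing API; the wrapped `dataset` only needs numpy-style slicing and a `.shape`, so an h5py Dataset qualifies and a plain ndarray stands in here:

```python
import numpy as np

class LazyView:
    """Hypothetical sketch: record an axis permutation without
    reading any data; apply it only when a slice is requested."""

    def __init__(self, dataset, axis_order=None):
        # `dataset` is any array-like with .shape and slice indexing
        # (e.g. an h5py Dataset; a plain ndarray works for testing).
        self.dataset = dataset
        self.axis_order = (tuple(axis_order) if axis_order is not None
                           else tuple(range(len(dataset.shape))))

    @property
    def shape(self):
        return tuple(self.dataset.shape[a] for a in self.axis_order)

    def transpose(self, *axes):
        # No data is read: only the recorded axis order changes.
        if not axes:
            axes = tuple(reversed(range(len(self.axis_order))))
        return LazyView(self.dataset,
                        tuple(self.axis_order[a] for a in axes))

    def __getitem__(self, key):
        # Simplification: supports slice indexing only (no ints).
        if not isinstance(key, tuple):
            key = (key,)
        key = key + (slice(None),) * (len(self.axis_order) - len(key))
        # Map each requested slice back to the dataset's native axis.
        native_key = [slice(None)] * len(self.axis_order)
        for view_axis, k in enumerate(key):
            native_key[self.axis_order[view_axis]] = k
        # Read only the requested region, then reorder axes in memory.
        return self.dataset[tuple(native_key)].transpose(self.axis_order)

data = np.arange(24).reshape(2, 3, 4)
v = LazyView(data).transpose(2, 0, 1)
print(v.shape)  # (4, 2, 3) -- computed from metadata, no read
```

A lazy squeeze could follow the same pattern by dropping length-1 axes from the recorded metadata; reshape seems harder, since an arbitrary reshape of a chunked dataset can't always be mapped back to a contiguous hyperslab read.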
What about min, max, argmin, argmax, any, and all when an axis is provided? Even though all of the data will eventually have to be loaded into memory, it can be done sequentially, row by row (or column by column), so this might help avoid out-of-memory errors. I am fairly new to processing disk-cached data, so I'm hoping others with more experience can tell me whether this is a bad idea from the outset.
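For the axis reductions, the row-by-row idea could look like the sketch below: stream blocks of rows along the first dimension and either fold partial results together (reducing over axis 0) or concatenate per-block results (reducing over a later axis). `streaming_max` and `chunk_rows` are hypothetical names, and a plain ndarray again stands in for an h5py Dataset:

```python
import numpy as np

def streaming_max(dataset, axis, chunk_rows=1024):
    """Hypothetical sketch: max along `axis` without loading the whole
    dataset, reading `chunk_rows` rows at a time along axis 0."""
    n = dataset.shape[0]
    running = None   # folded partial result, used when axis == 0
    pieces = []      # per-block results, used when axis != 0
    for start in range(0, n, chunk_rows):
        block = np.asarray(dataset[start:start + chunk_rows])
        if axis == 0:
            # Fold this block's column maxima into the running result.
            partial = block.max(axis=0)
            running = partial if running is None else np.maximum(running, partial)
        else:
            # Reducing a later axis is independent per row-block.
            pieces.append(block.max(axis=axis))
    return running if axis == 0 else np.concatenate(pieces, axis=0)
```

min, any, and all combine partials the same way (np.minimum, logical or, logical and); argmin/argmax over axis 0 would additionally need to carry running indices and offset each block's local indices by `start`, so they take a bit more bookkeeping.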