v3.1.0 Delayed Dask Readers
To support large file reading and parallel image processing, we have converted all image readers over to using dask for data management.
What this means for the user:
- No breaking changes when using the 3.0.* API (AICSImage.data, AICSImage.get_image_data, imread): all still return the full image file read as a numpy.ndarray.
- New properties and functions for dask-specific handling (AICSImage.dask_data, AICSImage.get_image_dask_data, imread_dask) return delayed dask arrays (dask.array.core.Array).
- When using any of the dask properties or functions, data will not be read until requested. If you want just the first channel of an image, AICSImage.get_image_dask_data("STZYX", C=0) returns a five-dimensional dask array that reads only that channel when computed, instead of reading the entire image and then selecting the data down.
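
The delayed behavior described above can be illustrated with plain dask, independent of any image file. This is a minimal sketch, not the aicsimageio implementation: read_channel, lazy_image, and loaded_channels are hypothetical names standing in for a per-channel file read, and the array shapes are arbitrary.

```python
import dask
import dask.array as da
import numpy as np

# Hypothetical stand-in for a per-channel file read.
# It records which channels were actually loaded.
loaded_channels = []


def read_channel(index, shape=(4, 16, 16)):
    loaded_channels.append(index)
    return np.full(shape, index, dtype=np.uint16)


def lazy_image(n_channels=3, shape=(4, 16, 16)):
    # One delayed task per channel; nothing is read while the array is built.
    planes = [
        da.from_delayed(
            dask.delayed(read_channel)(c, shape), shape=shape, dtype=np.uint16
        )
        for c in range(n_channels)
    ]
    return da.stack(planes, axis=0)  # dimension order: CZYX


image = lazy_image()
first_channel = image[0]  # still lazy: only the task graph is sliced
assert loaded_channels == []  # no reads have happened yet

result = first_channel.compute()  # triggers the read for channel 0 only
```

After compute() returns, loaded_channels contains only channel 0, because dask drops the tasks for the channels that were never requested.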
A single breaking change:
- We no longer support handing in file pointers or buffers.
If you want multiple workers to read or process the image, the context manager for AICSImage and all Reader classes now spawns or connects to a Dask cluster and client for the duration of the context manager. If you want to keep the cluster and client open for longer than a single image, use the context manager exposed from dask_utils.cluster_and_client.
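
As a rough sketch of sharing one cluster across several images: the function name max_project_many and the file paths passed to it are hypothetical, and the code assumes that dask_utils.cluster_and_client yields a (cluster, client) pair. Check both against the installed version before relying on it.

```python
def max_project_many(paths):
    """Max-project channel 0 of each image along Z using one shared cluster."""
    # Imported inside the function so this module loads without aicsimageio.
    from aicsimageio import AICSImage, dask_utils

    projections = []
    # Assumption: the context manager yields a (cluster, client) pair and
    # shuts both down when the block exits.
    with dask_utils.cluster_and_client() as (cluster, client):
        for path in paths:
            img = AICSImage(path)
            # Lazy five-dimensional STZYX array for the first channel.
            channel = img.get_image_dask_data("STZYX", C=0)
            # Axis 2 is Z in "STZYX"; compute() runs on the shared cluster.
            projections.append(channel.max(axis=2).compute())
    return projections
```

Calling max_project_many(["1.tiff", "2.tiff"]) would then process both files within a single cluster lifetime, rather than starting and stopping a cluster per image.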
Extras:
napari has been added directly as an "interactive" dependency and, if it is installed, the AICSImage.view_napari function is available for use. This function will launch a napari viewer with some default settings that we find to be good for viewing the data that aicsimageio generally interacts with.