Mosaics, Xarray, and Writers, Oh My!

@evamaxfield released this 07 Jun 16:03

AICSImageIO 4.0.0

We are happy to announce the release of AICSImageIO 4.0.0!

AICSImageIO is a library for image reading, metadata conversion, and image writing for microscopy formats in pure Python. It aims to be able to read microscopy images into a single unified API regardless of size, format, or location, while additionally writing images and converting metadata to a standard common format.

A lot has changed and this post will only have the highlights. If you are new to the library, please see our full documentation for always up-to-date usage and a quickstart README.

Highlights

Mosaic Tile Stitching

For certain imaging formats we will stitch together the mosaic tiles stored in the image prior to returning the data. Currently the two formats supported by this functionality are LIF and CZI.

from aicsimageio import AICSImage

stitched_image = AICSImage("tiled.lif")
stitched_image.dims  # very large Y and X

If you don't want the tiles to be stitched back together you can turn this functionality off with reconstruct_mosaic=False in the AICSImage object init.
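As a rough illustration of what "stitching" means here, the sketch below places equally sized tiles onto one larger canvas by their grid position. It is a conceptual toy in plain Python, not the library's implementation: the `stitch_tiles` function and the `(tile_row, tile_col)` keys are hypothetical, and real LIF and CZI mosaics may store arbitrary tile positions and overlaps.

```python
from typing import Dict, List, Tuple

Tile = List[List[int]]  # a 2D tile stored as rows of pixel values


def stitch_tiles(tiles: Dict[Tuple[int, int], Tile]) -> List[List[int]]:
    """Place equally sized tiles on a grid keyed by (tile_row, tile_col)."""
    first = next(iter(tiles.values()))
    tile_h, tile_w = len(first), len(first[0])
    n_rows = max(r for r, _ in tiles) + 1
    n_cols = max(c for _, c in tiles) + 1

    # Start from an empty canvas large enough to hold every tile
    canvas = [[0] * (n_cols * tile_w) for _ in range(n_rows * tile_h)]
    for (tile_row, tile_col), tile in tiles.items():
        x0 = tile_col * tile_w
        for y, row in enumerate(tile):
            y0 = tile_row * tile_h + y
            canvas[y0][x0:x0 + tile_w] = row
    return canvas


# Four 2x2 tiles become one 4x4 image
tiles = {
    (0, 0): [[1, 1], [1, 1]],
    (0, 1): [[2, 2], [2, 2]],
    (1, 0): [[3, 3], [3, 3]],
    (1, 1): [[4, 4], [4, 4]],
}
stitched = stitch_tiles(tiles)
```

With reconstruct_mosaic=False you would instead keep working with the individual tiles rather than one large Y and X plane.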

Xarray

The entire library now rests on xarray. We store metadata, coordinate planes, and imaging data together in a single xarray.DataArray object. This lets you select data not only by index with the existing aicsimageio methods get_image_data and get_image_dask_data, but also by coordinate planes.

Mosaic Images and Xarray Together

A prime example of where xarray can be incredibly powerful for microscopy images is in large mosaic images. For example, instead of selecting by index, with xarray you can select by spatial coordinates:

from aicsimageio import AICSImage

# Read and stitch together a tiled image
large_stitched_image = AICSImage("tiled.lif")

# Only load in-memory the first 300x300 micrometers (or whatever unit) in Y and X dimensions
large_stitched_image.xarray_dask_data.loc[:, :, :, :300, :300].data.compute()
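To make the difference between index-based and coordinate-based selection concrete, here is a small sketch in plain Python. It assumes evenly spaced coordinates and a hypothetical pixel size of 0.5 micrometers; `pixels_within` is an illustrative helper, not part of aicsimageio or xarray. It mimics the fact that label-based slices in xarray include the end label.

```python
from bisect import bisect_right
from typing import List


def pixels_within(coords: List[float], upper: float) -> int:
    """Count leading pixels whose coordinate is <= upper (end label included)."""
    return bisect_right(coords, upper)


pixel_size_um = 0.5  # hypothetical physical pixel size in micrometers
n_pixels = 2000  # hypothetical image width in pixels

# Coordinate plane for X: the physical position of every pixel column
x_coords = [i * pixel_size_um for i in range(n_pixels)]

# Selecting "up to 300 micrometers" covers far more than 300 pixels
n_selected = pixels_within(x_coords, 300)
```

Under these assumptions, `.loc[..., :300]` and `[..., :300]` select different amounts of data: the first is bounded by physical distance, the second by pixel count.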

Writers

While we are actively working on metadata translation functions, we are also setting ourselves up to make format conversion as easy as possible. For now, we have added a simple save function to the AICSImage object that will convert the pixel data and key pieces of metadata to OME-TIFF for all of (or a select set of) scenes in the file.

from aicsimageio import AICSImage

AICSImage("my_file.czi").save("my_file.ome.tiff")

For users who want greater flexibility and specificity in image writing, we have entirely reworked our writers module. This release contains three writers: TwoDWriter, TimeseriesWriter, and OmeTiffWriter. The objective of this rework was to make n-dimensional image writing much easier and, most importantly, to make metadata attachment as simple as possible. We also now support multi-scene / multi-image OME-TIFF writing when provided a List[ArrayLike]. See full writer documentation here.

OME Metadata Validation

Our OmeTiffReader and OmeTiffWriter now validate the read or produced OME metadata. Thanks to the ome-types library for the heavy lifting, we can now ensure that all metadata produced by this library is valid to the OME specification.

Better Scene Management

Over time while working with 3.x, we found that many formats contain a Scene or Image dimension that can be entirely different from the other instances of that dimension in pixel type, channel naming, image shape, etc. To solve this we have changed the AICSImage and Reader objects to statefully manage Scene while all other dimensions are still available.

In practice, this means on the AICSImage and Reader objects the user no longer receives the Scene dimension back in the data or dimensions properties (or any other related function or property).

To change scenes while operating on a file, call AICSImage.set_scene(scene_id). To see which scenes are valid, use AICSImage.scenes.

from aicsimageio import AICSImage

many_scene_img = AICSImage("my_file.ome.tiff")
many_scene_img.current_scene  # the current operating scene
many_scene_img.scenes  # returns tuple of available scenes
many_scene_img.set_scene("Image:2")  # sets the current operating scene to "Image:2"
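Because scenes can differ from one another, a common pattern is to visit each scene in turn and inspect it. The helper below is a minimal sketch that relies only on the scenes, set_scene, and dims.order members described in these notes; the function name `dimension_order_per_scene` is hypothetical and it accepts any object exposing those members.

```python
from typing import Any, Dict


def dimension_order_per_scene(img: Any) -> Dict[str, str]:
    """Map each scene id to its dimension order by switching scenes one at a time."""
    orders = {}
    for scene_id in img.scenes:
        # Scene is managed statefully, so switch before reading any property
        img.set_scene(scene_id)
        orders[scene_id] = img.dims.order
    return orders


# Intended usage (requires aicsimageio and a real file):
# from aicsimageio import AICSImage
# dimension_order_per_scene(AICSImage("my_file.ome.tiff"))
```

Note that the image is left on the last scene visited, so call set_scene again if later code expects a particular scene.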

RGB / BGR Support

Due to the scene management changes, we no longer use the "S" dimension to represent "Scene". We use it to represent the "Samples" dimension (RGB / BGR) which means we now have an isolated dimension for color data. This is great because it allows us to directly support multi-channel RGB data, where previously we would expand RGB data into channels, even when the file had a channel dimension.

So if you encounter a file with "S" in the dimensions, you can know that you are working with an RGB file.
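In code, that check can be as small as looking for "S" in the dimension order string. The sketch below assumes you pass in a string such as the one returned by AICSImage.dims.order; `has_samples_dimension` is a hypothetical helper name.

```python
def has_samples_dimension(dim_order: str) -> bool:
    """Return True when the dimension order includes the Samples (RGB / BGR) axis."""
    return "S" in dim_order


rgb_order = "TCZYXS"  # multi-channel RGB data
greyscale_order = "TCZYX"  # single sample per pixel

rgb_detected = has_samples_dimension(rgb_order)
greyscale_detected = has_samples_dimension(greyscale_order)
```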

FSSpec Adoption

Across the board we have adopted fsspec for file handling. With 4.x you can now provide any URI supported by the fsspec library or any implementations of fsspec (s3fs (AWS S3), gcsfs (Google Cloud Storage), adlfs (Azure Data Lake), etc.).

In many cases, this means we now support direct reading from local or remote data as a base part of our API. It also prepares us for supporting OME-Zarr!

from aicsimageio import AICSImage

wb_img = AICSImage("https://www.your-site.com/your-file.ome.tiff")
s3_img = AICSImage("s3://your-bucket/your-dir/your-file.ome.tiff")
gs_img = AICSImage("gs://your-bucket/your-dir/your-file.ome.tiff")

To read from remote storage, you must install the related fsspec implementation library. For example, to read from S3 you must install s3fs.
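One way to check up front which extra package a URI needs is to look at its scheme. The sketch below uses only the standard library. The mapping covers the implementations named above; the "abfs" scheme for adlfs is an assumption on our part, and `required_fsspec_package` is a hypothetical helper, not part of aicsimageio or fsspec.

```python
from typing import Optional
from urllib.parse import urlsplit

# URI scheme -> fsspec implementation package (the "abfs" entry is an assumption)
FSSPEC_PACKAGES = {
    "s3": "s3fs",
    "gs": "gcsfs",
    "abfs": "adlfs",
}


def required_fsspec_package(uri: str) -> Optional[str]:
    """Return the package name for the URI's scheme, or None if it is not in the map."""
    scheme = urlsplit(uri).scheme.lower()
    return FSSPEC_PACKAGES.get(scheme)


s3_package = required_fsspec_package("s3://your-bucket/your-dir/your-file.ome.tiff")
local_package = required_fsspec_package("my_file.czi")
```

A result of None only means the scheme is not in this small map, for instance a local path. It does not guarantee that no extra package is needed.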

Splitting up Dependencies

To reduce the size of AICSImageIO on fresh installs and to make it easier to manage environments, we have split up dependencies into specific format installations.

By default, aicsimageio supports TIFF and OME-TIFF reading and writing. If you would like to install support for reading CZI files, append [czi] to the aicsimageio pip install: pip install aicsimageio[czi]. For multiple formats, add them as a comma-separated list: pip install aicsimageio[czi,lif]. To install support for reading all format implementations: pip install aicsimageio[all].

A full list of supported formats can be found here.

Roadmap and Motivation

After many discussions with the community we have finally written a roadmap and accompanying documentation for the library. If you are interested, please feel free to read them. Roadmap and Accompanying Documentation

Full List of Changes

  • Added support for reading all image data (including metadata and coordinate information) into xarray.DataArray objects. This can be accessed with AICSImage.xarray_data, AICSImage.xarray_dask_data, or Reader equivalents. Where possible, we create and attach coordinate planes to the xarray.DataArray objects to support more options for indexed data selection such as timepoint selection by unit of time, or pixel selection by micrometers.
  • Added support for reading multi-channel RGB imaging data utilizing a new Samples (S) dimension. This dimension will only appear in the AICSImage, Reader, and the produced np.ndarray, da.Array, xr.DataArray if present in the file and will be the last dimension, i.e. if provided an RGB image, AICSImage and related object dimensions will be "...YXS", if provided a single sample / greyscale image, AICSImage and related object dimensions will be "...YX". (This change also applies to DefaultReader and PNG, JPG, and similar formats)
  • OmeTiffReader now validates the found XML metadata against the referenced specification. If your file has invalid OME XML, this reader will fail and fall back to the base TiffReader. (In the process of updating this we found many bugs in our 3.x series OmeTiffWriter. Our 4.x OmeTiffReader fixes these bugs at read time, but that doesn't mean the file contains valid OME XML. We recommend upgrading and using the new, improved, and validated OmeTiffWriter.)
  • DefaultReader now fully supports reading "many-image" formats such as GIF, MP4, etc.
  • OmeTiffReader.metadata is now returned as the OME object from ome-types. This change additionally removes the vendor.OMEXML object.
  • Dimensions received an overhaul -- when you use AICSImage.dims or Reader.dims you will be returned a Dimensions object. Using this object you can get the native order of the dimensions and each dimension's size through attributes, i.e. AICSImage.dims.X returns the size of the X dimension, AICSImage.dims.order returns the string native order of the dimensions such as "TCZYX". Due to these changes we have removed all size functions and size_{dim} properties from various objects.
  • Replaced function get_physical_pixel_size with attribute physical_pixel_sizes that returns a new PhysicalPixelSizes NamedTuple object. This allows attribute access for each physical dimension, i.e. PhysicalPixelSizes.X returns the size of each X dimension pixel. This object can be cast to a base tuple, but be aware that we have reversed the order from XYZ to ZYX to match the rest of the library's standard dimension order. Additionally, PhysicalPixelSizes now defaults to (None, None, None) (was (1.0, 1.0, 1.0)) as pixel physical size is optional in the OME metadata schema.
  • Replaced parameters named known_dims with dim_order.
  • Replaced function get_channel_names with attribute channel_names.
  • Replaced function dtype with attribute dtype.
  • Renamed the chunk_by_dims parameter on all readers to just chunk_dims.
  • Removed all distributed cluster and client spawning.
  • Removed context manager usage from all objects.
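
The physical pixel size change in the list above is the one most likely to silently affect existing code, because the tuple order flipped from XYZ to ZYX and the defaults became None. The sketch below uses a hypothetical stand-in NamedTuple, `PixelSizesSketch`, that mirrors the field order and defaults described in that item. It is not the library's PhysicalPixelSizes class, and the sizes are made up.

```python
from typing import NamedTuple, Optional


class PixelSizesSketch(NamedTuple):
    """Hypothetical stand-in mirroring the ZYX order and None defaults."""

    Z: Optional[float] = None
    Y: Optional[float] = None
    X: Optional[float] = None


sizes = PixelSizesSketch(Z=2.0, Y=0.5, X=0.5)

# Attribute access is unambiguous
x_size = sizes.X

# Casting to a plain tuple yields ZYX order, not the 3.x XYZ order
as_tuple = tuple(sizes)

# Missing physical sizes are None rather than 1.0, so check before doing math
unknown = PixelSizesSketch()
x_is_known = unknown.X is not None
```

Preferring attribute access over positional unpacking avoids depending on the order at all.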

Contributors and Reviewers this Release (alphabetical)

Jackson Maxfield Brown (@JacksonMaxfield)
Ramón Casero (@rcasero)
Julie Cass (@jcass11)
Jianxu Chen (@jxchen01)
Bryant Chhun (@bryantChhun)
Rory Donovan-Maiye (@donovanr)
Christoph Gohlke (@cgohlke)
Josh Moore (@joshmoore)
Sebastian Rhode (@sebi06)
Jamie Sherman (@heeler)
Madison Swain-Bowden (@AetherUnbound)
Dan Toloudis (@toloudis)
Matheus Viana (@vianamp)