Mosaics, Xarray, and Writers, Oh My!
AICSImageIO 4.0.0
We are happy to announce the release of AICSImageIO 4.0.0!
AICSImageIO is a library for image reading, metadata conversion, and image writing for microscopy formats in pure Python. It aims to be able to read microscopy images into a single unified API regardless of size, format, or location, while additionally writing images and converting metadata to a standard common format.
A lot has changed, and this post covers only the highlights. If you are new to the library, please see our full documentation for always up-to-date usage and our README for a quickstart.
Highlights
Mosaic Tile Stitching
For certain imaging formats we will stitch together the mosaic tiles stored in the image prior to returning the data. Currently the two formats supported by this functionality are LIF and CZI.
from aicsimageio import AICSImage
stitched_image = AICSImage("tiled.lif")
stitched_image.dims # very large Y and X
If you don't want the tiles to be stitched back together, you can turn this functionality off with reconstruct_mosaic=False in the AICSImage object init.
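To make the idea of stitching concrete, here is a minimal conceptual sketch using only NumPy. It is not the library's implementation: the stitch_tiles helper, the tile sizes, and the pixel origins are all assumptions made up for illustration.

```python
import numpy as np


def stitch_tiles(tiles, origins):
    """Paste 2D tiles onto one canvas at their (y, x) pixel origins."""
    height = max(y + t.shape[0] for t, (y, x) in zip(tiles, origins))
    width = max(x + t.shape[1] for t, (y, x) in zip(tiles, origins))
    canvas = np.zeros((height, width), dtype=tiles[0].dtype)
    for tile, (y, x) in zip(tiles, origins):
        # Later tiles overwrite earlier ones wherever they overlap.
        canvas[y:y + tile.shape[0], x:x + tile.shape[1]] = tile
    return canvas


# Four hypothetical 2x3 tiles in a 2x2 grid, each filled with its tile index.
tiles = [np.full((2, 3), i, dtype=np.uint8) for i in range(4)]
origins = [(0, 0), (0, 3), (2, 0), (2, 3)]

mosaic = stitch_tiles(tiles, origins)
print(mosaic.shape)
```

The stitched result has much larger Y and X extents than any single tile, which is why the dims of a reconstructed mosaic are so large.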
Xarray
The entire library now rests on xarray. We store all metadata, coordinate planes, and imaging data in a single xarray.DataArray object. This allows not only selecting by indices with the existing aicsimageio methods get_image_data and get_image_dask_data, but also selecting by coordinate planes.
Mosaic Images and Xarray Together
A prime example of where xarray can be incredibly powerful for microscopy images is in large mosaic images. For example, instead of selecting by index, with xarray you can select by spatial coordinates:
from aicsimageio import AICSImage
# Read and stitch together a tiled image
large_stitched_image = AICSImage("tiled.lif")
# Only load in-memory the first 300x300 micrometers (or whatever unit) in Y and X dimensions
large_stitched_image.xarray_dask_data.loc[:, :, :, :300, :300].data.compute()
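The difference between index-based and coordinate-based selection can be seen without any image file. The sketch below builds a plain xarray.DataArray by hand; the 0.5 micrometer pixel size and the array shape are assumptions chosen for illustration, not values the library produces.

```python
import numpy as np
import xarray as xr

pixel_size_um = 0.5  # assumed physical pixel size, in micrometers

# A blank TCZYX image with physical coordinates attached to Y and X.
image = xr.DataArray(
    np.zeros((1, 1, 1, 1000, 1000), dtype=np.uint16),
    dims=list("TCZYX"),
    coords={
        "Y": np.arange(1000) * pixel_size_um,
        "X": np.arange(1000) * pixel_size_um,
    },
)

# Index-based selection: the first 300 pixels in Y and X.
by_index = image.isel(Y=slice(0, 300), X=slice(0, 300))

# Coordinate-based selection: everything up to 300 micrometers in Y and X.
# Label-based slices in xarray include the end label when it exists.
by_coord = image.loc[:, :, :, :300, :300]

print(by_index.shape)  # (1, 1, 1, 300, 300)
print(by_coord.shape)  # (1, 1, 1, 601, 601)
```

With 0.5 micrometer pixels, 300 micrometers covers 601 pixels (coordinates 0.0 through 300.0), so the two selections return different amounts of data.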
Writers
While we are actively working on metadata translation functions, in the meantime we are setting ourselves up to make format conversion as easy as possible. For now, we have added a simple save function to the AICSImage object that will convert the pixel data and key pieces of metadata to OME-TIFF for all of (or a select set of) scenes in the file.
from aicsimageio import AICSImage
AICSImage("my_file.czi").save("my_file.ome.tiff")
For users who want greater flexibility and specificity in image writing, we have entirely reworked our writers module, and this release contains three writers: TwoDWriter, TimeseriesWriter, and OmeTiffWriter. The objective of this rework was to make n-dimensional image writing much easier and, most importantly, to make metadata attachment as simple as possible. We also now support multi-scene / multi-image OME-TIFF writing when providing a List[ArrayLike]. See full writer documentation here.
OME Metadata Validation
Our OmeTiffReader and OmeTiffWriter now validate the read or produced OME metadata. Thanks to the ome-types library for the heavy lifting, we can now ensure that all metadata produced by this library is valid to the OME specification.
Better Scene Management
Over time while working with 3.x, we found that many formats contain a Scene or Image dimension that can be entirely different from the other instances of that dimension in pixel type, channel naming, image shape, etc. To solve this we have changed the AICSImage and Reader objects to statefully manage Scene while all other dimensions are still available.
In practice, this means on the AICSImage and Reader objects the user no longer receives the Scene dimension back in the data or dimensions properties (or any other related function or property).
To change scenes while operating on a file, call AICSImage.set_scene(scene_id). To see which scenes are valid, use AICSImage.scenes.
from aicsimageio import AICSImage
many_scene_img = AICSImage("my_file.ome.tiff")
many_scene_img.current_scene # the current operating scene
many_scene_img.scenes # returns tuple of available scenes
many_scene_img.set_scene("Image:2") # sets the current operating scene to "Image:2"
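To illustrate why scenes are managed statefully rather than exposed as an array dimension, here is a toy stand-in written with NumPy only. ToySceneImage is hypothetical and is not the aicsimageio class; it only mirrors the scenes, current_scene, and set_scene names described above, and the scene ids and shapes are made up.

```python
import numpy as np


class ToySceneImage:
    """Hypothetical stand-in (not the aicsimageio class) with stateful scenes."""

    def __init__(self, scene_data):
        # Mapping of scene id -> array; insertion order defines scene order.
        self._scene_data = dict(scene_data)
        self.scenes = tuple(self._scene_data)
        self.current_scene = self.scenes[0]

    def set_scene(self, scene_id):
        if scene_id not in self._scene_data:
            raise IndexError(f"{scene_id!r} is not one of {self.scenes}")
        self.current_scene = scene_id

    @property
    def data(self):
        # Only the current scene is exposed, so there is no Scene axis.
        return self._scene_data[self.current_scene]


# Two scenes that could not be stacked: different shapes and pixel types.
img = ToySceneImage({
    "Image:0": np.zeros((1, 2, 1, 64, 64), dtype=np.uint16),  # TCZYX
    "Image:1": np.zeros((1, 3, 5, 32, 48), dtype=np.uint8),   # TCZYX
})

shapes = {}
for scene_id in img.scenes:
    img.set_scene(scene_id)
    shapes[scene_id] = (img.data.shape, str(img.data.dtype))
print(shapes)
```

Because each scene can have its own shape and pixel type, looping over scenes and switching the current one is the natural access pattern.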
RGB / BGR Support
Due to the scene management changes, we no longer use the "S" dimension to represent "Scene". We use it to represent the "Samples" dimension (RGB / BGR) which means we now have an isolated dimension for color data. This is great because it allows us to directly support multi-channel RGB data, where previously we would expand RGB data into channels, even when the file had a channel dimension.
So if you encounter a file with "S" in the dimensions, you can know that you are working with an RGB file.
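Downstream code can branch on whether the dimension order string ends in "S". The sketch below uses NumPy only; collapse_samples is a hypothetical helper written for this example, and a plain mean over the samples is just one simple way to reduce color data, not something the library does for you.

```python
import numpy as np


def collapse_samples(data, dim_order):
    """Average over the trailing Samples axis if present, else pass through."""
    if dim_order.endswith("S"):
        # Samples is the last dimension, so reduce over axis -1.
        return data.mean(axis=-1), dim_order[:-1]
    return data, dim_order


rgb = np.zeros((2, 4, 5, 3), dtype=np.uint8)  # "CYXS": two channels of RGB
grey = np.zeros((2, 4, 5), dtype=np.uint8)    # "CYX": no Samples dimension

rgb_out, rgb_order = collapse_samples(rgb, "CYXS")
grey_out, grey_order = collapse_samples(grey, "CYX")
print(rgb_out.shape, rgb_order)
print(grey_out.shape, grey_order)
```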
FSSpec Adoption
Across the board we have adopted fsspec for file handling. With 4.x you can now provide any URI supported by the fsspec library or any implementations of fsspec (s3fs (AWS S3), gcsfs (Google Cloud Storage), adlfs (Azure Data Lake), etc.).
In many cases, this means we now support direct reading from local or remote data as a base part of our API. It also prepares us for supporting OME-Zarr!
from aicsimageio import AICSImage
wb_img = AICSImage("https://www.your-site.com/your-file.ome.tiff")
s3_img = AICSImage("s3://your-bucket/your-dir/your-file.ome.tiff")
gs_img = AICSImage("gs://your-bucket/your-dir/your-file.ome.tiff")
To read from remote storage, you must install the related fsspec implementation library. For s3, for example, you must install s3fs.
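As a rough aid for working out which extra package a URI needs, the following standard-library sketch maps a URI scheme to the implementation packages named above. The required_package helper and the BACKENDS table are hypothetical, and the protocol names listed are, to my understanding, the ones those packages register with fsspec; treat the table as an assumption to check against the fsspec documentation.

```python
from urllib.parse import urlsplit

# Assumed protocol -> package mapping, limited to the backends named above.
BACKENDS = {
    "s3": "s3fs",
    "gs": "gcsfs",
    "gcs": "gcsfs",
    "abfs": "adlfs",
    "az": "adlfs",
    "adl": "adlfs",
}


def required_package(uri):
    """Return the package this sketch knows about for a URI, or None."""
    scheme = urlsplit(uri).scheme.lower()
    # Local paths and schemes outside the table are not covered here.
    return BACKENDS.get(scheme)


for uri in (
    "s3://your-bucket/your-dir/your-file.ome.tiff",
    "gs://your-bucket/your-dir/your-file.ome.tiff",
    "/local/dir/your-file.ome.tiff",
):
    print(uri, "->", required_package(uri))
```

Note that this only inspects the URI string; it does not check whether the package is actually installed.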
Splitting up Dependencies
To reduce the size of AICSImageIO on fresh installs and to make it easier to manage environments, we have split up dependencies into specific format installations.
By default, aicsimageio supports TIFF and OME-TIFF reading and writing. If you would like to install support for reading CZI files, simply append [czi] to the aicsimageio pip install: pip install aicsimageio[czi]. For multiple formats, add them as a comma-separated list: pip install aicsimageio[czi,lif]. And to install support for reading all format implementations: pip install aicsimageio[all].
A full list of supported formats can be found here.
Roadmap and Motivation
After many discussions with the community we have finally written a roadmap and accompanying documentation for the library. If you are interested, please feel free to read them: Roadmap and Accompanying Documentation.
Full List of Changes
- Added support for reading all image data (including metadata and coordinate information) into xarray.DataArray objects. This can be accessed with AICSImage.xarray_data, AICSImage.xarray_dask_data, or Reader equivalents. Where possible, we create and attach coordinate planes to the xarray.DataArray objects to support more options for indexed data selection such as timepoint selection by unit of time, or pixel selection by micrometers.
- Added support for reading multi-channel RGB imaging data utilizing a new Samples (S) dimension. This dimension will only appear in the AICSImage, Reader, and the produced np.ndarray, da.Array, xr.DataArray if present in the file and will be the last dimension, i.e. if provided an RGB image, AICSImage and related object dimensions will be "...YXS"; if provided a single sample / greyscale image, AICSImage and related object dimensions will be "...YX". (This change also applies to DefaultReader and PNG, JPG, and similar formats.)
- OmeTiffReader now validates the found XML metadata against the referenced specification. If your file has invalid OME XML, this reader will fail and fall back to the base TiffReader. (In the process of updating this we found many bugs in our 3.x series OmeTiffWriter. Our 4.x OmeTiffReader fixes these bugs at read time, but that doesn't mean the file contains valid OME XML. It is recommended to upgrade and start using the new, improved, and validated OmeTiffWriter.)
- DefaultReader now fully supports reading "many-image" formats such as GIF, MP4, etc.
- OmeTiffReader.metadata is now returned as the OME object from ome-types. This change additionally removes the vendor.OMEXML object.
- Dimensions received an overhaul: when you use AICSImage.dims or Reader.dims you will be returned a Dimensions object. Using this object you can get the native order of the dimensions and each dimension's size through attributes, i.e. AICSImage.dims.X returns the size of the X dimension, AICSImage.dims.order returns the string native order of the dimensions such as "TCZYX". Due to these changes we have removed all size functions and size_{dim} properties from various objects.
- Replaced function get_physical_pixel_size with attribute physical_pixel_sizes that returns a new PhysicalPixelSizes NamedTuple object. This now allows attribute access by axis for each physical dimension, i.e. PhysicalPixelSizes.X returns the size of each X dimension pixel. This object can be cast to a base tuple, but be aware that we have reversed the order from XYZ to ZYX to match the rest of the library's standard dimension order. Additionally, PhysicalPixelSizes now defaults to (None, None, None) (was (1.0, 1.0, 1.0)) as pixel physical size is optional in the OME metadata schema.
- Replaced parameters named known_dims with dim_order.
- Replaced function get_channel_names with attribute channel_names.
- Replaced function dtype with attribute dtype.
- Renamed the chunk_by_dims parameter on all readers to just chunk_dims.
- Removed all distributed cluster and client spawning.
- Removed context manager usage from all objects.
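The PhysicalPixelSizes behavior described in the list above (attribute access by axis, ZYX order when cast to a tuple, and None for missing values) can be mimicked with a standard-library NamedTuple. PixelSizesSketch below is an illustrative stand-in, not the library's definition, and the pixel size values are made up.

```python
from typing import NamedTuple, Optional


class PixelSizesSketch(NamedTuple):
    """Illustrative stand-in for PhysicalPixelSizes; fields are in ZYX order."""

    Z: Optional[float] = None
    Y: Optional[float] = None
    X: Optional[float] = None


sizes = PixelSizesSketch(Z=1.0, Y=0.108, X=0.108)
print(sizes.X)       # attribute access by axis
print(tuple(sizes))  # plain tuple, ordered Z, Y, X

# When no physical size is known, every axis is None rather than 1.0.
unknown = PixelSizesSketch()
print(tuple(unknown))
```

Code written against 3.x that unpacked the result as X, Y, Z, or that assumed a numeric 1.0 default, would need to account for both the reversed order and the possible None values.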
Contributors and Reviewers this Release (alphabetical)
Jackson Maxfield Brown (@JacksonMaxfield)
Ramón Casero (@rcasero)
Julie Cass (@jcass11)
Jianxu Chen (@jxchen01)
Bryant Chhun (@bryantChhun)
Rory Donovan-Maiye (@donovanr)
Christoph Gohlke (@cgohlke)
Josh Moore (@joshmoore)
Sebastian Rhode (@sebi06)
Jamie Sherman (@heeler)
Madison Swain-Bowden (@AetherUnbound)
Dan Toloudis (@toloudis)
Matheus Viana (@vianamp)