Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Tiff Loader #336

Closed
wants to merge 12 commits into from
Closed

Conversation

ilan-gold
Copy link
Collaborator

Fixes #312.

This is a first step towards a few things related to the discussion in #325:

  1. Having a standardized tiff interface: The new base Tiff loader only uses "standard" methods and attributes, like dimensions and getRasterSize, and the OMETiffLoader only uses metadata for metadata-relevant things, like resolving size or formatting the metadata. There is no omexml just a generic metadata
  2. An ImageJ TIFF loader: This seems like an easy win using the above as a base - now this loader only needs to implement its "indexing" method in getImages and its raster size in getRasterSize.
  3. Standardizing Loader: Related to the first point, because the TiffLoader is much cleaned up, we should be able to create a base loader now.

@@ -65,7 +65,7 @@ export async function createLoader(
// Non-Bioformats6 pyramids use Image tags for pyramid levels and do not have offsets
// built in to the format for them, hence the ternary.
const {
omexml: { SizeZ, SizeT, SizeC },
metadata: { SizeZ, SizeT, SizeC },
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A getDimensionSizes method for the loaders seems appropriate, right? Instead of the above, we could implement something that does this:

const sizeZ = this.dimensions.find(dim => dim.field === 'z').values.length;
const sizeT = this.dimensions.find(dim => dim.field === 'time').values.length;
const sizeC = this.dimensions.find(dim => dim.field === 'channel').values.length;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully I have time to review this over the week but in general I want to move away from dimensions on loaders.

This field isn't used by any of Viv's layers and instead only for the UI to create loaderSelections. I'd rather have an openOMETiff(url) function that returns: { loader, metadata }. Rather than encapsulating the metadata within the loader object, not all pixel sources have labeled axes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically I just want the loader to describe the nd-source, but not contain additional metadata. getDimensionSizes seems like an appropriate method, but requiring all dimensions to be labeled feels like an error in our design. I'd rather the loader just have a shape property which describes the shape of the nd-array source. Then, the metadata could describe what those dimensions are:

const metadata = {
  axis_labels: ['time', 'channel', 'z', 'y', 'x'],
  ...other_metadata,
}

Or we could simplify the dimensions the drastically:

const dimensions = [
  'time',
  { name: 'channel', values: [/*...*/] },  // if there are values, it's an object, otherwise just a string
  'z',
  'y',
  'x'
];

I think the original spec is overly complicated with type. What is it used for? Additionally, we just use range(SizeT) to create values that are only used in the UI. Maybe it isn't possible for OME-TIFF..

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this but I have two questions/comments:

  1. How will serializeSelection and getImages work without any notion of the labels for dimensions? Will the loader still have this?
  2. Can we merge this independent of that since I am probably not the right person to implement this (you seem to have a very strong concept in mind) and this PR has value on its own in the interim since it both aligns our loaders to some degree and resolves a separate outstanding issue?

Copy link
Member

@manzt manzt Jan 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1.) I probably need some more time to think on it... starting to think labeled dimensions is a reasonable, but still we could simplify the current dimensions object a lot. Moving towards just labels for dimensions on the loaders (no type and maybe optional values) would be a good step in my opinion. Perhaps take some inspiration from tensorstore.

2.) Yes. I'll review tomorrow in the AM. This doesn't change much of the interfaces that already exist on the loaders, I just want to voice some hesitance towards adding additional methods to the loader interface.

Comment on lines +182 to +191
const DTYPE_LOOKUP = {
uint8: '<u1',
uint16: '<u2',
uint32: '<u4',
float: '<f4',
// TODO: we currently need to cast these dtypes to their uint counterparts.
int8: '<u1',
int16: '<u2',
int32: '<u4'
};
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems OME-TIFF specific but could be in a constants.js file?

Comment on lines +174 to +181
const metadata = new OMEXML(omexmlString);
const dimensions = dimensionsFromOMEXML(metadata);
const channelNames = metadata.getChannelNames();
const isRgb =
metadata.SamplesPerPixel === 3 ||
(channelNames.length === 3 && metadata.Type === 'uint8') ||
(metadata.SizeC === 3 && channelNames.length === 1 && metadata.Interleaved);
const isInterleaved = metadata.Interleaved;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be helpful to wrap this in something so it could be re-used for zarr as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly copied from the old constructor. This function creates a GeoTIFF object, and derives all other constructor arguments (firstImage, metadata, dimensions) from that source. Then additional fields are parsed in the OMETiff constructor from metadata (which itself is derived from the original tiff?). I don't see how inheritance is helping us here.

Metadata are going to be completely different for each class so getMetadata isn't going to return the same thing (which is what you want in the first place).

Comment on lines +141 to +144
const dimensionOrder = this.dimensions
.map(dim => dim.field[0])
.reverse()
.join('');
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would not be opposed to specifying this convention in the docs explicitly - the whole "dimensions" concept is really strong and should probably have a schema, docs etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My feeling is that dimensions is perhaps too strong of a concept and I'd like to re-evaluate the schema considering how we use it in practice. My opinion is that it is overly complicated, and we only use a minor subset of the available fields.

@ilan-gold ilan-gold marked this pull request as ready for review January 2, 2021 17:03
@ilan-gold ilan-gold requested a review from manzt January 2, 2021 17:03
@ilan-gold ilan-gold mentioned this pull request Jan 4, 2021
Copy link
Member

@manzt manzt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR was much larger than I anticipated. Overall I think I see the motivation, but I'm struggling to see what we are gaining here via this implementation. The diff is so substantial (not only are things moved around but a lot of logic has changed). It doesn't feel like we have gained much in terms of abstraction, and instead just shuffled around a ton of code.

My primary concern is that so much code has been moved around, that I worry about our lack of testing even more. How can we be sure that all previous examples still work with a change like this? For instance, I noticed some missing logic in the new OMETIFFLoader.

Comment on lines 67 to -68
const {
omexml: { SizeZ, SizeT, SizeC },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the pattern I'm talking about below that we can move away from. We have a utility that creates a loader and then we derive these properties immediately from the loader for use in the UI. I'm ok with this for right now, but let's think about moving metadata off of the loader.

Instead our utility could be along the lines: openTIFF(url) -> { loader, metadata }

Comment on lines +12 to +13
* @param {Boolean} args.isRgb Whether or not this tiff represents an rgb image.
* @param {Boolean} args.isInterleaved Whether or not this tiff represents an interleaved image.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General curiosity: what is the difference between these two? it seems like if an image is interleaved we can assume it is RGB/A (e.g. needs our BitmapLayer). If it isn't interleaved, then its up to the UI to determine colors to apply to each channel, no? So ultimately, isRgb here is conveying something about the rendering metadata and how colorValues prop should be set?

}) {
this.physicalSizes = physicalSizes;
// get first image's description, which contains OMEXML
this.metadata = metadata;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See, we are just directly assigning metadata as a class attribute and it isn't referenced at all by any class method.

* @returns {Object} Tiff Image object containing parsed IFD.
*/
// eslint-disable-next-line class-methods-use-this,no-unused-vars,no-empty-function
async getImages(loaderSelection, z) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default implementation should throw an error. This just returns a promise for now that resolves to undefined which might be hard to debug.

* @returns {Object} width: number, height: number
*/
// eslint-disable-next-line class-methods-use-this,no-unused-vars,no-empty-function
getRasterSize({ z }) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same. Throw error.

getRasterSize({ z }) {
const { width, height } = this;
const { metadata } = this;
const { SizeX: width, SizeY: height } = metadata;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only place where metadata is used other than getMetadata.

Comment on lines -356 to -357
if (data.BYTES_PER_ELEMENT === 2) {
return new Uint16Array(new Int16Array(data.buffer));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These diffs are really hard to read through. Where did this logic go? It's super hard to tell what got changed or if something didn't get copied.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the same logic as:

      const T = b === 1 ? Uint8Array : b === 2 ? Uint16Array : Uint32Array;
      data = rasters.data.map(r => new T(r));

Comment on lines -233 to -235
if (this.dtype === '<f4') {
// GeoTiff.js returns 32 bit uint when the tiff has 32 significant bits.
data = rasters.map(r => new Float32Array(r.buffer));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this logic in the new class? Really hard to read these diffs.

Comment on lines +21 to +30
tiff,
pool,
firstImage,
dimensions,
offsets,
metadata,
isRgb,
isInterleaved,
dtype,
physicalSizes
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a general comment. This is a huge constructor for a base class.

For example, do all Tiffs have physicalSizes? It seems like logic that used to be in the OMETIFFLoader constructor has been moved to createTiffLoader.

Comment on lines +174 to +181
const metadata = new OMEXML(omexmlString);
const dimensions = dimensionsFromOMEXML(metadata);
const channelNames = metadata.getChannelNames();
const isRgb =
metadata.SamplesPerPixel === 3 ||
(channelNames.length === 3 && metadata.Type === 'uint8') ||
(metadata.SizeC === 3 && channelNames.length === 1 && metadata.Interleaved);
const isInterleaved = metadata.Interleaved;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly copied from the old constructor. This function creates a GeoTIFF object, and derives all other constructor arguments (firstImage, metadata, dimensions) from that source. Then additional fields are parsed in the OMETiff constructor from metadata (which itself is derived from the original tiff?). I don't see how inheritance is helping us here.

Metadata are going to be completely different for each class so getMetadata isn't going to return the same thing (which is what you want in the first place).

@manzt
Copy link
Member

manzt commented Jan 5, 2021

The new base Tiff loader only uses "standard" methods and attributes, like dimensions and getRasterSize, and the OMETiffLoader only uses metadata for metadata-relevant things, like resolving size or formatting the metadata. There is no omexml just a generic metadata

It seems to me that we have just split the logic of the OMETiffLoader between itself and the base class. I currently struggling to see what is a core part of the loader and what OMETIFF extends that loader with. It seems like metadata is unique to the ometiff loader, or any other type of tiff for that matter, and yet this is a part of the base class?

@ilan-gold
Copy link
Collaborator Author

@manzt The idea here is to create a standardized interface for creating TiffLoader classes. It is possible that getImages and getRasterSize can be shared among all instances somehow, but I won't really know until I make a PR for creating a ImageJ Tiff loader. I didn't want to do both the abstraction of the class and the ImageJ Tiff loader in one PR (because the diff would be even larger probably) but perhaps that would help clarify why I am doing this.

@ilan-gold
Copy link
Collaborator Author

ilan-gold commented Jan 8, 2021

So I think you are correct that a lot of this is just reshuffled code, because the OmeTiffLoader was actually not so terrible to begin with (methods abstracted away into functions that are clearly reusable/generalizable because, as you pointed out, a lot of this is just reshuffling code) but I wanted to create something more generalizable for other formats.

@ilan-gold
Copy link
Collaborator Author

Closing this in favor of coming loader changes

@ilan-gold ilan-gold closed this Jan 11, 2021
@manzt manzt deleted the ilan-gold/refactor_tiff_loader branch July 29, 2022 19:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Default ColorMap from TIFF File
2 participants