-
Notifications
You must be signed in to change notification settings - Fork 93
Add raster index #846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add raster index #846
Conversation
Based on coordinate transform examples copied and adapted from pydata/xarray#9543.
Use a mapping of coordinate name(s) to underlying index(es) behind the hood.
I encountered pre-commit/pre-commit#1375 as I didn't have python 3.10 installed in my environment so pre-commit couldn't find it. I applied this suggestion: pre-commit/pre-commit#1375 (comment).
Create a RasterIndex if option use_raster_index is True.
| return False | ||
|
|
||
| return all( | ||
| index.equals(other._wrapped_indexes[k]) # type: ignore[arg-type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to relax the typing constraints of xarray.core.indexes.CoordinateTransformIndex.equals() (the other argument should be any xarray.Index) so we can remove the # type: ignore here.
| from xarray.core.indexing import IndexSelResult, merge_sel_results | ||
|
|
||
|
|
||
| class AffineTransform(CoordinateTransform): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the Affine* classes should live in Xarray. They are useful for both geo and bio-imaging domains at least (and possibly astro too).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree these are useful in many domains! The wrapped affine.Affine object is here restricted to 2D affine transformation of the plane so it might be limited for general usage, though. These classes are also thin wrappers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are "thin" in code but 'deep' in understanding of the Xarray model. That's why I think we should vendor it. Even so, vendoring will save a bunch of your time as you iterate on this since you wouldn't need to keep code in two places in-sync.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the whole content of raster_index.py including the RasterIndex class could (eventually) be moved into Xarray or some dedicated package? There's nothing IO specific and it could be useful in other downstream packages like odc-geo, geoxarray, etc.
| self.axis_transform = transform | ||
| self.dim = transform.dim | ||
|
|
||
| def isel( # type: ignore[override] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again here we should make the return type of xarray.core.Index.isel() more permissive (i.e., any Index object or None) so we can remove the # type: ignore comment.
| if new_indexes: | ||
| # TODO: if there's only a single PandasIndex can we just return it? | ||
| # (maybe better to keep it wrapped if we plan to later make RasterIndex CRS-aware) | ||
| return RasterIndex(new_indexes) | ||
| else: | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it is possible, it would be convenient to return a PandasIndex instead of a RasterIndex for reusing the selection results in further Xarray operations that require alignment with other Xarray objects. The reason is that Xarray alignment is based on strict Index types, e.g., it cannot compare or align a RasterIndex (encapsulating a single PandasIndex) with another PandasIndex.
However, we can only return a PandasIndex in a limited number of cases, e.g.,
- it is possible for
da.isel(x=[...], y=0)(orda.isel(x=0, y=[...])) to return a "x" (or "y") coordinate with a PandasIndex and to drop the index for the "y" (or "x") scalar coordinate. - it is not possible for
da.isel(x=[...], y=[...])to return both "x" and "y" coordinates each with a PandasIndex. This is becausexarray.Index.isel()doesn't currently allow returning multiple index objects.
Returning a PandasIndex in some cases and a RasterIndex in other cases would make the behavior complex and pretty confusing.
Alternatively, we could always return a RasterIndex for both "x" and "y" coordinates even when selecting it with scalar values. This would provide simple behavior and might facilitate CRS-aware operations even with scalar spatial coordinates, but this would be even less Xarray-alignment friendly.
So in short: it is not straightforward to choose the right behavior here. It is a trade-off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but this would be even less Xarray-alignment friendly.
I think this is fine. Users can reset_index once they are sure what they're about to do makes sense
|
This has been an in-progress MR for a while. Any plans to keep working on it? |
|
Yes, we landed on keeping the index separate since it's generally useful outside just rioxarray. Are you ok with adding another dependency for this feature? A few of us are meeting Monday morning 4/21 8.30 MT. Would you like to join? |
Sure, I would be happy to join. |
|
I'm closing this PR. RasterIndex has been implemented in https://github.com/dcherian/rasterix. See #855 for integration within Rioxarray. |
docs/history.rstfor all changes anddocs/rioxarray.rstfor new APIThis work in progress PR adds a custom
RasterIndexXarray index that is built on top of pydata/xarray#9543 (available in Xarray's main branch).rioxarray.set_option(use_raster_index=True)RasterIndexshould work with both x/y independent axis 1D coordinates (rectilinear affine transform with no rotation) vs. x/y dependent axis 2D coordinates. In the latter case data selection is more limited.RasterIndexare lazy (in both 1D and 2D cases): they wrap the affine transform.TODO on the Xarray Index API implementation (could also be done after this PR, if needed):
RasterIndex.from_variables: compute the affine transformation from explicit coordinate values, although maybe not very useful?RasterIndex.concat: possible to compute a new affine transformations by combining the transformations of multiple raster indexes?RasterIndex.join: like forconcatwe could perhaps combine the transformations in some simple cases (check that the two affine transforms are compatible)?RasterIndex.reindex_like: maybe this could be implemented using a forward transformation of the current index chained with a reverse transformation of the other index?RasterIndex.rename: must propagate the new coordinate and/or dimension names to the underlyingAffineTransformobjects!RasterIndex.copy(not sure we need to override Index.copy)RasterIndex.__getitem__(not sure we really need to implement it)Here is an example notebook that I'll update as I continue working on this PR.
There are a few issues that we need to fix in Xarray so this PR can work properly (I've added a few temporary workarounds when possible):
cc @scottyhq @dcherian @maxrjones