-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mosaic multiple rasters #4
Comments
@TomAugspurger - to get you started the command line GDAL approach is typically to use https://gdal.org/programs/gdalbuildvrt.html (or rio merge https://rasterio.readthedocs.io/en/latest/api/rasterio.merge.html) , which have a few common built-in options for overlapping pixel selection rules (first, last, min, max) If you want some sort of custom pixel selection scheme a python script is the way to go. I really like Xarray syntax for creating mosaics. See this notebook where I load a stack of many swath images into an xarray dataarray, then extract a monthly mean mosaic (https://github.com/scottyhq/sentinel1-rtc/blob/main/Sentinel1-RTC-example.ipynb), in particular: And ongoing discussion in intake-stac to make this process simpler intake/intake-stac#65 |
Thanks for sharing that @scottyhq, it's very helpful. Question about building the vrt. I don't see the code for generating the file with the list of paths
and then using the CLI like you did, or Python API?
Just using the keys |
Thanks very much for all this work... state scale sentinel mosaics are what I am aiming for. Starting with a median I think. |
So starting with a STAC of sentinel 2 cogs - get catalog info and a dataframe to get the datetimes |
Something i have been wondering being new to this, does computing a list of datasets try to bring all that into memory? |
@TomAugspurger - if trying to stack rasters (typically time - creating dims {x,y,t,band}) - I have used three approaches with STAC collections (alluded to in @scottyhq post above).
I think this approach would also work with spatial mosaics as well (possibly using pixel functions to mosaic overlapping rasters), but that is something I have not yet worked on. I would be interested to hear more about the solution/workflow you develop for mosaicing/stacking. I think this is a common bottlekneck in working with EO data (Landsat, Sentinel, Aster, etc....). Adding this sort of functionality to intake-stac would be great, but I have not been able to carve out any time to work on that yet. |
Thanks for sharing @rmg55. I'll be sure to update this thread once we get a design together. I'm glad to hear about creating a VRT from the Catalog items. I've been pleased with building VRTs from the COGs themselves, but as you say this can be a bit slow. |
Quick status update: I have a prototype for building a VRTs from a STAC query (thanks for the idea and code @rmg55). This VRT can then be fed into This approach can be much faster than using I'll share more soonish. |
Here's the repo: https://github.com/stac-utils/stac-vrt and docs: https://stac-vrt.readthedocs.io/en/latest. Pretty rough right now (likely only works for NAIP STAC items / COGs in Azure), but happy to hack on it with people. |
Wow - this looks great @TomAugspurger. Will dig into it deeper and see if/how i might be able to contribute. I think this has the potential to really improve a lot of existing workflows. Seems like getting existing/new catalogs to incorporating the STAC projection extension would really help this effort. |
Hey folks! Haven't met many of you before, but I've been tinkering with this question as well, of how to go from STAC items to a ndarray-ish stack of imagery. Sorry for not getting in touch sooner! So I came up with this library: https://github.com/gjoseph92/stackstac (docs at https://stackstac.readthedocs.io). The overall idea is to generate a dask array representing the whole stack of imagery—only using the STAC metadata, to avoid the latency of opening datasets. But I've also added a bit of logic to:
There's a ton of overlap here with @TomAugspurger's work on |
Thanks, sounds interesting - I will take a look! |
Thanks for sharing @gjoseph92! Making a "dask native" way of doing this was on my list too, as these VRTs can get somewhat large. I'll take a closer look later. |
@gjoseph92 - awesome project and agreed there is a lot of overlap. Hopefully I will have some more time to explore the project over the next week. I had some quick questions that I was hoping you might be able to clarify:
|
@rmg55 thanks for the questions!
Not quite—I don't use But overall, it's actually okay if it's a little inaccurate, because we're just estimating a good output transform. When actually reading the data, GDAL will still use whatever geotrans is set on the dataset, then warp it to the output transform we picked. So it's never "inaccurate", per se (pixels will always be correctly georeferenced), but it's true that the output resolution might not always match the native resolution of the data by default. But since getting things to a common grid is the goal anyway, that's kind of the point. And if you have a specific resolution and CRS you want, you're always free to set that too.
I opened gjoseph92/stackstac#1 to discuss this, but the short answer is: you can instead do the mosaicking with some |
Thanks for opening this up @gjoseph92, I had a brief look at stackstac (love the name by the way, and couldn't help thinking the palindrome stackcats would also be fun :) I'm really impressed by the docs and appreciate your links to the dask scaling issues. There is definitely a lot of overlap, but to be honest I think you're a fair ways ahead on the path that we've just been starting down! Will definitely be good to think about how to consolidate effort and collaborate where possible on this, I think it's a really exciting time to push these tools forward with STAC1.0 coming out and more and more COGs+STAC available to make the most of. |
Anyone else interested in joining a call to hash out next steps? I feel like we've had a good amount of exploration in the area of STAC + xarray + Dask, but we might be due for some consolidation, or at least understanding where different projects have different visions or different approaches. If you are interested, give a +1 to this and I'll send out an invite sometime later this week. |
cc @snowman2 as well: #4 (comment), if you're interested in joining a call on this topic. #4 (comment) is a good summary. I'll send out a doodle poll tomorrow to find a time for next week. I suspect we'll cover STAC + Dask + xarray first (and stac-vrt / stackstac specifically). But we'll inevitably spill into larger operations so it'd be good to have someone from rioxarray and xarray-spatial there. (and maybe @jorisvandenbossche or someone else from geopandas, if we get into vector / raster operations? Sounds like a full meeting 😄) |
I sent the following email to people who indicated interest. Anyone is welcome to join. Link to the doodle poll: https://doodle.com/poll/kpz9rtrqmk4tiqy2. I think we have a pretty wide range of timezones, but I gave options in hourly increments for the US working day for next week. I think we'll use an hour productively, and people of course are welcome to come and go. Here's a link to a proposed agenda: https://docs.google.com/document/d/1IXnk2fvpjWaRZHAwZ1i_d6l0B7iIvV_6Kmjy2IXK5Vo/edit?usp=sharing. Feel free to add other items. |
ping @dcherian and @JessicaS11 would be great if you have availability to join! |
When folks converge on a solution for reading STAC rasters into xarray Datasets, I'd like to showcase that solution in this lesson: https://github.com/carpentries-incubator/geospatial-python |
👋 I'm a bit late here, but Mosaic (+ STAC) is something we have worked a lot in the last years. rio-tiler (which I'm one of the core dev) has native support for mosaic (and for STAC), you may found this example interesting https://cogeotiff.github.io/rio-tiler/examples/Using-rio-tiler-mosaic/ On the mosaic side, we have developed a simple spec: https://github.com/developmentseed/mosaicjson-spec. The spec was made to support dynamic tiling, but help created https://github.com/developmentseed/cogeo-mosaic which can in fact work for other purpose than map tiles but we recently introduced the concept of rio-tiler/cogeo-mosaic are just one solution for some specific problem and it's nice to see other tooling begin developed on this subject. |
Looks like Monday the 22nd at 1:00 US Central time worked for the most people (reminder: the US just had a daylight savings time transition). I sent an invite to people whose email I had. If you're interested in joining send me an email at [email protected] and I'll add you, or just show up then. Agenda: https://docs.google.com/document/d/1IXnk2fvpjWaRZHAwZ1i_d6l0B7iIvV_6Kmjy2IXK5Vo/edit?usp=sharing |
I updated #4 (comment) with a link to the video call. |
Thanks to everyone who joined. The agenda doc has some detailed notes for those who missed: https://docs.google.com/document/d/1IXnk2fvpjWaRZHAwZ1i_d6l0B7iIvV_6Kmjy2IXK5Vo/edit?usp=sharing Some takeaways:
Anything else to add? |
Hi all, First time joining a github conversation (!) but just thought I'd jump in as a way to keep an eye on where this goes. Very interested as I've been banging my head against the wall trying to use these tools to mosaic rasters for a while now. Thanks for everyone's contributions to it! |
Median composites...no median in dask, how is this working? A 'make me one of these' from whatever generic or user function mean, median, mode, hdstats once parallelized et al would be beneficial? |
Improved performance...eg stackstac |
Plotting of large outputs/dimensions |
Thanks for the notes for the middle of the night people |
Thanks, @TomNicholas to bring the issue here. A little bit more context, for an institutional reason I'm not able to use the Synergise/AWS products. This brings me to deal with the overlapping issue that has been already "solved" (with an average ? I've no info about this)in the AWS product.
Has been more accurately described here |
@pl-marasco see gjoseph92/stackstac#1 for a quick mosaic implementation (not quite what you're looking for, since it doesn't take coordinates into account, but may work).
stackstac may be helpful for this task in general. |
Yes, don't think I could have done the SA dataset without an AWS (GCS/Azure) or whatever machine with 300+ GB of RAM. |
Hi, Just passing through looking for answer to something but thought I'd mention this in case it is relevant/of interest. |
Updated location: https://github.com/opendatacube/odc-geo |
I'm investigating the best way to mosiac multiple arrays into a single array.
https://desktop.arcgis.com/en/arcmap/10.3/manage-data/raster-and-images/what-is-a-mosaic.htm has a nice description. I'll follow up with what I learn. If anyone has additional resources I'd love to hear about them.
The text was updated successfully, but these errors were encountered: