Issue on page /data_storage.html should we cover kerchunk? #49

paolap · 2022-02-20T23:57:08Z

https://fsspec.github.io/kerchunk/

kerchunk is an interesting option for cloud-optimised storage of NetCDF, HDF and GRIB data. It seems to work more as a virtual aggregation: it creates a single .zarr or .json reference file that points to all the individual files and presents them as a single dataset. I think it might also be indexing the actual chunks. Compared to Zarr on its own, there's no data duplication. Someone just mentioned it to me this morning and I had a quick look at the documentation.
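
To make the "reference file" idea concrete, here is a toy sketch using only the Python standard library. It is not kerchunk's API: `build_reference` and `read_chunk` are hypothetical helpers, and the JSON layout (a chunk key mapped to a file path, byte offset and byte length) is an assumption for illustration rather than a statement of kerchunk's actual file format. It only shows why a file of pointers can present several source files as one dataset without copying any data.

```python
import json
import os
import tempfile


def build_reference(chunks):
    """Build a reference mapping: chunk key -> [file path, byte offset, byte length]."""
    return {"version": 1, "refs": {key: list(loc) for key, loc in chunks.items()}}


def read_chunk(reference, key):
    """Read one chunk's bytes directly from the original file it lives in."""
    path, offset, length = reference["refs"][key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Two stand-in "source files", each with a 4-byte header followed by one data chunk.
tmpdir = tempfile.mkdtemp()
paths = []
for i, payload in enumerate([b"chunk-A", b"chunk-B"]):
    path = os.path.join(tmpdir, f"file{i}.bin")
    with open(path, "wb") as f:
        f.write(b"HDR!" + payload)
    paths.append(path)

# One reference file presents both source files as a single dataset.
# The key names ("temp/0", "temp/1") are made up for this example.
reference = build_reference({
    "temp/0": (paths[0], 4, 7),
    "temp/1": (paths[1], 4, 7),
})
ref_path = os.path.join(tmpdir, "combined.json")
with open(ref_path, "w") as f:
    json.dump(reference, f)

# A reader only needs the small JSON file to locate every chunk.
with open(ref_path) as f:
    loaded = json.load(f)
print(read_chunk(loaded, "temp/0"), read_chunk(loaded, "temp/1"))
```

The source files are never rewritten here; only the small JSON index is new, which is the "no data duplication" point above.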

paigem · 2022-02-21T00:01:23Z

Great, this definitely looks like something we can add! I have heard some chatter about it in Pangeo circles, but hadn't actually taken the time to look at what it is just yet.

Is this something you want to add @paolap?

paolap · 2022-02-21T01:53:24Z

I guess if no one has tried it already, I can have a go at using it before writing more about it.

paigem · 2022-02-21T02:34:15Z

That would be great @paolap - sounds like you're already a step ahead of me in knowing what it is! But happy to help if needed.
