Issue on page /data_storage.html should we cover kerchunk? #49

Open
paolap opened this issue Feb 20, 2022 · 3 comments

Comments

@paolap
Contributor

paolap commented Feb 20, 2022

https://fsspec.github.io/kerchunk/

kerchunk is an interesting option for cloud-optimised storage of NetCDF, HDF and GRIB data. It seems to work as a virtual aggregation: it creates a single .zarr or .json reference file that points to all the individual files and presents them as a single dataset. I think it might also be indexing the actual chunks. Compared to Zarr on its own, there's no data duplication. Someone just mentioned it to me this morning and I had a quick look at the documentation.
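
To make the "reference file, no duplication" idea concrete, here is a rough standard-library sketch. It is not kerchunk's API: the helper names (`write_fake_file`, `build_refs`, `read_chunk`), the fake file contents, and the `temp/0` style keys are all made up for illustration. As far as I understand, real kerchunk references use a broadly similar JSON layout, with chunk keys mapped to a file location, byte offset and length, but treat that as an assumption.

```python
import json
import os
import tempfile


def write_fake_file(path, header, payload):
    """Write a stand-in for a NetCDF/HDF file: a header followed by one chunk."""
    with open(path, "wb") as f:
        f.write(header + payload)
    # The chunk starts right after the header.
    return len(header), len(payload)


def build_refs(files):
    """Map virtual chunk keys to [path, offset, length] in the original files."""
    refs = {}
    for i, (path, offset, length) in enumerate(files):
        refs[f"temp/{i}"] = [path, offset, length]
    return {"version": 1, "refs": refs}


def read_chunk(reference, key):
    """Fetch one chunk's bytes straight from the original file."""
    path, offset, length = reference["refs"][key]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Two hypothetical source files with headers of different sizes.
tmpdir = tempfile.mkdtemp()
files = []
for i, payload in enumerate([b"chunk-A", b"chunk-BB"]):
    path = os.path.join(tmpdir, f"file{i}.nc")
    offset, length = write_fake_file(path, b"HEADER" * (i + 1), payload)
    files.append((path, offset, length))

# The only new artefact is a small JSON file; the data itself is not copied.
reference = build_refs(files)
ref_path = os.path.join(tmpdir, "combined.json")
with open(ref_path, "w") as f:
    json.dump(reference, f)

with open(ref_path) as f:
    loaded = json.load(f)

first = read_chunk(loaded, "temp/0")
second = read_chunk(loaded, "temp/1")
print(first, second)
```

The point of the sketch is only that the combined "dataset" is an index of byte ranges, so reading a chunk means seeking into the original file rather than into a converted copy.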

@paigem
Contributor

paigem commented Feb 21, 2022

Great, this definitely looks like something we can add! I have heard some chatter about it in Pangeo circles, but hadn't actually taken the time to look at what it is just yet.

Is this something you want to add @paolap?

@paolap
Contributor Author

paolap commented Feb 21, 2022

I guess if no one has tried it already, I can have a go at using it before writing more about it.

@paigem
Contributor

paigem commented Feb 21, 2022

That would be great @paolap - sounds like you're already a step ahead of me in knowing what it is! But happy to help if needed.
