Best Practices guide for creation of good GeoParquet files (focused on distribution) #254
This is great, and I'd love to see it merged at some point.
Curious, what's needed to merge this PR? To me it's looking really nice.
Sorry, it's felt like 90% done for way too long; I just haven't found the time to polish it off.
I was hoping to cover more tools, and that held me up. Then I wanted to review the Sedona one, as it wasn't quite the way I wanted it to look, and I think DuckDB needs a bit more work. I was also hoping to add geoparquet-tools as an option that does everything right, but I need to rename it to get a pip release. But clearly I just need to cut scope and ship. I'll try to get it in a ready state sometime this week. It's been too long for sure.
Co-authored-by: Dewey Dunnington <[email protected]>
Just took a read through for grammar and typos. All optional fixes from me...looks great!
## Spatial Partitioning

Most tools don't yet provide any way to do automatic spatial partitioning across files, when you have larger datasets.
I'll rejig this example to look more like the other examples...the other examples actually are doing spatial partitioning, too, they're just repurposing a non-spatial mechanism (sorting and file rotation) to do so.
(after this merges!)
Sounds good - and yes, feel free to update the language on how we describe it.
Thanks for the cleanup; I accepted all the changes. Will merge now.
Attempt to pull together recommendations / best practices as discussed in #251.
More work needed, feedback / help is very welcome. Likely more to discuss to get the recommendations right, but wanted to put up something for people to react to.