Support Aggregation serialization in pylibcudf #17469
base: branch-25.02
The CI errors from custreamz look related to this PR, but I don't think they actually are. I can reproduce them locally only with upstream Dask/Distributed, but not with
cdef correlation_aggregation *correlation_cast
cdef covariance_aggregation *covariance_cast

if self.kind() is Kind.SUM:
Can we replace this cascade with a singledispatch helper function? It's a lot of clauses to fall through in the worst case.
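For illustration only, here is a rough sketch of what a `functools.singledispatch` helper could look like. Note that `singledispatch` dispatches on the type of the first argument, so this assumes a distinct Python type per aggregation kind; if only an enum value such as `Kind.SUM` is available, a dict keyed by kind would be the analogous lookup. The classes `Sum`, `Quantile`, and `Correlation` and the function `agg_state` are hypothetical stand-ins, not pylibcudf names.

```python
from dataclasses import dataclass
from functools import singledispatch


# Hypothetical stand-ins for aggregation types; these are NOT pylibcudf classes.
@dataclass
class Sum:
    pass


@dataclass
class Quantile:
    quantiles: list
    interpolation: str


@dataclass
class Correlation:
    method: str
    min_periods: int


@singledispatch
def agg_state(agg):
    """Return a (name, args) tuple describing how to rebuild `agg`."""
    # Fallback for any type without a registered handler.
    raise TypeError(f"unsupported aggregation: {type(agg).__name__}")


@agg_state.register
def _(agg: Sum):
    # No parameters are needed to rebuild a sum aggregation.
    return ("sum", ())


@agg_state.register
def _(agg: Quantile):
    return ("quantile", (tuple(agg.quantiles), agg.interpolation))


@agg_state.register
def _(agg: Correlation):
    return ("correlation", (agg.method, agg.min_periods))
```

Each handler is looked up by type rather than reached by falling through a chain of `if` clauses, and supporting a new kind means registering one more function.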
@@ -170,3 +174,66 @@ cdef extern from "cudf/aggregation.hpp" namespace "cudf" nogil:
        null_policy null_handling,
        null_order null_precedence,
        rank_percentage percentage) except +libcudf_exception_handler


cdef extern from "cudf/detail/aggregation/aggregation.hpp" \
        namespace "cudf::detail" nogil:
Hmm, we expose almost no cudf detail APIs to pylibcudf, and I don't want to start here. Can we open an issue about these? If these attributes are absolutely necessary to reconstruct the serialized types, then we should discuss exposing them publicly in libcudf.
Thanks for the comment, Vyas. I've opened #17630. Please let me know if there's anything else I can do.
Description
Support Aggregation serialization in pylibcudf. This is required to provide distributed support for cudf-polars.
Unfortunately, serialization of the Aggregation class requires access to implementation details in libcudf, so those needed to be exposed to Cython. Fortunately, all the attributes required are already public, so there are no changes required in libcudf.
Checklist