Refactor groupby aggregation, removing simple_aggregations_collector and aggregation_finalizer classes
#21064
Pull request overview
This PR refactors the groupby aggregation system by removing visitor pattern base classes (simple_aggregations_collector and aggregation_finalizer) and replacing them with template functors that use explicit specializations. This eliminates boilerplate code from aggregation class definitions and localizes special handling logic.
Changes:
- Removed visitor pattern infrastructure and virtual methods from aggregation classes
- Introduced template functors with specializations for preprocessing and finalizing aggregations
- Replaced visitor dispatch with `cudf::detail::aggregation_dispatcher()` calls
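To make the functor-with-specializations idea concrete, here is a minimal sketch. Everything in it is hypothetical and not taken from the PR: the `Kind` enum, the `dispatch()` helper (a stand-in written in the spirit of `cudf::detail::aggregation_dispatcher()`, not its real signature), and `describe_fn`. It only illustrates how a generic template handles most kinds while an explicit specialization handles the one kind that needs special treatment.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an aggregation kind enum; not the libcudf definition.
enum class Kind { SUM, COUNT, MEAN };

// Hypothetical functor: the generic template covers every kind without special needs.
struct describe_fn {
  template <Kind k>
  std::string operator()() const
  {
    return "simple";
  }
};

// Explicit specialization written only for the kind that needs special handling.
template <>
inline std::string describe_fn::operator()<Kind::MEAN>() const
{
  return "compound: SUM + COUNT";
}

// Hypothetical dispatcher: turns a runtime Kind into a compile-time template argument.
template <typename F>
std::string dispatch(Kind k, F&& f)
{
  switch (k) {
    case Kind::SUM: return f.template operator()<Kind::SUM>();
    case Kind::COUNT: return f.template operator()<Kind::COUNT>();
    case Kind::MEAN: return f.template operator()<Kind::MEAN>();
  }
  throw std::logic_error("unsupported aggregation kind");
}
```

With this shape, `dispatch(Kind::MEAN, describe_fn{})` selects the specialization, and every other kind falls through to the generic template, so the aggregation classes themselves need no virtual hooks.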
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| cpp/src/rolling/detail/rolling.cuh | Refactored rolling window preprocessing/postprocessing from visitor classes to template functors with specializations |
| cpp/src/groupby/hash/hash_compound_agg_finalizer.hpp | Changed from visitor class to context struct plus template functor with specialization declarations |
| cpp/src/groupby/hash/hash_compound_agg_finalizer.cu | Implemented template functor specializations for compound aggregation finalization |
| cpp/src/groupby/hash/extract_single_pass_aggs.cpp | Replaced visitor class with template functor for collecting simple aggregations |
| cpp/src/groupby/hash/compute_groupby.cu | Updated to use new functor-based dispatch instead of visitor pattern |
| cpp/src/aggregation/aggregation.cpp | Removed visitor pattern base class implementations (400+ lines of boilerplate) |
| cpp/include/cudf/detail/aggregation/aggregation.hpp | Removed visitor base class declarations and virtual method declarations from aggregation classes |
| cpp/include/cudf/aggregation.hpp | Removed forward declarations of visitor classes and virtual method declarations from base aggregation class |
Signed-off-by: Nghia Truong <[email protected]>
9bd8fc5 to 923bc25
Signed-off-by: Nghia Truong <[email protected]> # Conflicts: # cpp/include/cudf/detail/aggregation/aggregation.hpp
…derived classes Signed-off-by: Nghia Truong <[email protected]>
Signed-off-by: Nghia Truong <[email protected]>
    /**
     * @brief No-parameter constructor.
     *
     * This constructor is never called anywhere, and should never be called at all. However, it
     * cannot be declared as deleted due to the usage of CRTP (Curiously Recurring Template
     * Pattern) helper to automatically implement `clone()` method for derived aggregation classes. As
     * such, definition of this constructor is just to satisfy the compiler requirements.
     */
    aggregation() : kind{static_cast<Kind>(-1)}
    {
      CUDF_FAIL("No-parameter aggregation constructor should never be called");
    }
This was not an issue before this PR, which removes all virtual function implementations from the most-derived classes. Now only the middle base class (`derive_from`) has virtual function implementations, so the compiler requires a default constructor for it, which in turn requires calling the default constructor of `aggregation`.
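The shape of this situation can be sketched as follows. This is an illustration under assumptions, not the libcudf code: it assumes `aggregation` is inherited virtually, uses a simplified single-parameter `derive_from`, and substitutes a plain exception for `CUDF_FAIL`. It does not reproduce the compiler error; it only shows a CRTP helper that implements `clone()` while the most-derived class initializes the base, so the throwing no-parameter constructor is declared but never executed.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical, simplified kinds; not the libcudf enum.
enum class Kind { SUM, MEAN };

struct aggregation {
  Kind kind;

  explicit aggregation(Kind k) : kind{k} {}

  // Present only so that intermediate classes can be compiled; never meant to run.
  aggregation() : kind{static_cast<Kind>(-1)}
  {
    throw std::logic_error("No-parameter aggregation constructor should never be called");
  }

  virtual ~aggregation() = default;
  virtual std::unique_ptr<aggregation> clone() const = 0;
};

// Hypothetical CRTP helper: implements clone() once for every derived class.
// Virtual inheritance here is an assumption made for this sketch.
template <typename Derived>
struct derive_from : virtual aggregation {
  std::unique_ptr<aggregation> clone() const override
  {
    return std::make_unique<Derived>(static_cast<Derived const&>(*this));
  }
};

// The most-derived class initializes the virtual base itself, so the
// no-parameter aggregation constructor is not run when it is constructed.
struct sum_aggregation final : derive_from<sum_aggregation> {
  sum_aggregation() : aggregation{Kind::SUM} {}
};
```

Constructing a `sum_aggregation` and calling `clone()` on it yields another `sum_aggregation` with `kind == Kind::SUM`, without ever reaching the throwing constructor.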
This PR removes the `simple_aggregations_collector` and `aggregation_finalizer` visitor-pattern classes and replaces them with template functors specialized only for aggregation types requiring special handling.
Changes
- Removed the `simple_aggregations_collector` and `aggregation_finalizer` base classes
- Removed the `get_simple_aggregations()` and `finalize()` virtual methods from all aggregation classes
- `simple_aggregation_collector_fn` – decomposes compound aggregations (MEAN→SUM+COUNT, etc.)
- `hash_compound_agg_finalizer_fn` – postprocesses compound aggregation results
- `rolling_preprocessor_fn` / `rolling_postprocessor_fn` – rolling window aggregation operators
- Dispatches through `cudf::detail::aggregation_dispatcher()` with these functors
Benefits
Closes #21059.
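As a rough illustration of the MEAN→SUM+COUNT decomposition named in the description, the sketch below collects the two simple aggregations in one pass and then finalizes the compound result. It is a host-side toy using `std::map`; the names `partial`, `single_pass`, and `finalize_mean` are hypothetical, and nothing here reflects how libcudf implements this on the GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Per-group results of the two simple aggregations that MEAN decomposes into.
struct partial {
  double sum        = 0.0;
  std::size_t count = 0;
};

// Single pass over the input: accumulate SUM and COUNT for every key.
inline std::map<int, partial> single_pass(std::vector<int> const& keys,
                                          std::vector<double> const& values)
{
  assert(keys.size() == values.size());
  std::map<int, partial> out;
  for (std::size_t i = 0; i < keys.size(); ++i) {
    auto& p = out[keys[i]];
    p.sum += values[i];
    ++p.count;
  }
  return out;
}

// Finalization step: derive the compound MEAN from the simple results.
inline std::map<int, double> finalize_mean(std::map<int, partial> const& partials)
{
  std::map<int, double> means;
  for (auto const& [key, p] : partials) {
    means[key] = p.sum / static_cast<double>(p.count);
  }
  return means;
}
```

For keys `{1, 2, 1, 2, 2}` and values `{2.0, 1.0, 4.0, 2.0, 3.0}`, group 1 accumulates a sum of 6 over 2 rows and group 2 a sum of 6 over 3 rows, so the finalized means are 3.0 and 2.0.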