Skip to content

feat(optimizer): simplify filter predicate before converting row_number + filter to topn #22295

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 20, 2025

Conversation

stdrc
Copy link
Member

@stdrc stdrc commented Jun 19, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Note

This PR is prompt-engineered by @stdrc, coded by Cursor Agent (Claude Sonnet 4), reviewed by @stdrc.

Convert filter predicate like row_number - 1 = 0 to row_number = 1 so that such queries can be optimized to using TopN operator. Fix #22290.

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • I have checked the Release Timeline and Currently Supported Versions to determine which release branches I need to cherry-pick this PR into.

Documentation

  • My PR needs documentation updates.
Release note

Copy link
Member Author

stdrc commented Jun 19, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@stdrc stdrc changed the title simplify filter predicate before converting row_number + filter to topn feat(optimizer): simplify filter predicate before converting row_number + filter to topn Jun 19, 2025
@github-actions github-actions bot added type/feature Type: New feature. and removed Invalid PR Title labels Jun 19, 2025
@stdrc stdrc marked this pull request as ready for review June 19, 2025 16:57
@stdrc stdrc requested review from xxchan, kwannoel and Copilot June 19, 2025 17:07
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR simplifies filter predicates with arithmetic on window functions to enable TopN optimizations by converting expressions like (row_number - 1) = 0 to row_number = 1. Key changes include:

  • Adding methods in OverWindowToTopNRule to simplify arithmetic expressions within filter conditions.
  • Introducing FilterArithmeticRewriter that rewrites the arithmetic comparison expressions.
  • Updating test cases to validate the new optimization patterns.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
src/frontend/src/optimizer/rule/over_window_to_topn_rule.rs Added logic to simplify arithmetic expressions in filter predicates and refactored expression rewriting
src/frontend/planner_test/tests/testdata/output/over_window_function.yaml Updated expected logical and optimized plans for window function predicate optimization
src/frontend/planner_test/tests/testdata/input/over_window_function.yaml Added new test cases validating the arithmetic simplification on window functions

@stdrc stdrc requested a review from chenzl25 June 19, 2025 17:20
impl OverWindowToTopNRule {
/// Simplify arithmetic expressions in filter conditions before TopN optimization
/// For example: (`row_number` - 1) = 0 -> `row_number` = 1
fn simplify_filter_arithmetic(&self, filter: &LogicalFilter) -> Option<LogicalFilter> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe simplify_comparison_arithmetic?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplify_filter_arithmetic is called for Filter node with no brain, I guess the name works.

LogicalProject { exprs: [t.a, t.b, t.c, $expr1] }
└─LogicalFilter { predicate: ($expr1 = 2:Int32) }
└─LogicalProject { exprs: [t.a, t.b, t.c, (row_number + 1:Int32) as $expr1] }
└─LogicalOverWindow { window_functions: [row_number() OVER(PARTITION BY t.a ORDER BY t.b ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)] }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential for future optimization? Seems like rn_plus_one is always 2. I guess this can just be a project

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, but I guess if the condition is like this one, the user won't want to keep the row number in the final select list.

Signed-off-by: Richard Chien <[email protected]>
@stdrc stdrc enabled auto-merge June 20, 2025 06:09
@stdrc stdrc disabled auto-merge June 20, 2025 06:10
@stdrc stdrc enabled auto-merge June 20, 2025 06:15
@stdrc stdrc added this pull request to the merge queue Jun 20, 2025
Merged via the queue into main with commit 12da00c Jun 20, 2025
29 of 31 checks passed
@stdrc stdrc deleted the rc/support-ibis-row-number-topn branch June 20, 2025 07:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/feature Type: New feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ibis row_number() generating RW row_number() - 1 prevents RW from optimizing OverWindow to GroupTopN
3 participants