planner: lateral join support | tidb-test=pr/2664 #65656
base: master
Hi @terry1purcell. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Pull request overview
This PR adds initial support for LATERAL joins in TiDB's query planner. LATERAL allows derived tables (subqueries) to reference columns from tables that appear to their left in the FROM clause, enabling correlated references in join contexts. The PR is marked as "WIP" (Work In Progress).
Changes:
- Added LATERAL keyword parsing support in the SQL parser
- Implemented LATERAL join planning using LogicalApply nodes
- Prevented join reordering for LATERAL joins to preserve execution dependencies
- Added error handling for invalid LATERAL join usage
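The overview describes LATERAL as letting a derived table read columns from tables to its left. A minimal Go sketch of that per-outer-row evaluation follows; `Row`, `lateralJoin`, and `series` are hypothetical names for illustration and are not TiDB's executor or planner types.

```go
package main

import "fmt"

// Row is a hypothetical stand-in for a tuple: column name -> value.
type Row map[string]int

// merge returns a new row holding the columns of a and b.
func merge(a, b Row) Row {
	r := Row{}
	for k, v := range a {
		r[k] = v
	}
	for k, v := range b {
		r[k] = v
	}
	return r
}

// lateralJoin evaluates inner once per outer row, the way an Apply-style
// operator would. With leftOuter set, an outer row without inner matches is
// still emitted; real SQL would NULL-pad the inner columns, this sketch
// simply leaves them absent.
func lateralJoin(outer []Row, inner func(Row) []Row, leftOuter bool) []Row {
	var out []Row
	for _, o := range outer {
		matches := inner(o) // inner may read columns of o: the correlation
		if len(matches) == 0 && leftOuter {
			out = append(out, merge(o, nil))
			continue
		}
		for _, m := range matches {
			out = append(out, merge(o, m))
		}
	}
	return out
}

// series is a hypothetical correlated derived table: it produces o["k"] rows
// whose values depend on o["id"].
func series(o Row) []Row {
	var rows []Row
	for i := 1; i <= o["k"]; i++ {
		rows = append(rows, Row{"v": o["id"]*10 + i})
	}
	return rows
}

func main() {
	outer := []Row{{"id": 1, "k": 2}, {"id": 2, "k": 0}}
	fmt.Println(len(lateralJoin(outer, series, false)))
	fmt.Println(len(lateralJoin(outer, series, true)))
}
```

Because the inner side depends on the current outer row, swapping the two sides changes the meaning, which is the dependency the join-reorder change is meant to preserve.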
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| test_lateral_simple.go | Basic smoke test for LATERAL parsing and restoration |
| pkg/parser/parser.y | Added LATERAL keyword and grammar rule for LATERAL derived tables |
| pkg/parser/misc.go | Registered LATERAL token in keyword map |
| pkg/parser/ast/dml.go | Added Lateral boolean field to TableSource with Restore support |
| pkg/parser/mysql/errcode.go | Added ErrInvalidLateralJoin error code (3809) |
| pkg/errno/errcode.go | Added ErrInvalidLateralJoin constant |
| pkg/errno/errname.go | Added error message for invalid LATERAL usage |
| pkg/util/dbterror/plannererrors/planner_terror.go | Registered ErrInvalidLateralJoin in planner errors |
| pkg/planner/core/logical_plan_builder.go | Implemented buildLateralJoin using LogicalApply, disabled join reorder for LATERAL |
| pkg/planner/core/rule_join_reorder.go | Prevented reordering of LogicalApply nodes used by LATERAL |
| DEPS.bzl | Dependency updates (not reviewed in detail) |
| pkg/parser/parser.go | Generated parser code (not reviewed) |
Codecov Report
❌ Patch coverage is
Coverage Diff
| Metric | master | #65656 | +/- |
|---|---|---|---|
| Coverage | 77.7655% | 79.3589% | +1.5933% |
| Files | 2001 | 1949 | -52 |
| Lines | 545901 | 537589 | -8312 |
| Hits | 424523 | 426625 | +2102 |
| Misses | 119716 | 109527 | -10189 |
| Partials | 1662 | 1437 | -225 |
This file is auto-generated by genkeyword and needs to be updated when new keywords are added to misc.go. The change adds LATERAL to the reserved keywords list. Also includes formatting changes to logical_plan_builder.go from go fmt alignment.
Pull request overview
Copilot reviewed 14 out of 15 changed files in this pull request and generated 5 comments.
Pull request overview
Copilot reviewed 15 out of 16 changed files in this pull request and generated 8 comments.
Update keyword counts in tests to reflect the addition of the LATERAL keyword:
- Total keywords: 673 → 674
- Reserved keywords: 232 → 233
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Add comprehensive integration tests covering:
- Basic LATERAL join with comma syntax
- LATERAL with ORDER BY and LIMIT
- LATERAL with aggregates
- LEFT JOIN LATERAL
- LATERAL with recursive CTEs
- Multiple left-side tables
- Nested subqueries in LATERAL
- LATERAL without correlation
- RIGHT JOIN with LATERAL error (3809)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <[email protected]>
Regenerate errors.toml to include the new ErrInvalidLateralJoin error code (3809) added for LATERAL join support. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Change error code validation to use ErrInvalidLateralJoin.Equal(err) instead of comparing expectedErrorCode against the static error code. This ensures the test actually validates the returned error rather than just checking if expectedErrorCode equals 3809. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
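The difference described in that commit message can be sketched in plain Go. This is an illustration under stated assumptions: `codedError`, `buildPlan`, `weakCheck`, and `strongCheck` are hypothetical, and the standard library's `errors.Is` stands in for the `Equal` method mentioned above.

```go
package main

import (
	"errors"
	"fmt"
)

// codedError is a hypothetical stand-in for a coded planner error; it is not
// TiDB's error type.
type codedError struct {
	code int
	msg  string
}

func (e *codedError) Error() string { return fmt.Sprintf("[%d] %s", e.code, e.msg) }

// Is lets errors.Is match two codedError values by code alone.
func (e *codedError) Is(target error) bool {
	t, ok := target.(*codedError)
	return ok && t.code == e.code
}

var errInvalidLateralJoin = &codedError{code: 3809, msg: "invalid use of LATERAL"}

// buildPlan is a hypothetical function under test.
func buildPlan(rightJoinLateral bool) error {
	if rightJoinLateral {
		return fmt.Errorf("build join: %w", &codedError{code: 3809, msg: "RIGHT JOIN with LATERAL"})
	}
	return nil
}

// weakCheck compares two constants and never inspects err, so it passes even
// when no error was returned.
func weakCheck(err error, expectedCode int) bool {
	return expectedCode == errInvalidLateralJoin.code
}

// strongCheck inspects the error that was actually returned.
func strongCheck(err error) bool {
	return errors.Is(err, errInvalidLateralJoin)
}

func main() {
	err := buildPlan(false) // nil: the statement was accepted
	fmt.Println(weakCheck(err, 3809), strongCheck(err))
}
```

With a nil error the weak check still reports success, while the strong check does not, which is the gap the commit closes.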
Pull request overview
Copilot reviewed 22 out of 23 changed files in this pull request and generated 2 comments.
Pull request overview
Copilot reviewed 26 out of 27 changed files in this pull request and generated 1 comment.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Pull request overview
Copilot reviewed 27 out of 28 changed files in this pull request and generated no new comments.
@pantheon-bot please review.
P1: The Evidence: Fix: Add

P1: Optimizer can mis-decorrelate correlated LATERAL Apply into plain Join
The decorrelation logic in Evidence:
Fix: Extract correlated columns against the same schema used for name resolution (FullSchema when present), or ensure redundant columns are included in the extraction schema

P2: FullSchema/CorCols extraction mismatch causes NULL values for redundant columns
When building a LATERAL join where the left side is a Evidence:
Fix: Align CorCols extraction with the outer scope used for name resolution

P2: Nested LATERAL loses visibility of redundant columns when left becomes LogicalApply
When building nested LATERAL joins, if the left plan is a Evidence: Fix: Extend the FullSchema check to also handle

P2: LogicalApply statistics assume 1 row per outer row
Evidence: Fix: For
Pull request overview
Copilot reviewed 27 out of 28 changed files in this pull request and generated 3 comments.
Co-authored-by: Copilot <[email protected]>
@pantheon-bot please review
Code Review Results
P1 Issues (Correctness - Wrong Results)
Issue 1: Correlated columns dropped during pruning for LATERAL over USING/NATURAL joins
Issue 2: Nested LATERAL loses correlation when outer side is LogicalApply
P2 Issues (Quality/Diagnostics)
Issue 3: LATERAL outer-scope resolution breaks with wrapped left side
Issue 4: LATERAL LEFT JOIN stats under-estimation
Issue 5: ONLY_FULL_GROUP_BY error formatting regression
Summary: Identified 2 P1 correctness issues and 3 P2 quality issues. The P1 issues involve correlated column handling in LATERAL joins with USING/NATURAL - both can silently produce wrong results (NULLs instead of correlated values) in specific patterns. Recommend fixing these before merge.
Pull request overview
Copilot reviewed 28 out of 29 changed files in this pull request and generated 2 comments.
// Clone output names before modifying to avoid mutating shared structs
if x.AsName.L != "" {
	clonedNames := make([]*types.FieldName, len(p.OutputNames()))
	for i, name := range p.OutputNames() {
		if name.Hidden {
			clonedNames[i] = name
			continue
		}
		// Clone the field name and update table name
		// For derived tables (subqueries), clear DBName to avoid confusion with actual tables
		// For base tables with aliases, preserve DBName for proper metadata and DEFAULT() resolution
		dbName := ast.NewCIStr("")
		if isTableName {
			dbName = name.DBName
		}
		clonedNames[i] = &types.FieldName{
			DBName:            dbName,
			OrigTblName:       name.OrigTblName,
			OrigColName:       name.OrigColName,
			TblName:           x.AsName,
			ColName:           name.ColName,
			NotExplicitUsable: name.NotExplicitUsable,
			Redundant:         name.Redundant,
			Hidden:            name.Hidden,
		}
	}
	p.SetOutputNames(clonedNames)
Copilot
AI
Feb 3, 2026
The changes at lines 461-487 introduce a behavior change where derived tables (subqueries with aliases) now have their DBName cleared to an empty string. This affects error messages and potentially other functionality that depends on DBName being set.
Looking at the test result changes in tests/integrationtest/r/planner/core/integration.result (lines 2350, 2352) and tests/integrationtest/r/executor/aggregate.result (line 1519), error messages that previously showed fully qualified column names like planner__core__integration.t1.v2 now show .t1.v2 (with empty database name).
While this change may be intentional to avoid confusion between derived tables and actual database tables, it's a breaking change in error message format. Consider:
- Documenting this behavior change in the PR description as it affects user-visible error messages
- Verifying that this doesn't break any tooling or scripts that parse these error messages
- Ensuring that the empty DBName doesn't cause issues elsewhere in the codebase that might assume DBName is always set
Claude Code's response to this is that the empty DBName should be corrected separately in another PR.
prevSchema := cte.cteClass.SeedPartLogicalPlan.Schema().Clone()
lp.SetSchema(getResultCTESchema(cte.cteClass.SeedPartLogicalPlan.Schema(), b.ctx.GetSessionVars()))
Copilot
AI
Feb 3, 2026
The change from cte.seedLP.Schema() to cte.cteClass.SeedPartLogicalPlan.Schema() at lines 4513-4514 appears to be accessing the same value through an indirection, since cte.cteClass.SeedPartLogicalPlan is set to cte.seedLP at line 4500. However, this indirection only exists if cte.cteClass == nil at line 4497.
If cte.cteClass is not nil (i.e., it was already created in a previous call), then cte.seedLP and cte.cteClass.SeedPartLogicalPlan might be different if cte.seedLP was somehow modified after the cteClass was created.
While the change seems functionally correct (using the cached value from cteClass), this subtle change could affect CTE behavior in edge cases. Consider adding a comment explaining why this indirection is used instead of the direct reference, or verify that both values are always identical at this point.
Suggested change:
- prevSchema := cte.cteClass.SeedPartLogicalPlan.Schema().Clone()
- lp.SetSchema(getResultCTESchema(cte.cteClass.SeedPartLogicalPlan.Schema(), b.ctx.GetSessionVars()))
+ // Use the seed plan stored in cteClass as the canonical source of the schema.
+ // cteClass.SeedPartLogicalPlan is initialized from cte.seedLP when the CTEClass
+ // is first created and should remain stable across subsequent uses.
+ seedSchema := cte.cteClass.SeedPartLogicalPlan.Schema()
+ prevSchema := seedSchema.Clone()
+ lp.SetSchema(getResultCTESchema(seedSchema, b.ctx.GetSessionVars()))
What problem does this PR solve?
Issue Number: ref #40328
Problem Summary:
What changed and how does it work?
Add LATERAL join support with MySQL-compatible behavior. This includes LATERAL with ORDER BY and LIMIT inside a recursive query, where the ORDER BY and LIMIT appear within the lateral join.
Claude code review:
LATERAL Join Implementation - Code Review
Overview
The implementation adds MySQL 8.0+ LATERAL join support to TiDB, allowing derived tables to
reference columns from tables to their left in the FROM clause. The approach reuses the
existing LogicalApply infrastructure.
Files Modified
Positive Findings
The choice to reuse LogicalApply is appropriate - LATERAL joins are semantically equivalent to
correlated subqueries, and this avoids duplicating execution infrastructure.
defer func() {
	b.outerSchemas = b.outerSchemas[:len(b.outerSchemas)-1]
	b.outerNames = b.outerNames[:len(b.outerNames)-1]
	b.buildingLateralSubquery = saveBuildingLateral
}()
Consolidating all cleanup in a single defer ensures consistent cleanup in both success and
error paths.
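The cleanup pattern praised here can be shown in isolation. The sketch below assumes a stripped-down, hypothetical `builder` with only two fields; it is not TiDB's PlanBuilder and models only the push, flag save, and deferred restore.

```go
package main

import (
	"errors"
	"fmt"
)

// builder is a hypothetical, stripped-down stand-in for a plan builder.
type builder struct {
	outerSchemas    []string
	buildingLateral bool
}

// buildLateral pushes the left side onto the outer-scope stack, sets the
// flag, and restores both in one deferred function, so the error path and
// the success path leave the builder in the same state.
func (b *builder) buildLateral(left string, fail bool) error {
	saved := b.buildingLateral
	b.outerSchemas = append(b.outerSchemas, left)
	b.buildingLateral = true
	defer func() {
		b.outerSchemas = b.outerSchemas[:len(b.outerSchemas)-1]
		b.buildingLateral = saved
	}()
	if fail {
		return errors.New("simulated build error")
	}
	return nil
}

func main() {
	b := &builder{}
	err := b.buildLateral("t1", true)
	fmt.Println(err, len(b.outerSchemas), b.buildingLateral)
}
```

Saving the previous flag value rather than resetting it to false keeps nested calls correct, since an inner call restores whatever the outer call had set.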
leftLen := leftPlan.Schema().Len()
for i := range rightPlan.Schema().Len() {
	ap.Schema().Columns[leftLen+i].Index = leftLen + i
}
Critical fix to update column indices after schema merge - prevents incorrect data reads from
projections.
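To see why the offsets matter, here is a small sketch with a hypothetical `Column` type and `mergeSchemas` helper (neither is TiDB's expression.Column or schema API). Right-side columns start with indices relative to their own schema and must be shifted by the left schema's length once the two are concatenated.

```go
package main

import "fmt"

// Column is a hypothetical stand-in for a schema column whose Index is its
// position in the row produced by the owning operator.
type Column struct {
	Name  string
	Index int
}

// mergeSchemas concatenates copies of left and right, then rewrites the
// right-side indices so they point into the merged row.
func mergeSchemas(left, right []*Column) []*Column {
	merged := make([]*Column, 0, len(left)+len(right))
	for _, c := range left {
		cp := *c
		merged = append(merged, &cp)
	}
	for _, c := range right {
		cp := *c
		merged = append(merged, &cp)
	}
	leftLen := len(left)
	for i := range right {
		merged[leftLen+i].Index = leftLen + i
	}
	return merged
}

func main() {
	left := []*Column{{Name: "t1.a", Index: 0}, {Name: "t1.b", Index: 1}}
	right := []*Column{{Name: "dt.x", Index: 0}, {Name: "dt.y", Index: 1}}
	for _, c := range mergeSchemas(left, right) {
		fmt.Println(c.Name, c.Index)
	}
}
```

Without the final loop, `dt.x` would keep index 0 and a projection reading it from the merged row would pick up `t1.a` instead.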
Properly handles FullSchema and FullNames for LogicalJoin compatibility.
Deep cloning output names prevents unintended mutations, including hidden names.
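The mutation hazard behind the cloning remark can be illustrated with a reduced, hypothetical `FieldName` struct (TiDB's types.FieldName has more fields). `aliasInPlace` and `aliasCloned` are illustrative helpers, not functions from the PR.

```go
package main

import "fmt"

// FieldName is a hypothetical, reduced stand-in for an output name entry.
type FieldName struct {
	TblName string
	ColName string
}

// aliasInPlace rewrites the table name through the shared pointers, so every
// other holder of the same slice sees the alias too.
func aliasInPlace(names []*FieldName, alias string) []*FieldName {
	for _, n := range names {
		n.TblName = alias
	}
	return names
}

// aliasCloned copies each entry before changing it and leaves the input
// untouched.
func aliasCloned(names []*FieldName, alias string) []*FieldName {
	out := make([]*FieldName, len(names))
	for i, n := range names {
		cp := *n
		cp.TblName = alias
		out[i] = &cp
	}
	return out
}

func main() {
	shared := []*FieldName{{TblName: "t", ColName: "a"}}
	aliased := aliasCloned(shared, "dt")
	fmt.Println(shared[0].TblName, aliased[0].TblName)
}
```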
Potential Issues and Suggestions
Issue 1: Missing Join Reorder Block for LogicalApply
Location: rule_join_reorder.go lines 58-67
Observation: The current code disables join reorder flag in buildJoin (line 593-595), but
extractJoinGroup still checks for LogicalJoin only:
join, isJoin := p.(*logicalop.LogicalJoin)
if isJoin && join.PreferJoinOrder { ...
Risk: Low - LogicalApply embeds LogicalJoin, but a Go type assertion to *logicalop.LogicalJoin
matches only that exact dynamic type, so the assertion fails for a *logicalop.LogicalApply value
and the isJoin branch is not taken. However, an explicit check would be safer.
Recommendation: Add explicit check at the beginning:
if _, isApply := p.(*logicalop.LogicalApply); isApply {
	return &joinGroupResult{group: []base.LogicalPlan{p}, ...}
}
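How Go treats a type assertion against an embedding struct can be checked with stand-in types. In the sketch below, `Plan`, `Join`, `Apply`, and `reorderable` are hypothetical and only mirror the embedding relationship described for LogicalApply and LogicalJoin.

```go
package main

import "fmt"

// Plan is a hypothetical stand-in for a logical plan interface.
type Plan interface{ Kind() string }

// Join is a hypothetical stand-in for a join operator.
type Join struct{ PreferJoinOrder bool }

func (j *Join) Kind() string { return "join" }

// Apply embeds Join, mirroring the relationship described in the review.
type Apply struct{ Join }

func (a *Apply) Kind() string { return "apply" }

// isJoin reports whether the dynamic type of p is exactly *Join. Embedding
// does not make an *Apply value match this assertion.
func isJoin(p Plan) bool {
	_, ok := p.(*Join)
	return ok
}

// reorderable makes the exclusion of Apply explicit, as the recommendation
// suggests, instead of relying on the assertion's exact-type behavior.
func reorderable(p Plan) bool {
	if _, isApply := p.(*Apply); isApply {
		return false
	}
	return isJoin(p)
}

func main() {
	fmt.Println(isJoin(&Join{}), isJoin(&Apply{}), reorderable(&Apply{}))
}
```

The explicit check changes no behavior in this sketch; its value is that the intent survives if the surrounding code is later refactored to use an interface or a helper that does match embedded joins.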
Issue 2: ON Condition Subquery Check May Be Too Restrictive
Location: logical_plan_builder.go line 883
if newPlan != ap {
	return nil, plannererrors.ErrInvalidLateralJoin.GenWithStackByArgs("ON condition contains subqueries")
}
Risk: Medium - This check prevents ALL subqueries in ON conditions, not just correlated ones.
MySQL allows uncorrelated subqueries in ON clauses with LATERAL.
Recommendation: Consider allowing uncorrelated subqueries in ON conditions in a future
enhancement.
Issue 3: curClause Not Reset After ON Condition Processing
Location: logical_plan_builder.go line 877
The code sets b.curClause = onClause before processing ON condition but doesn't reset it after.
This was addressed in commit 78f1034, but verify it doesn't affect subsequent processing if
buildLateralJoin returns early due to an error.
Observation: Since the function returns after ON condition processing (either success or
error), this is acceptable.
Issue 4: Parser Tests Needed for LATERAL Keyword Restoration
Location: ast/dml.go lines 556-558
if n.Lateral {
	ctx.WriteKeyWord("LATERAL ")
}
Recommendation: Ensure there are parser tests that verify SQL with LATERAL is properly
restored/printed (round-trip testing).
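The shape of such a round-trip check can be sketched without the real parser. Everything below is a toy under stated assumptions: `TableSource`, `parseTableSource`, and `roundTrip` are hypothetical and handle only the form `[LATERAL ](subquery) AS name`; an actual test would go through TiDB's parser and Restore instead.

```go
package main

import (
	"fmt"
	"strings"
)

// TableSource is a hypothetical, reduced stand-in for the AST node.
type TableSource struct {
	Lateral  bool
	Subquery string
	AsName   string
}

// Restore prints the node back as SQL text, emitting LATERAL when set.
func (n *TableSource) Restore(sb *strings.Builder) {
	if n.Lateral {
		sb.WriteString("LATERAL ")
	}
	sb.WriteString("(" + n.Subquery + ") AS " + n.AsName)
}

// parseTableSource is a toy parser for "[LATERAL ](subquery) AS name".
func parseTableSource(s string) (*TableSource, error) {
	n := &TableSource{}
	if strings.HasPrefix(s, "LATERAL ") {
		n.Lateral = true
		s = strings.TrimPrefix(s, "LATERAL ")
	}
	i := strings.LastIndex(s, ") AS ")
	if !strings.HasPrefix(s, "(") || i < 0 {
		return nil, fmt.Errorf("unsupported table source: %q", s)
	}
	n.Subquery = s[1:i]
	n.AsName = s[i+len(") AS "):]
	return n, nil
}

// roundTrip parses the text and restores it again.
func roundTrip(sql string) (string, error) {
	n, err := parseTableSource(sql)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	n.Restore(&sb)
	return sb.String(), nil
}

func main() {
	out, err := roundTrip("LATERAL (SELECT t1.a) AS dt")
	fmt.Println(out, err)
}
```

A restore that forgot the keyword would still produce valid SQL, just with different semantics, so comparing the restored text against the input is what catches the omission.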
Issue 5: buildingLateralSubquery Flag Usage
Location: planbuilder.go and logical_plan_builder.go
The buildingLateralSubquery flag is set but I don't see where it's consumed to change behavior.
If it's used to allow ORDER BY/LIMIT in recursive CTE parts, ensure there are tests covering
this interaction.
Minor Code Quality Observations
Line 750: Good function-level comment explaining LATERAL semantics.
lateral_join_test.go line 1: Uses "Copyright 2026" which is incorrect (should be 2025 or
current year).
All error messages use the same format pattern via ErrInvalidLateralJoin.GenWithStackByArgs() -
good consistency.
Security Considerations
No security issues identified. The implementation:
Performance Considerations
expected behavior but may be slow for large datasets. The decorrelation optimizer
(FlagDecorrelate) is enabled to convert to joins when safe.
behavior.
Summary
The LATERAL join implementation is well-designed and thoroughly implemented. Key strengths:
Suggested improvements:
Overall Assessment: Ready for merge with minor cleanup.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.