-
Notifications
You must be signed in to change notification settings - Fork 6.1k
*: non-unique Global Index on non-clustered removed pid from value, V2 #65565
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
*: non-unique Global Index on non-clustered removed pid from value, V2 #65565
Conversation
…s compatibility Add GlobalIndexVersion field to IndexInfo to enable backwards compatible evolution of global index key formats. Only non-clustered tables with non-unique global indexes use version 2, which encodes partition ID in the key using PartitionIDFlag to prevent duplicate handle collisions after EXCHANGE PARTITION. Clustered tables and unique indexes are not affected by the duplicate handle issue and continue using version 0 (legacy format). Clustered tables have unique primary keys that include the partitioning column, so no collisions are possible. Version validation ensures unsupported future versions are rejected with clear error messages. This approach allows gradual migration while prioritizing correctness for affected index types. Related to pingcap#65289. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
…MARY KEY Problem: When creating a partitioned table with an inline PRIMARY KEY (e.g., `a INT PRIMARY KEY`) followed by global indexes in the same CREATE TABLE statement, the global index was incorrectly assigned GlobalIndexVersion V1 even when the table was clustered. This happened because inline PRIMARY KEY definitions are converted to constraints and appended to the end of the constraints list (after table- level constraints). When BuildTableInfo processes constraints in order, global indexes are built before the PRIMARY KEY constraint is processed, so HasClusteredIndex() returns false, causing incorrect version assignment. Root Cause: In buildColumnsAndConstraints (create_table.go:1063), inline PRIMARY KEY constraints from column definitions are appended after table constraints: constraints = append(constraints, cts...) So for `CREATE TABLE t (a INT PRIMARY KEY, key idx(b) global)`, constraints are processed as: 1. key idx(b) global <- built first, sees HasClusteredIndex()=false 2. primary key(a) <- processed last, sets PKIsHandle=true Solution: Added a pre-scan loop before constraint processing in BuildTableInfo that: 1. Scans all constraints to find PRIMARY KEY 2. Determines if table will be clustered (PKIsHandle/IsCommonHandle) 3. Sets clustering flags before any indexes are built This ensures HasClusteredIndex() returns correct value when building global indexes, so: - Clustered tables get GlobalIndexVersion = 0 (correct) - Non-clustered tables with non-unique global indexes get V1 (correct) The pre-scan handles expression-based PKs correctly without building hidden columns (expression PKs can't be PKIsHandle, so they become IsCommonHandle). Impact: - Preserves constraint ordering (no SHOW CREATE TABLE changes) - Fixes TestGlobalIndexVersion1 which was failing - No performance impact (lightweight pre-scan) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Signed-off-by: tangenta <[email protected]>
Co-authored-by: Copilot <[email protected]>
…lustered-non-unique-add-partid-to-key-65289
…lustered-non-unique-add-partid-to-key-65289
|
Skipping CI for Draft Pull Request. |
|
/retest |
…-65289' into global-index-non-clustered-non-unique-add-partid-to-key-and-removed-from-value-V2-65289
Add TestGlobalIndexNonUniqueNonClusteredKeyValueFormat to verify the key and value format for non-unique global indexes on non-clustered tables across different versions: - V0: partition ID only in value - V1: partition ID in both key and value - V2: partition ID only in key Co-Authored-By: Claude Opus 4.5 <[email protected]>
…lustered-non-unique-add-partid-to-key-65289
…-65289' into global-index-non-clustered-non-unique-add-partid-to-key-and-removed-from-value-V2-65289
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR implements version 2 (V2) of global index key-value encoding for non-unique global indexes on non-clustered tables. It removes the partition ID from the value portion of the KV pair while keeping it only in the key, reducing storage overhead. This is a follow-up to V1 (PR #65380) and addresses issue #65289, which involves data/index inconsistency after EXCHANGE PARTITION when duplicate _tidb_rowid values exist across partitions.
Changes:
- Introduced
GlobalIndexVersionfield with three version constants (V0=Legacy, V1, V2) to control the encoding format for global indexes - Modified key-value encoding/decoding logic to handle partition ID placement based on version (V2 stores partition ID only in key for non-unique indexes)
- Updated DDL operations to set appropriate version when creating global indexes
- Added comprehensive unit tests and integration tests to verify the V2 format and backward compatibility
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
pkg/meta/model/index.go |
Added GlobalIndexVersion constants and field to IndexInfo struct |
pkg/tablecodec/tablecodec.go |
Implemented V2 encoding/decoding logic with partition ID in key only for non-unique indexes |
pkg/table/tables/index.go |
Updated GenIndexKey to wrap handles with PartitionHandle for V1+ global indexes |
pkg/executor/mem_reader.go |
Fixed partition ID extraction to use correct source (key vs value) based on version; fixed typo |
pkg/ddl/partition.go |
Set GlobalIndexVersion when creating/updating indexes during partition operations |
pkg/ddl/index.go |
Set GlobalIndexVersion V2 for new non-unique non-clustered global indexes |
pkg/ddl/create_table.go |
Added pre-scan logic to determine clustered index early for correct version assignment |
pkg/executor/global_index_physid_test.go |
Comprehensive unit tests for partition ID handling in memIndexReader |
pkg/ddl/tests/partition/global_index_version_test.go |
Tests verifying V0/V1/V2 format differences and backward compatibility |
tests/integrationtest/t/ddl/exchange_partition_global_index.test |
Integration test reproducing and verifying fix for issue #65289 |
tests/integrationtest/r/ddl/exchange_partition_global_index.result |
Expected results for integration test |
pkg/ddl/tests/partition/BUILD.bazel |
Added new test file to build configuration |
Makefile |
Unrelated infrastructure changes for downloading tikv-server |
Comments suppressed due to low confidence (1)
pkg/ddl/create_table.go:1409
- The pre-scan loop (lines 1332-1348) sets
tbInfo.PKIsHandle,tbInfo.IsCommonHandle, andtbInfo.CommonHandleVersionbased on the PRIMARY KEY constraint. However, the main constraint processing loop (lines 1403-1409) sets these same fields again for the PRIMARY KEY constraint. This creates redundant logic that could lead to maintenance issues. Consider removing the duplicate flag setting from the main loop since the pre-scan already handles it, or document why this duplication is necessary.
if ShouldBuildClusteredIndex(ctx.GetClusteredIndexDefMode(), constr.Option, isSingleIntPK) {
if isSingleIntPK {
tbInfo.PKIsHandle = true
} else {
tbInfo.IsCommonHandle = true
tbInfo.CommonHandleVersion = 1
}
| // Version of global index key format, only used for non-clustered, non-unique global indexes. | ||
| // 0=legacy/unique/clustered, | ||
| // 1=v1 non-unique non-clustered with partition ID in key and value. | ||
| // 2=v2 non-unique non-clustered with partition ID in key only (TODO). |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment states "2=v2 non-unique non-clustered with partition ID in key only (TODO)." However, this PR implements V2 functionality, so the "(TODO)" note should be removed as it's no longer accurate.
| // 2=v2 non-unique non-clustered with partition ID in key only (TODO). | |
| // 2=v2 non-unique non-clustered with partition ID in key only. |
| bazel_ci_simple_prepare: | ||
| @mkdir -p ./bin; \ | ||
| if [ ! -f ./bin/tikv-server.tar.gz ]; then \ | ||
| echo "Downloading tikv-server tarball"; \ | ||
| wget -q -O ./bin/tikv-server.tar.gz "https://fileserver.pingcap.net/download/builds/devbuild/9652/tikv-linux-amd64.tar.gz" || true; \ | ||
| else \ | ||
| echo "tikv-server tarball already exists, skipping download"; \ | ||
| fi; \ | ||
| echo "Extracting tikv-server"; \ |
Copilot
AI
Jan 21, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes to bazel_ci_simple_prepare target (downloading and extracting tikv-server tarball) appear unrelated to the global index V2 changes described in the PR. These infrastructure changes should either be in a separate PR or explained in the PR description. If this is intentionally included, please explain why it's necessary for the global index V2 functionality.
| bazel_ci_simple_prepare: | |
| @mkdir -p ./bin; \ | |
| if [ ! -f ./bin/tikv-server.tar.gz ]; then \ | |
| echo "Downloading tikv-server tarball"; \ | |
| wget -q -O ./bin/tikv-server.tar.gz "https://fileserver.pingcap.net/download/builds/devbuild/9652/tikv-linux-amd64.tar.gz" || true; \ | |
| else \ | |
| echo "tikv-server tarball already exists, skipping download"; \ | |
| fi; \ | |
| echo "Extracting tikv-server"; \ | |
| # NOTE: | |
| # - This target is used by Bazel-based CI jobs that exercise global index V2. | |
| # - Those tests require a local tikv-server binary to start a TiKV instance during the run. | |
| # - We therefore download and extract a prebuilt tikv-server tarball into ./bin as part of CI prepare. | |
| bazel_ci_simple_prepare: | |
| @mkdir -p ./bin; \ | |
| if [ ! -f ./bin/tikv-server.tar.gz ]; then \ | |
| echo "Downloading tikv-server tarball required for Bazel CI (global index V2 tests)"; \ | |
| wget -q -O ./bin/tikv-server.tar.gz "https://fileserver.pingcap.net/download/builds/devbuild/9652/tikv-linux-amd64.tar.gz" || true; \ | |
| else \ | |
| echo "tikv-server tarball already exists, skipping download"; \ | |
| fi; \ | |
| echo "Extracting tikv-server binary for Bazel CI use"; \ |
Co-authored-by: Copilot <[email protected]>
…lustered-non-unique-add-partid-to-key-65289
…lustered-non-unique-add-partid-to-key-65289
…lustered-non-unique-add-partid-to-key-65289
…-65289' into global-index-non-clustered-non-unique-add-partid-to-key-and-removed-from-value-V2-65289
|
@mjonss: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What problem does this PR solve?
Issue Number: ref #65289
Problem Summary:
Next step from #65380, which was V1, this is V2, where the partition id is removed from the KV value part, and exists only in the key part.
What changed and how does it work?
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.