Skip to content

Conversation

@wjhuang2016
Copy link
Member

What problem does this PR solve?

Issue Number: close #65502

Problem Summary:
Global analyze memory "in-use" can temporarily become negative during Analyze v2 due to cleanup order, which may lead to confusing metrics and violates invariants in internal-check builds.

What changed and how does it work?

  • Adjust AnalyzeColumnsExecV2 worker cleanup so buffered Consume is applied before buffered Release.
  • Add an internal assertion to ensure LabelForGlobalAnalyzeMemory (in-use) is never negative.
  • Add a regression test for Analyze v2 memory usage invariant.

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 9, 2026
@codecov
Copy link

codecov bot commented Jan 9, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 66.3495%. Comparing base (3455e86) to head (3c99c89).
⚠️ Report is 94 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #65503        +/-   ##
================================================
- Coverage   70.8020%   66.3495%   -4.4526%     
================================================
  Files          1901       1958        +57     
  Lines        518502     538869     +20367     
================================================
- Hits         367110     357537      -9573     
- Misses       126860     158634     +31774     
+ Partials      24532      22698      -1834     
Flag Coverage Δ
integration 41.5173% <100.0000%> (-6.6495%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
dumpling 52.8700% <ø> (ø)
parser ∅ <ø> (∅)
br 36.4013% <ø> (-21.8285%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Comment on lines -687 to -688
defer e.memTracker.Consume(bufferedMemSize)
defer e.memTracker.Release(bufferedReleaseSize)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this version, we always consume 0 and release 0, as arguments of defer is computed before defer function called

@wjhuang2016
Copy link
Member Author

Addressed the review comment from @D3Hunter: the worker cleanup now uses a deferred closure so bufferedMemSize/bufferedReleaseSize are evaluated at defer execution time (not at defer registration time), and we execute Consume before Release to avoid transient negative in-use values.

@wjhuang2016
Copy link
Member Author

/retest-required

@wjhuang2016
Copy link
Member Author

/retest

@ti-chi-bot
Copy link

ti-chi-bot bot commented Jan 9, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: winoros

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Jan 9, 2026
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jan 9, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-01-09 09:21:30.720288112 +0000 UTC m=+3734.782153021: ☑️ agreed by winoros.

@ti-chi-bot ti-chi-bot bot added the approved label Jan 9, 2026
@winoros
Copy link
Member

winoros commented Jan 9, 2026

/hold for discussing changes around the test case

@ti-chi-bot ti-chi-bot bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 9, 2026
tk.MustExec("set @@tidb_build_sampling_stats_concurrency=1")
tk.MustExec("use test")
tk.MustExec("drop table if exists t_mem_usage")
tk.MustExec("create table t_mem_usage(a text collate utf8mb4_general_ci)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can

  1. Change the type to varchar or
  2. Do an explicit set for the var tidb_analyze_skip_column_types to ensure that we will not skip the text type when building stats.

@tiprow
Copy link

tiprow bot commented Jan 9, 2026

@wjhuang2016: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
tidb_parser_test 3c99c89 link true /test tidb_parser_test
fast_test_tiprow 3c99c89 link true /test fast_test_tiprow

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@winoros
Copy link
Member

winoros commented Jan 9, 2026

/unhold

@ti-chi-bot ti-chi-bot bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 9, 2026
@ti-chi-bot
Copy link

ti-chi-bot bot commented Jan 9, 2026

@wjhuang2016: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
idc-jenkins-ci-tidb/mysql-test 3c99c89 link true /test mysql-test
pull-unit-test-next-gen 3c99c89 link true /test pull-unit-test-next-gen
idc-jenkins-ci-tidb/check_dev_2 3c99c89 link true /test check-dev2
idc-jenkins-ci-tidb/unit-test 3c99c89 link true /test unit-test

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

analyze: global analyze memory in-use metric can become negative (Analyze v2)

3 participants