analyze: global analyze memory in-use metric can become negative (Analyze v2) #65502

@wjhuang2016

Bug Report

1. Minimal reproduce step (Required)

  1. Create a table with large TEXT rows (to stress sampling collector memory).
  2. Use Analyze v2 and limit the sampling-stats build concurrency to 1 to increase the chance of cleanup reordering:
    • set @@tidb_analyze_version=2;
    • set @@tidb_build_sampling_stats_concurrency=1;
  3. Run:
    • analyze table <tbl> with 1.0 samplerate;

Optionally, enable the internal check/assert to catch the invariant violation:

  • build/test with --tags=intest, or
  • export GO_FAILPOINTS="/enableInternalCheck=return(true)" (with failpoint-enabled build)

2. What did you expect to see? (Required)

Global analyze memory usage (label: LabelForGlobalAnalyzeMemory, metric: analyze/inuse) should never be negative.

3. What did you see instead (Required)

The in-use value can become temporarily negative during Analyze v2 execution (observed via metrics / internal check).

4. Root cause analysis (Optional)

In AnalyzeColumnsExecV2.subBuildWorker, the buffered Consume and Release amounts are applied from multiple deferred calls. Because Go executes defers in LIFO order, the buffered Release may be applied before the buffered Consume, which makes bytesConsumed of the global analyze memory tracker transiently negative.

5. Proposed fix (Optional)

  • Ensure buffered Consume is applied before buffered Release in worker cleanup.
  • Add an internal invariant check for LabelForGlobalAnalyzeMemory.
  • Add a regression test for Analyze v2.

6. Affected versions (Optional)

master

7. Related PR (Optional)
