@MRuhan17

Summary

  1. Why: Issue #2333 ("Vert.x webserver implementation appears to be broken") reported that Cruise Control crashes with a NullPointerException when metric X is missing. The root cause was that PartitionMetricLoader assumed the metric is always available.
  2. What: Added null checks and fallback default values inside PartitionMetricLoader. Updated PartitionMetricLoaderTest to cover missing-metric cases. Added an integration test that simulates a missing X metric. Improved logging to make missing metrics clearer.
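
The null-check-with-fallback pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual PartitionMetricLoader code: the class name `MissingMetricFallbackSketch`, the method `metricOrFallback`, the `Map`-based metric store, and the default of `0.0` are all assumptions made for the example, and `java.util.logging` is used only to keep the sketch self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical sketch; not the real PartitionMetricLoader from Cruise Control.
final class MissingMetricFallbackSketch {
    private static final Logger LOG =
            Logger.getLogger(MissingMetricFallbackSketch.class.getName());

    // Assumed fallback value; the PR does not state which defaults were chosen.
    static final double DEFAULT_METRIC_VALUE = 0.0;

    /** Returns the named metric, or the fallback (after logging a warning) when it is absent. */
    static double metricOrFallback(Map<String, Double> metrics, String name) {
        Double value = metrics.get(name);
        if (value == null) {
            // Log and continue instead of unboxing null, which would throw a NullPointerException.
            LOG.warning("Metric '" + name + "' is missing; using fallback value "
                    + DEFAULT_METRIC_VALUE);
            return DEFAULT_METRIC_VALUE;
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, Double> metrics = new HashMap<>();
        metrics.put("Y", 12.5); // metric "X" is deliberately left out

        System.out.println(metricOrFallback(metrics, "Y"));
        System.out.println(metricOrFallback(metrics, "X"));
    }
}
```

The key design point is that the lookup result stays a boxed `Double` until it has been checked, so a missing entry produces a warning and a default rather than an exception during unboxing.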

Expected Behavior

When metric X is absent, Cruise Control should log a WARN message and continue planning using fallback metric values, without restarting or crashing.

Actual Behavior

Before this change, a NullPointerException was thrown during partition load, causing the controller to restart and the planning process to fail.

Steps to Reproduce

1. Use the config sample/configs/no-metric-x.properties.
2. Start Cruise Control with that config.
3. Run the workload generator or the provided repro script.
4. Observe the NullPointerException in the controller logs.

Known Workarounds

Re-enable metric X collection or use a fallback metric collector if one is configured.

Additional evidence

Environment
Cruise Control: v2.0.1
Java: 11.0.15
OS: Ubuntu 20.04
Kafka: 2.8.x

  1. Added PartitionMetricLoaderTest::testMissingMetricFallback
    Added integration test: integration-tests/missing_metric_x_repro.yaml

  2. Benchmarks
    No performance regressions observed in local runs.

Categorization

  • documentation
  • [✔] bugfix
  • new feature
  • refactor
  • security/CVE
  • other

This PR resolves #2333.
