@vivek8420 vivek8420 commented Jul 22, 2025

Issues

  • My PR addresses the following Helix issues and references them in the PR description:
  • We encountered an issue in which an instance assumed leadership of the cluster. After a later leadership change, that instance's cleanup completed, but the leader information was not removed, so no other instance could acquire leadership.
  • The failure was silent: no notifications or alerts were generated.
  • To address this, this PR adds sensors that detect the scenario, so that we can build visualizations and improve observability.
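
For readers unfamiliar with the failure mode, here is a minimal sketch of the kind of counters such a sensor could keep. It is illustrative only: `LeadershipSensor` and its method names are hypothetical and are not the classes or metric names added in this PR, and the sketch leaves out the JMX registration that a real monitor would need.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sensor (names are assumptions, not Helix APIs).
 * It counts two events: failed attempts to acquire leadership, and
 * resets after which the instance is still recorded as leader.
 */
final class LeadershipSensor {
  private final AtomicLong leadershipFailures = new AtomicLong();
  private final AtomicLong stillLeaderDuringReset = new AtomicLong();

  /** Call after every attempt to become leader. */
  void recordAcquireAttempt(boolean acquired) {
    if (!acquired) {
      leadershipFailures.incrementAndGet();
    }
  }

  /**
   * Call once a controller has finished its reset/cleanup.
   * If the recorded leader is still this instance, the stale-leader
   * condition described above has occurred. A null currentLeader
   * means no leader is recorded.
   */
  void recordReset(String instanceName, String currentLeader) {
    if (instanceName.equals(currentLeader)) {
      stillLeaderDuringReset.incrementAndGet();
    }
  }

  long getLeadershipFailureCount() {
    return leadershipFailures.get();
  }

  long getStillLeaderDuringResetCount() {
    return stillLeaderDuringReset.get();
  }
}
```

An alert on a non-zero still-leader count, or on a failure count that keeps growing while no leader changes, would surface the condition that previously went unnoticed.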

(apache#200 - Link your issue number here: You can write "Fixes #XXX". Please use the proper keyword so that the issue gets closed automatically. See https://docs.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue
Any of the following keywords can be used: close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved)

Description

  • Here are some details about my PR, including screenshots of any UI changes:

(Write a concise description including what, why, how)

Tests

  • The following tests are written for this issue:

(List the names of added unit/integration tests)

  • testLeadershipFailureMetrics: Tests that ClusterStatusMonitor correctly tracks and reports leadership failure events through JMX metrics.
  • testStillLeaderDuringResetMetrics: Tests that ClusterStatusMonitor correctly tracks controllers that remain leader during reset operations.
  • testOnBecomeLeaderFromStandby_whenMultipleInstancesTrigger: Tests that when multiple controllers compete for leadership, only one succeeds and leadership failures are properly tracked.
  • The following is the result of the "mvn test" command on the appropriate module:

(If CI test fails due to known issue, please specify the issue and test PR locally. Then copy & paste the result of "mvn test" to here.)
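
The property checked by testOnBecomeLeaderFromStandby_whenMultipleInstancesTrigger (one winner, every other attempt counted as a failure) can be sketched in isolation. This is not the test from this PR: `LeaderElectionSketch` is a hypothetical class, and an in-memory compare-and-set stands in for the real leader election, which is an assumption made only to keep the example self-contained.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a leader election; not a Helix API. */
final class LeaderElectionSketch {
  private final AtomicReference<String> leader = new AtomicReference<>();
  private final AtomicLong failedAttempts = new AtomicLong();

  /** Atomically claims leadership if nobody holds it; counts failures. */
  boolean tryAcquire(String instance) {
    boolean won = leader.compareAndSet(null, instance);
    if (!won) {
      failedAttempts.incrementAndGet();
    }
    return won;
  }

  String currentLeader() {
    return leader.get();
  }

  long failedAttempts() {
    return failedAttempts.get();
  }

  /** Lets the given number of contenders race for leadership at once. */
  static LeaderElectionSketch race(int contenders) {
    LeaderElectionSketch election = new LeaderElectionSketch();
    ExecutorService pool = Executors.newFixedThreadPool(contenders);
    CountDownLatch start = new CountDownLatch(1);
    for (int i = 0; i < contenders; i++) {
      String name = "controller-" + i;
      pool.submit(() -> {
        try {
          start.await(); // hold every contender until the race starts
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        election.tryAcquire(name);
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("contenders did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return election;
  }
}
```

With N contenders, exactly one call to `tryAcquire` succeeds, so the failure counter ends at N - 1 regardless of which contender wins.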

Changes that Break Backward Compatibility (Optional)

  • My PR contains changes that break backward compatibility or previous assumptions for certain methods or API. They include:

(Consider including all behavior changes for public methods or API. Also include these changes in merge description so that other developers are aware of these changes. This allows them to make relevant code changes in feature branches accounting for the new method/API behavior.)

Documentation (Optional)

  • In case of new functionality, my PR adds documentation in the following wiki page:

(Link the GitHub wiki you added)

Commits

  • My commits all reference appropriate Apache Helix GitHub issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Code Quality

  • My diff has been formatted using helix-style.xml
    (helix-style-intellij.xml if IntelliJ IDE is used)

zpinto and others added 22 commits March 14, 2024 12:26
…#2763)"

This reverts commit 53b5889.

This dependency is banned at li.
Revert "upgrade xstream to 1.4.20 to pick up fixes for 2 CVEs (apache#2763)" dep is banned internally
New Release Snapshot with several fixes:

[apache/helix] -- Issue during onboarding resources without instances apache#2782
[apache/helix] -- Provide JDK 1.8 (backward) compatibility of helix-core apache#2775
Do not start the server if user uses the default SECRET_TOKEN env value apache#2783
Delete expected version apache#2759
[apache/helix] -- Fix PreferenceList Ordering Changes during Maintenance Mode apache#2778
…compat

[Linkedin/Helix] -- Provide JDK 1.8 (backward) compatibility for meta-client
[Linkedin/Helix] -- Provide JDK 1.8 (backward) compatibility for helix modules
Release for helix 1.3.2-dev-202406121430
Release for helix 1.3.2-dev-202406131130
Merge `master` into `release`
Release for helix 1.4.3-dev-202412052251
Release for helix 1.4.3-dev-202502211050 
Merge branch gspencer/release-20250221 into release branch
Release for helix 1.4.3-dev-202505071400-hf-HELIX-5658
@vivek8420 vivek8420 changed the base branch from release to master July 23, 2025 05:07
@vivek8420 vivek8420 changed the base branch from master to dev July 23, 2025 07:29

@proud-parselmouth proud-parselmouth left a comment

LGTM. Just made a minor comment
