Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement-16872][Master] Select a coordinator from masters to wake up task group #16873

Merged
merged 3 commits into from
Dec 18, 2024

Conversation

ruanwenjun
Copy link
Member

@ruanwenjun ruanwenjun commented Dec 3, 2024

Purpose of the pull request

close #16872

Brief change log

  • Add MasterCoordinator to control TaskGroupCoordinator

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

Pull Request Notice

Pull Request Notice

If your pull request contain incompatible change, you should also add it to docs/docs/en/guide/upgrede/incompatible.md

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addMasterCoordinator branch 2 times, most recently from b9327bb to 680b125 Compare December 3, 2024 12:49
@ruanwenjun ruanwenjun added this to the 3.3.0 milestone Dec 3, 2024
@ruanwenjun ruanwenjun added the improvement make more easy to user or prompt friendly label Dec 3, 2024
@ruanwenjun ruanwenjun marked this pull request as draft December 4, 2024 01:47
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addMasterCoordinator branch from 680b125 to 0b0f97b Compare December 5, 2024 12:15
@github-actions github-actions bot added the test label Dec 5, 2024
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addMasterCoordinator branch 2 times, most recently from ce0e0b9 to 2cf0153 Compare December 5, 2024 12:58
@ruanwenjun ruanwenjun marked this pull request as ready for review December 5, 2024 12:58
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addMasterCoordinator branch 6 times, most recently from 66c1a8b to dd78545 Compare December 9, 2024 10:55
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addMasterCoordinator branch from dd78545 to e967b9b Compare December 16, 2024 06:27
Copy link
Member

@SbloodyS SbloodyS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines +90 to 101
try {
if (registry.acquireLock(electionLock)) {
if (!registry.exists(selectorPath)) {
registry.put(selectorPath, serverIdentify, true);
return true;
}
return serverIdentify.equals(registry.get(selectorPath));
}
return false;
} finally {
registry.releaseLock(electionLock);
}
Copy link
Contributor

@caishunfeng caishunfeng Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the issue for avoid to use lock, but it's actually still using lock, what's the difference?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lock is only used when cluster changed, after selected an active server, the lock will be released, so the lock will only hold for a short time. This is only used for election, we use the lock here to make sure all registry plugins can implement the election method in an easy way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add another coordintor, will all distributed services be executed on this one active master? It seems the Master HA but not some logic coordinator HA, I'm not sure if this adjustment will cause one active master to be busy while others are idle.

public class MasterCoordinator extends AbstractHAServer {
    private final ITaskGroupCoordinator taskGroupCoordinator;
}

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the coordinator master shouldn't do a lot of business work, it would be great if it only deals with coordination work. At our usage case, the coordinator master will only coordinate task group, deal with serial wait this will not affect even though the master is busy, these work will not be affect.

BTW, after fixing a lot of memory leak/thread leak bug, I haven't found a master status change to BUSY in our cluster, currently, there is basically no heavy workload in the masters, most case is db busy.

@ruanwenjun ruanwenjun merged commit 9f67cc4 into apache:dev Dec 18, 2024
73 checks passed
@ruanwenjun ruanwenjun deleted the dev_wenjun_addMasterCoordinator branch December 18, 2024 12:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend improvement make more easy to user or prompt friendly test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Improvement][Master] Select a coordinator from masters to wake up task group
4 participants