Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release-3.5] Auto sync members in v3store if Islearner is the only field that differs between v2store and v3store #19586

Open
wants to merge 1 commit into
base: release-3.5
Choose a base branch
from

Conversation

ahrtr
Copy link
Member

@ahrtr ahrtr commented Mar 13, 2025

Provide a fix for the users who have already been affected by #19557

cc @fuweid @ivanvc @jmhbnz @serathius @siyuanfoundation

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ahrtr
Copy link
Member Author

ahrtr commented Mar 13, 2025

@ivanvc once this PRs get merged and a couple of days workflow/pipeline soak, then we are ready to release 3.5.20.

@fuweid the new Kubeadm style e2e test should pass once 3.5.20 is out.

@ahrtr ahrtr force-pushed the learner_sync_20250313 branch from 01f8a38 to 966938d Compare March 13, 2025 22:00
@ahrtr
Copy link
Member Author

ahrtr commented Mar 13, 2025

Hi folks, please take a look when you get time, thx

@@ -299,6 +299,9 @@ func StartEtcd(inCfg *Config) (e *Etcd, err error) {
zap.Strings("listen-metrics-urls", e.cfg.getMetricsURLs()),
)
serving = true

e.Server.SyncLearnerPromotionIfNeeded()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing cluster membership after the apply loop has been started seems like a bad idea. My expectation was that we shouldn't modify cluster state outside apply loop. Any reason to do that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expectation is good, but not feasible.

I already clarified it in the comment for the method. We must wait for etcd finishes replaying the WAL records.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must wait for etcd finishes replaying the WAL records.

Yes, we can also do that in apply loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can also do that in apply loop.

If we intentionally issue a member promotion request, then only one member should do it. Then we must wait for leader gets elected?

I really hate the leader's privilege mode; it causes some uncertainties, i.e. leader change etc. I agree that doing this outside of apply loop isn't ideal, but actually it's safe for this specific case. And we would never let such code/implementation go into main branch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not proposing to do a member promotion request.

Just a local patching of v3store:

  • When bootstrapping read v2 and v3, if there is a unpromoted member in v3 that is promoted in v2, we can mark this member in v3 as promoted. This is correct because it member can only be promoted once and v2 has always older data than v3.
  • To handle Promote Member in WAL, depending on etcd version:
    • For v3.5, it should just work with fixes for PromoteMember. This should work as v2 store is the source of truth.
    • For v3.6, when bootstrapping we read WAL since v2 snapshot, if we find any MemberPromote, just mark it as promoted in v3. This needs to be done before we start raft as v3 store should be source of truth.

Proposed actions are based on assumption (would need to confirm as I might remember it wrongly), that in v3.6 the v3 store is source of truth of membership even when raft is just started and even during replaying of WAL entries between v2 snapshot and v3 consistent index.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • For v3.5, it should just work with fixes for PromoteMember. This should work as v2 store is the source of truth.

Normally it won't work because the index might be < consistent_index. It means that we need to make an exception for learner promotion request, apply it for v3store anyway no matter shouldApplyV3 is true or not.

More important is that it may cause data inconsistence in mix version case, i.e. 3.5.19 vs 3.5.20. In 3.5.19 it won't apply, but 3.5.20 apply, then they may end of having different consisitent index.

in v3.6 the v3 store is source of truth of membership even when raft is just started and even during replaying of WAL entries between v2 snapshot and v3 consistent index.

It's true. But still I don't think we should do any such hacky workaround in 3.6.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally it won't work because the index might be < consistent_index. It means that we need to make an exception for learner promotion request, apply it for v3store anyway no matter shouldApplyV3 is true or not.

Right, forgot about it.

More important is that it may cause data inconsistence in mix version case, i.e. 3.5.19 vs 3.5.20. In 3.5.19 it won't apply, but 3.5.20 apply, then they may end of having different consisitent index.

Not sure how. We just override the field without bumping CI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true. But still I don't think we should do any such hacky workaround in 3.6.

Heh, would love to have robustness tests work for membership information. It's just hard to test non linearizable requests and dynamic clusters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure how. We just override the field without bumping CI.

You are also creating the second big exception: use LookOutsideApply in apply. Also you won't advance the consistent_index even in normal case (promoting a new learner which is added in 3.5.20).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in #19557 (comment), doing such hacky auto-sync in 3.6 is almost infeasible.

I am updating the e2e test right now, still following the existing approach...

v2Members, _ := membersFromStore(c.lg, c.v2store)
v3Members, _ := membersFromBackend(c.lg, c.be)

for id, v2Member := range v2Members {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For v3.6 we care about v3membership information. If for some reason v2 and v3 has different members (this is what happen in previous attempt to make v3 authoritative), it's safer to check v3 members.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is for 3.5 instead of 3.6. In 3.5, v2store is the source of truth. Also we are syncing from v2store to v3store.

Also note that we need to check both v2 and v3store, the member ID must match. Not sure what's your concern here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we resolve this comment?

clonedV3Member := v3Member.Clone()
sort.Strings(clonedV3Member.PeerURLs)

if reflect.DeepEqual(clonedV2Member.RaftAttributes, clonedV3Member.RaftAttributes) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you just compare IsLearner? We don't care about other information

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RaftAttribute includes both peerURLs and IsLearner, both of them should always match between v2 and v3store. I think it's more prudent to double check peerURL matches.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we resolve this comment?

@ahrtr ahrtr force-pushed the learner_sync_20250313 branch 2 times, most recently from ecc4da2 to cc73f83 Compare March 14, 2025 15:25
…ers between v2store and v3store

Signed-off-by: Benjamin Wang <[email protected]>
@ahrtr ahrtr force-pushed the learner_sync_20250313 branch from cc73f83 to e7e5358 Compare March 14, 2025 15:26
@ahrtr
Copy link
Member Author

ahrtr commented Mar 14, 2025

e2e test updated, PTAL, thx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants