-
Notifications
You must be signed in to change notification settings - Fork 9.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release-3.5] Auto sync members in v3store if Islearner is the only field that differs between v2store and v3store #19586
base: release-3.5
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ahrtr The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
3f23337
to
01f8a38
Compare
01f8a38
to
966938d
Compare
Hi folks, please take a look when you get time, thx |
@@ -299,6 +299,9 @@ func StartEtcd(inCfg *Config) (e *Etcd, err error) { | |||
zap.Strings("listen-metrics-urls", e.cfg.getMetricsURLs()), | |||
) | |||
serving = true | |||
|
|||
e.Server.SyncLearnerPromotionIfNeeded() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing cluster membership after the apply loop has been started seems like a bad idea. My expectation was that we shouldn't modify cluster state outside apply loop. Any reason to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expectation is good, but not feasible.
I already clarified it in the comment for the method. We must wait for etcd finishes replaying the WAL records.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We must wait for etcd finishes replaying the WAL records.
Yes, we can also do that in apply loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can also do that in apply loop.
If we intentionally issue a member promotion request, then only one member should do it. Then we must wait for leader gets elected?
I really hate the leader's privilege mode; it causes some uncertainties, i.e. leader change etc. I agree that doing this outside of apply loop isn't ideal, but actually it's safe for this specific case. And we would never let such code/implementation go into main branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not proposing to do a member promotion request.
Just a local patching of v3store:
- When bootstrapping read v2 and v3, if there is a unpromoted member in v3 that is promoted in v2, we can mark this member in v3 as promoted. This is correct because it member can only be promoted once and v2 has always older data than v3.
- To handle Promote Member in WAL, depending on etcd version:
- For v3.5, it should just work with fixes for PromoteMember. This should work as v2 store is the source of truth.
- For v3.6, when bootstrapping we read WAL since v2 snapshot, if we find any MemberPromote, just mark it as promoted in v3. This needs to be done before we start raft as v3 store should be source of truth.
Proposed actions are based on assumption (would need to confirm as I might remember it wrongly), that in v3.6 the v3 store is source of truth of membership even when raft is just started and even during replaying of WAL entries between v2 snapshot and v3 consistent index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- For v3.5, it should just work with fixes for PromoteMember. This should work as v2 store is the source of truth.
Normally it won't work because the index might be < consistent_index. It means that we need to make an exception for learner promotion request, apply it for v3store anyway no matter shouldApplyV3
is true or not.
More important is that it may cause data inconsistence in mix version case, i.e. 3.5.19 vs 3.5.20. In 3.5.19 it won't apply, but 3.5.20 apply, then they may end of having different consisitent index.
in v3.6 the v3 store is source of truth of membership even when raft is just started and even during replaying of WAL entries between v2 snapshot and v3 consistent index.
It's true. But still I don't think we should do any such hacky workaround in 3.6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally it won't work because the index might be < consistent_index. It means that we need to make an exception for learner promotion request, apply it for v3store anyway no matter shouldApplyV3 is true or not.
Right, forgot about it.
More important is that it may cause data inconsistence in mix version case, i.e. 3.5.19 vs 3.5.20. In 3.5.19 it won't apply, but 3.5.20 apply, then they may end of having different consisitent index.
Not sure how. We just override the field without bumping CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's true. But still I don't think we should do any such hacky workaround in 3.6.
Heh, would love to have robustness tests work for membership information. It's just hard to test non linearizable requests and dynamic clusters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how. We just override the field without bumping CI.
You are also creating the second big exception: use LookOutsideApply
in apply. Also you won't advance the consistent_index even in normal case (promoting a new learner which is added in 3.5.20).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned in #19557 (comment), doing such hacky auto-sync in 3.6 is almost infeasible.
I am updating the e2e test right now, still following the existing approach...
v2Members, _ := membersFromStore(c.lg, c.v2store) | ||
v3Members, _ := membersFromBackend(c.lg, c.be) | ||
|
||
for id, v2Member := range v2Members { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For v3.6 we care about v3membership information. If for some reason v2 and v3 has different members (this is what happen in previous attempt to make v3 authoritative), it's safer to check v3 members.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is for 3.5 instead of 3.6. In 3.5, v2store is the source of truth. Also we are syncing from v2store to v3store.
Also note that we need to check both v2 and v3store, the member ID must match. Not sure what's your concern here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we resolve this comment?
clonedV3Member := v3Member.Clone() | ||
sort.Strings(clonedV3Member.PeerURLs) | ||
|
||
if reflect.DeepEqual(clonedV2Member.RaftAttributes, clonedV3Member.RaftAttributes) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you just compare IsLearner? We don't care about other information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RaftAttribute includes both peerURLs and IsLearner, both of them should always match between v2 and v3store. I think it's more prudent to double check peerURL matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we resolve this comment?
ecc4da2
to
cc73f83
Compare
…ers between v2store and v3store Signed-off-by: Benjamin Wang <[email protected]>
cc73f83
to
e7e5358
Compare
e2e test updated, PTAL, thx |
Provide a fix for the users who have already been affected by #19557
cc @fuweid @ivanvc @jmhbnz @serathius @siyuanfoundation