Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions server/embed/etcd.go
Original file line number Diff line number Diff line change
Expand Up @@ -277,6 +277,9 @@ func StartEtcd(inCfg *Config) (e *Etcd, err error) {
return e, err
}
}

e.Server.SyncLearnerPromotionIfNeeded()
Copy link
Member

@serathius serathius Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not 100% sure about consequences of calling sync here. I don't know if we really don't run any goroutines in background at this point, and proving so might be very hard.

By proposing to run it during bootstrap I meant within bootstrap.go

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is release-3.5 isntead of main or release-3.6, there is no bootstrap.go. Note that we need to do this after corruption check,

https://github.com/ahrtr/etcd/blob/2f55a7c5144d084541114340eb058fb3a37ad9b5/server/embed/etcd.go#L260-L281

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you think this change will interact with corrupt check? Even if we do the sync before corrupt check on one member, next member will start with corrupt check before sync.

Copy link
Member Author

@ahrtr ahrtr Mar 18, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we really don't run any goroutines in background at this point, and proving so might be very hard.

We call SyncLearnerPromotionIfNeeded before starting the raft loop, before starting all the [periodically] jobs (i.e. monitorVersions, linearizableReadLoop, monitorKVHash, etc), also before starting to serve client & peer requests. It's safe. What's your concern here?

How do you think this change will interact with corrupt check?

If there is data corrupt, we should leave it as it is.

Copy link
Member

@fuweid fuweid Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking if we still need this migration.

I found there is publish API call during starting. Since we already allow to override member information, this publish can help us migration (if I understand it correctly).

Path: membership.MemberAttributesStorePath(s.id),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@serathius please let me know if you still have any concern on my previous comment. I think we should be all good now.

I am thinking if we still need this migration.

I found there is publish API call during starting. Since we already allow to override member information, this publish can help us migration (if I understand it correctly).

@fuweid I don't think we are on the same page. The publish logic is about attribute, what we are resolving/talking about in this PR (and issue 19557) is about IsLearner, which is part of RaftAttribute.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

override member information

Yes, this is true. thx for pointing out this.

Although the publish logic is only supposed to update attribute, but the implementation just overwrites the whole member info into v3store (bbolt), so it can automatically fix the issues which are affected by #19557 on top of the patch #19563

Just raised another PR with only e2e test cases #19629, which I believe is better than #19627

Copy link
Member Author

@ahrtr ahrtr Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For release-3.6, we don't need any further validation (i.e. requiring 3.5.20+ before upgrading to 3.6) any more, reasons:

  • If there are 2 or more (already promoted) learners, then etcd will crash anyway.
  • If there is only 1 (already promoted) learner, then etcd will automatically fix it due to the same reason as mentioned above.

So for release-3.6, we only need to add two e2e test cases:

Note that we still need to document the requirement of requiring 3.5.20+ before upgrading to 3.6, to avoid the potential upgrading failure (as mentioned above, etcd will crash if there are 2 or more learners).


e.Server.Start()

if err = e.servePeers(); err != nil {
Expand Down
66 changes: 64 additions & 2 deletions server/etcdserver/api/membership/cluster.go
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ import (
"errors"
"fmt"
"path"
"reflect"
"sort"
"strings"
"sync"
Expand Down Expand Up @@ -530,8 +531,22 @@ func (c *RaftCluster) PromoteMember(id types.ID, shouldApplyV3 ShouldApplyV3) {
if c.v2store != nil {
mustUpdateMemberInStore(c.lg, c.v2store, c.members[id])
}
if c.be != nil && shouldApplyV3 {
unsafeSaveMemberToBackend(c.lg, c.be, c.members[id])
if c.be != nil {
if shouldApplyV3 {
unsafeSaveMemberToBackend(c.lg, c.be, c.members[id])
} else {
// Workaround https://github.com/etcd-io/etcd/issues/19557 for the case that
// learner promotion hasn't been applied.
v3Members, _ := membersFromBackend(c.lg, c.be)
if m, ok := v3Members[id]; ok {
if m.IsLearner {
c.lg.Info("Forcibly apply member promotion", zap.String("member", fmt.Sprintf("%+v", *c.members[id])))
unsafeHackySaveMemberToBackend(c.lg, c.be, c.members[id])
} else {
c.lg.Info("Member promotion had already been correctly applied", zap.String("member", fmt.Sprintf("%+v", *c.members[id])))
}
}
}
}

c.lg.Info(
Expand All @@ -542,6 +557,53 @@ func (c *RaftCluster) PromoteMember(id types.ID, shouldApplyV3 ShouldApplyV3) {
)
}

func (c *RaftCluster) SyncLearnerPromotionIfNeeded() {
c.Lock()
defer c.Unlock()

v2Members, _ := membersFromStore(c.lg, c.v2store)
v3Members, _ := membersFromBackend(c.lg, c.be)

for id, v2Member := range v2Members {
v3Member, ok := v3Members[id]

if !ok {
c.lg.Error("Detected member only in v2store but missing in v3store", zap.String("member", fmt.Sprintf("%+v", *v2Member)))
continue
}

// A peerURL list is considered the same regardless of order.
clonedV2Member := v2Member.Clone()
sort.Strings(clonedV2Member.PeerURLs)

clonedV3Member := v3Member.Clone()
sort.Strings(clonedV3Member.PeerURLs)

if reflect.DeepEqual(clonedV2Member.RaftAttributes, clonedV3Member.RaftAttributes) {
c.lg.Info("Member's RaftAttributes is consistent between v2store and v3store", zap.String("member", fmt.Sprintf("%+v", *v2Member)))
continue
}

// Sync member iff both conditions below are true,
// 1. IsLearner is the only field in RaftAttributes that differs between v2store and v3store.
// 2. v2store.IsLearner == false && v3store.IsLearner == true.
clonedV3Member.IsLearner = false
if reflect.DeepEqual(clonedV2Member.RaftAttributes, clonedV3Member.RaftAttributes) {
syncedV3Member := v3Member.Clone()
syncedV3Member.IsLearner = false
c.lg.Warn("Syncing member in v3store", zap.String("member", fmt.Sprintf("%+v", *syncedV3Member)))
unsafeHackySaveMemberToBackend(c.lg, c.be, syncedV3Member)
c.be.ForceCommit()
continue
}

c.lg.Error("Cannot sync member in v3store due to IsLearner not being the only field that differs between v2store amd v3store",
zap.String("v2member", fmt.Sprintf("%+v", *v2Member)),
zap.String("v3member", fmt.Sprintf("%+v", *v3Member)),
)
}
}

func (c *RaftCluster) UpdateRaftAttributes(id types.ID, raftAttr RaftAttributes, shouldApplyV3 ShouldApplyV3) {
c.Lock()
defer c.Unlock()
Expand Down
160 changes: 159 additions & 1 deletion server/etcdserver/api/membership/cluster_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ import (

"github.com/coreos/go-semver/semver"
"github.com/stretchr/testify/assert"
betesting "go.etcd.io/etcd/server/v3/mvcc/backend/testing"
"github.com/stretchr/testify/require"
"go.uber.org/zap"
"go.uber.org/zap/zaptest"

Expand All @@ -33,6 +33,7 @@ import (
"go.etcd.io/etcd/raft/v3/raftpb"
"go.etcd.io/etcd/server/v3/etcdserver/api/v2store"
"go.etcd.io/etcd/server/v3/mock/mockstore"
betesting "go.etcd.io/etcd/server/v3/mvcc/backend/testing"
)

func TestClusterMember(t *testing.T) {
Expand Down Expand Up @@ -1210,3 +1211,160 @@ func TestRemoveMemberSyncsBackendAndStoreV2(t *testing.T) {
})
}
}

func TestSyncLearnerPromotion(t *testing.T) {
tcs := []struct {
name string

storeV2Members []*Member
backendMembers []*Member

expectV3Members map[types.ID]*Member
}{
{
name: "v3store should keep unchanged if IsLearner isn't the only field that differs",
storeV2Members: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.100:2380"},
IsLearner: false,
},
},
},
backendMembers: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: true,
},
},
},
expectV3Members: map[types.ID]*Member{
100: {
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: true,
},
},
},
},
{
name: "v3store should keep unchanged if IsLearner is the only field that differs but v3store.IsLearner is false",
storeV2Members: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: true,
},
},
},
backendMembers: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: false,
},
},
},
expectV3Members: map[types.ID]*Member{
100: {
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: false,
},
},
},
},
{
name: "v3store should be updated if IsLearner is the only field that differs and v3store.IsLearner is true",
storeV2Members: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: false,
},
},
},
backendMembers: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: true,
},
},
},
expectV3Members: map[types.ID]*Member{
100: {
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380"},
IsLearner: false,
},
},
},
},
{
name: "v3store should be updated if IsLearner is the only field that differs and peerURLs are in different order in v2store and v3store",
storeV2Members: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://10.0.0.9:2380", "http://127.0.0.1:2380"},
IsLearner: false,
},
},
},
backendMembers: []*Member{
{
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://127.0.0.1:2380", "http://10.0.0.9:2380"},
IsLearner: true,
},
},
},
expectV3Members: map[types.ID]*Member{
100: {
ID: 100,
RaftAttributes: RaftAttributes{
PeerURLs: []string{"http://127.0.0.1:2380", "http://10.0.0.9:2380"},
IsLearner: false,
},
},
},
},
}
for _, tc := range tcs {
t.Run(tc.name, func(t *testing.T) {
lg := zaptest.NewLogger(t)
be, _ := betesting.NewDefaultTmpBackend(t)
defer be.Close()
mustCreateBackendBuckets(be)
for _, m := range tc.backendMembers {
unsafeSaveMemberToBackend(lg, be, m)
}
be.ForceCommit()

st := v2store.New()
for _, m := range tc.storeV2Members {
mustSaveMemberToStore(lg, st, m)
}

cluster := NewCluster(lg)
cluster.SetBackend(be)
cluster.SetStore(st)

cluster.SyncLearnerPromotionIfNeeded()
v3Members, _ := cluster.MembersFromBackend()
require.Equal(t, tc.expectV3Members, v3Members)
})
}
}
16 changes: 16 additions & 0 deletions server/etcdserver/api/membership/store.go
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,22 @@ func unsafeSaveMemberToBackend(lg *zap.Logger, be backend.Backend, m *Member) er
return nil
}

// unsafeHackySaveMemberToBackend updates the member in a hacky way.
// It's only used to workaround https://github.com/etcd-io/etcd/issues/19557.
func unsafeHackySaveMemberToBackend(lg *zap.Logger, be backend.Backend, m *Member) error {
mkey := backendMemberKey(m.ID)
mvalue, err := json.Marshal(m)
if err != nil {
lg.Panic("failed to marshal member", zap.Error(err))
}

tx := be.BatchTx()
tx.LockOutsideApply()
defer tx.Unlock()
tx.UnsafePut(buckets.Members, mkey, mvalue)
return nil
}

// TrimClusterFromBackend removes all information about cluster (versions)
// from the v3 backend.
func TrimClusterFromBackend(be backend.Backend) error {
Expand Down
12 changes: 12 additions & 0 deletions server/etcdserver/server.go
Original file line number Diff line number Diff line change
Expand Up @@ -2931,3 +2931,15 @@ func maybeDefragBackend(cfg config.ServerConfig, be backend.Backend) error {
func (s *EtcdServer) CorruptionChecker() CorruptionChecker {
return s.corruptionChecker
}

// SyncLearnerPromotionIfNeeded provides a workaround for the users who have
// already been affected by https://github.com/etcd-io/etcd/issues/19557.
// It automatically syncs the v3store (bbolt) from v2store iff all the
// conditions below are true for each member,
// 1. IsLearner is the only field that differs between v2store and v3store.
// 2. v2store.IsLearner == false && v3store.IsLearner == true.
func (s *EtcdServer) SyncLearnerPromotionIfNeeded() {
s.Logger().Info("Trying to sync the learner promotion operations for v3store if needed")
s.cluster.SyncLearnerPromotionIfNeeded()
s.Logger().Info("Finished syncing the learner promotion operations for v3store")
}
8 changes: 8 additions & 0 deletions server/mvcc/backend/verify.go
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,14 @@ func verifyLockEnabled() bool {

func insideApply() bool {
stackTraceStr := string(debug.Stack())

// Exclude the case of `unsafeHackySaveMemberToBackend`, which is
// used to workaround the situation which are already affected by
// https://github.com/etcd-io/etcd/issues/19557
if strings.Contains(stackTraceStr, "unsafeHackySaveMemberToBackend") {
return false
}

return strings.Contains(stackTraceStr, ".applyEntries")
}

Expand Down
Loading