ETCD-727: Rebase etcd 3.5.21 openshift 4.18 #325

Open: wants to merge 60 commits into base: openshift-4.18

Conversation

Elbehery

This PR rebases etcd 3.5.21 into openshift-4.18.

running make locally

SUCCESS: etcd_build (GOARCH=arm64)
./bin/etcd --version
etcd Version: 3.5.21
Git SHA: f0d174d71
Go Version: go1.23.7
Go OS/Arch: darwin/arm64
./bin/etcdctl version
etcdctl version: 3.5.21
API version: 3.5
./bin/etcdutl version
etcdutl version: 3.5.21
API version: 3.5

cc @openshift/openshift-team-etcd @sdodson

wilsonwang371 and others added 30 commits January 25, 2025 22:36
Backport tools/benchmark/cmd/txn_mixed.go from commit
79b2777, pull request etcd-io#13038.

Co-authored-by: Ivan Valdes <[email protected]>
Signed-off-by: Ivan Valdes <[email protected]>
…-txn-mixed

[3.5] backport: tools: add mixed read-write performance evaluation scripts
[3.5] Add more info into log for learner member operations
[3.5] Bump `go.mongodb.org/mongo-driver` and `github.com/golang/glog` to address two CVEs
Signed-off-by: Marcel Franca <[email protected]>
…e-3.5

[3.5] Update golang toolchain to 1.22.12
The compaction behavior was changed in commit
[02635](etcd-io@0263597), which introduced a latency issue.
To be more specific, `ticker.C` acts as a fixed timer that fires every 10ms, regardless of how long each compaction batch takes.
This means that if a compaction batch takes longer than 10ms, the next batch starts immediately, making compaction a blocking operation for etcd.

To fix the issue, this commit reverts compaction to the previous behavior, which ensures a 10ms delay between compaction batches and allows other read and write operations to proceed smoothly.

Signed-off-by: Miancheng Lin <[email protected]>
…e-latency

[release-3.5] Fix a performance regression due to uncertain compaction sleep interval
Signed-off-by: joshjms <[email protected]>

change go directive to 1.23

Signed-off-by: joshjms <[email protected]>
…-3.5

[3.5] Update golang toolchain to 1.23.6
…org-x-crypto-to-0.35.0

[3.5] dependency: Bump golang.org/x/crypto from v0.32.0 to v0.35.0
Signed-off-by: Ivan Valdes <[email protected]>
…org-x-net-to-v0.36.0

[3.5] dependency: bump golang.org/x/net from v0.34.0 to v0.36.0
…ry-pick-19520-to-release-3.5

[3.5] Add verify release assets GitHub workflow
We should use WithCompactPhysical to wait for compaction to finish,
because a 50ms sleep can't guarantee that compaction is done.

Based on the log, the defragment finished before the compaction, which is
not expected. This patch makes sure that compaction has finished
before the assertion.

```
=========================== defragment ==============
    logger.go:146: 2025-02-27T07:57:18.652Z	INFO	m0	finished defragment	{"member": "m0"}
=====================================================

    logger.go:146: 2025-02-27T07:57:18.652Z	INFO	m0	grpc service status changed	{"member": "m0", "service": "", "status": "SERVING"}
    logger.go:146: 2025-02-27T07:57:18.653Z	INFO	grpc	[[core] [Channel etcd-io#1457]Channel Connectivity change to SHUTDOWN]
    logger.go:146: 2025-02-27T07:57:18.653Z	INFO	grpc	[[core] [Channel etcd-io#1457]Closing the name resolver]
    logger.go:146: 2025-02-27T07:57:18.653Z	INFO	grpc	[[core] [Channel etcd-io#1457]ccBalancerWrapper: closing]
    logger.go:146: 2025-02-27T07:57:18.653Z	INFO	grpc	[[core] [Channel etcd-io#1457 SubChannel etcd-io#1458]Subchannel Connectivity change to SHUTDOWN]
    logger.go:146: 2025-02-27T07:57:18.653Z	INFO	grpc	[[core] [Channel etcd-io#1457 SubChannel etcd-io#1458]Subchannel deleted]
    logger.go:146: 2025-02-27T07:57:18.654Z	INFO	grpc	[[transport] [client-transport 0xc00237bd48] Closing: rpc error: code = Canceled desc = grpc: the client connection is closing]
    logger.go:146: 2025-02-27T07:57:18.654Z	INFO	grpc	[[transport] [client-transport 0xc00237bd48] loopyWriter exiting with error: rpc error: code = Canceled desc = grpc: the client connection is closing]
    logger.go:146: 2025-02-27T07:57:18.654Z	INFO	grpc	[[transport] [server-transport 0xc0006bb520] Closing: EOF]
    logger.go:146: 2025-02-27T07:57:18.654Z	INFO	grpc	[[core] [Channel etcd-io#1457]Channel deleted]
    logger.go:146: 2025-02-27T07:57:18.654Z	INFO	grpc	[[transport] [server-transport 0xc0006bb520] loopyWriter exiting with error: transport closed by client]
    hash.go:82:
        	Error Trace:	/home/prow/go/src/github.com/etcd-io/etcd/server/storage/mvcc/testutil/hash.go:82
        	            				/home/prow/go/src/github.com/etcd-io/etcd/server/storage/mvcc/testutil/hash.go:44
        	Error:      	Not equal:
        	            	expected: testutil.KeyValueHash{Hash:0x94694091, CompactRevision:1278, Revision:2507}
        	            	actual  : testutil.KeyValueHash{Hash:0x58af47dc, CompactRevision:2488, Revision:2507}

        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1,4 +1,4 @@
        	            	 (testutil.KeyValueHash) {
        	            	- Hash: (uint32) 2489925777,
        	            	- CompactRevision: (int64) 1278,
        	            	+ Hash: (uint32) 1487882204,
        	            	+ CompactRevision: (int64) 2488,
        	            	  Revision: (int64) 2507
        	Test:       	TestCompactionHash
        	Messages:   	hashes do not match on rev 2488
    cluster.go:1423: ========= Cluster termination started =====================
    logger.go:146: 2025-02-27T07:57:18.655Z	INFO	grpc	[[core] [Channel etcd-io#1377]Channel Connectivity change to SHUTDOWN]
    logger.go:146: 2025-02-27T07:57:18.655Z	INFO	grpc	[[core] [Channel etcd-io#1377]Closing the name resolver]
    logger.go:146: 2025-02-27T07:57:18.655Z	INFO	grpc	[[core] [Channel etcd-io#1377]ccBalancerWrapper: closing]
    logger.go:146: 2025-02-27T07:57:18.656Z	INFO	grpc	[[core] [Channel etcd-io#1377 SubChannel etcd-io#1378]Subchannel Connectivity change to SHUTDOWN]
    logger.go:146: 2025-02-27T07:57:18.656Z	INFO	grpc	[[core] [Channel etcd-io#1377 SubChannel etcd-io#1378]Subchannel deleted]
    logger.go:146: 2025-02-27T07:57:18.656Z	INFO	grpc	[[transport] [client-transport 0xc0025f2248] Closing: rpc error: code = Canceled desc = grpc: the client connection is closing]
    logger.go:146: 2025-02-27T07:57:18.656Z	INFO	grpc	[[transport] [client-transport 0xc0025f2248] loopyWriter exiting with error: rpc error: code = Canceled desc = grpc: the client connection is closing]
    logger.go:146: 2025-02-27T07:57:18.656Z	INFO	grpc	[[transport] [server-transport 0xc000b56ea0] Closing: EOF]

=========================== compaction ==============
    logger.go:146: 2025-02-27T07:57:18.656Z	INFO	m0	finished scheduled compaction	{"member": "m0", "compact-revision": 2488, "took": "129.590742ms", "hash": 2489925777, "current-db-size-bytes": 40960, "current-db-size": "41 kB", "current-db-size-in-use-bytes": 40960, "current-db-size-in-use": "41 kB"}
=====================================================
```

Fixes: etcd-io#19497

Signed-off-by: Wei Fu <[email protected]>
Signed-off-by: Ivan Valdes <[email protected]>
…ry-pick-19538-to-release-3.5

[release-3.5] deflakey: TestCompactionHash in integration
ahrtr and others added 17 commits March 20, 2025 08:32
[release-3.5] Add e2e test to verify etcd is able to automatically fix the issue
Signed-off-by: Ivan Valdes <[email protected]>
…com-golang-jwt-jwt-v4

[release-3.5] dependency: bump github.com/golang-jwt/jwt/v4 from 4.5.1 to 4.5.2
Addresses CVE-2025-22870 and CVE-2025-22872.

Signed-off-by: Ivan Valdes <[email protected]>
…-net-to-v0.37.0

[release-3.5] dependency: bump golang.org/x/net from v0.36.0 to v0.38.0
Signed-off-by: Ivan Valdes <[email protected]>
…tore

This PR adds the notion of a cluster ID to the initial cluster discovery process.
This allows us to automatically archive a data directory when we detect that the cluster identifier has changed.
The cluster identifier only changes when we are running a restore operation.

The detection requires that the revision.json (created by the revision
monitor sidecar) contains the cluster ID. The cluster identifier is also
stored in the local WAL, which is much more expensive to parse. We only
fall back to the WAL when we could not get the cluster ID from the
revision.json for any reason. Otherwise the WAL stays untouched, and no
repair operations are attempted when it is found to be corrupted.
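
A minimal sketch of this lookup order follows. It is not the code from this PR: the `clusterId` JSON key, the function names, and the `readWAL` callback are assumptions made for illustration, since the actual revision.json layout is not shown here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// revisionFile mirrors only the field this sketch needs. The "clusterId" key
// is an assumption; the real revision.json layout is not shown in the source.
type revisionFile struct {
	ClusterID uint64 `json:"clusterId"`
}

// clusterIDFromRevision extracts the cluster ID from revision.json content.
func clusterIDFromRevision(data []byte) (uint64, error) {
	var rf revisionFile
	if err := json.Unmarshal(data, &rf); err != nil {
		return 0, err
	}
	if rf.ClusterID == 0 {
		return 0, errors.New("cluster id missing from revision.json")
	}
	return rf.ClusterID, nil
}

// localClusterID prefers the cheap revision.json lookup and calls the
// expensive WAL reader (hypothetical callback) only if that lookup fails.
func localClusterID(revisionJSON []byte, readWAL func() (uint64, error)) (uint64, error) {
	if id, err := clusterIDFromRevision(revisionJSON); err == nil {
		return id, nil
	}
	return readWAL()
}

// shouldArchive reports whether the local data directory belongs to a
// different cluster than the one found during discovery.
func shouldArchive(localID, discoveredID uint64) bool {
	return localID != discoveredID
}

func main() {
	wal := func() (uint64, error) { return 7, nil } // stand-in for WAL parsing
	id, err := localClusterID([]byte(`{"clusterId": 5}`), wal)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("local id:", id, "archive:", shouldArchive(id, 9))
}
```

Passing the WAL reader as a callback keeps the expensive path out of the common case, so the WAL is never opened when revision.json already provides the ID.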

Signed-off-by: Thomas Jungblut <[email protected]>
Konflux is replacing RH's internal build system OSBS. OSBS
supported a build-time dependency injection system called
"cachito". Konflux replaces this with "cachi2" which works
differently. REMOTE_SOURCES no longer need to be copied
into place and there is no need to source cachito's environment
information (Konflux automatically rewrites the Dockerfile
to source cachi2/cachi2.env before running the original
RUN commands).
Additionally, cachito appears to have provided go.sum
dependencies whereas cachi2 requires all build-time
dependencies in go.mod. Missing dependencies are added
to go.mod as // indirect in this change.
force-new-cluster seems to have similar watch cache issues to the
ordinary snapshot restore. This PR introduces the already existing etcdutl logic
into the server-side code as a separate package.

This only introduces a revbump flag, but under the hood it implements
both revision bumping and compaction markers.

Signed-off-by: Thomas Jungblut <[email protected]>
This adds the min and max TLS version support from etcd-io#13506 and etcd-io#15156 to the grpc proxy.

Signed-off-by: Thomas Jungblut <[email protected]>
The compaction behavior was changed in commit
[02635](etcd-io@0263597), which introduced a latency issue.
To be more specific, `ticker.C` acts as a fixed timer that fires every 10ms, regardless of how long each compaction batch takes.
This means that if a compaction batch takes longer than 10ms, the next batch starts immediately, making compaction a blocking operation for etcd.

To fix the issue, this commit reverts compaction to the previous behavior, which ensures a 10ms delay between compaction batches and allows other read and write operations to proceed smoothly.

Signed-off-by: Miancheng Lin <[email protected]>
@Elbehery
Author

/payload 4.18 nightly informing

@Elbehery
Author

/payload 4.18 nightly blocking

openshift-ci bot commented Mar 30, 2025

@Elbehery: trigger 64 job(s) of type informing for the nightly release of OCP 4.18

  • periodic-ci-openshift-release-master-nightly-4.18-e2e-agent-compact-fips
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-agent-ha-dualstack-conformance
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-agent-single-node-ipv6
  • periodic-ci-openshift-release-master-nightly-4.18-console-aws
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.18-periodics-e2e-aws
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-csi
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-cgroupsv2
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-fips
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-csi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-serial
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.18-upgrade-from-stable-4.17-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade-out-of-change
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-upi
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.18-periodics-e2e-azure
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-azure-csi
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-upgrade-out-of-change
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.18-periodics-e2e-gcp
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-gcp-ovn-csi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-gcp-ovn-rt
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-gcp-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-bm-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-dualstack
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-dualstack-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-ipv6-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-serial-ipv4
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-serial-virtualmedia
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-upgrade-from-stable-4.17-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-serial-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-serial-ovn-dualstack
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.18-upgrade-from-stable-4.17-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ovn-assisted
  • periodic-ci-openshift-release-master-nightly-4.18-metal-ovn-single-node-recert-cluster-rename
  • periodic-ci-openshift-osde2e-main-nightly-4.18-osd-aws
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-osd-ccs-gcp
  • periodic-ci-openshift-osde2e-main-nightly-4.18-osd-gcp
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-proxy
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ovn-single-node-live-iso
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-telco5g
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-csi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-serial
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi-serial
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-static-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/f73eacc0-0d2b-11f0-9f90-1c0d9dcdba2b-0

openshift-ci bot commented Mar 30, 2025

@Elbehery: trigger 11 job(s) of type blocking for the nightly release of OCP 4.18

  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-fips-payload-scan
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade
  • periodic-ci-openshift-hypershift-release-4.18-periodics-e2e-aws-ovn-conformance
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-bm
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/fb5d0ea0-0d2b-11f0-835b-ada90e26dc73-0

@openshift-ci openshift-ci bot requested review from deads2k and hexfusion March 30, 2025 05:58

openshift-ci bot commented Mar 30, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Elbehery

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the "approved" label (Indicates a PR has been approved by an approver from all required OWNERS files.) Mar 30, 2025
@Elbehery Elbehery changed the title from "Rebase etcd 3.5.21 openshift 4.18" to "ETCD-727: Rebase etcd 3.5.21 openshift 4.18" Mar 30, 2025
@openshift-ci-robot openshift-ci-robot added the "jira/valid-reference" label (Indicates that this PR references a valid Jira ticket of any type.) Mar 30, 2025
@openshift-ci-robot

openshift-ci-robot commented Mar 30, 2025

@Elbehery: This pull request references ETCD-727 which is a valid jira issue.

In response to this:

This PR rebases etcd 3.5.21 into openshift-4.18.

running make locally

SUCCESS: etcd_build (GOARCH=arm64)
./bin/etcd --version
etcd Version: 3.5.21
Git SHA: f0d174d71
Go Version: go1.23.7
Go OS/Arch: darwin/arm64
./bin/etcdctl version
etcdctl version: 3.5.21
API version: 3.5
./bin/etcdutl version
etcdutl version: 3.5.21
API version: 3.5

cc @openshift/openshift-team-etcd @sdodson

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@Elbehery
Author

/jira refresh

@openshift-ci-robot

openshift-ci-robot commented Mar 30, 2025

@Elbehery: This pull request references ETCD-727 which is a valid jira issue.

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
