New cluster not becoming healthy, all services green and running #1805
Replies: 6 comments 1 reply
You can try downloading the support bundle using … Without the support bundle, it's hard to tell what's going on.
-
You added a control plane node. Are there only 2 control plane nodes in the cluster? For etcd to become healthy, it requires an odd number of control plane nodes.
-
You can certainly create a two control plane node cluster, or add another
control plane node to a single control plane, and Omni makes it healthy.
(It's not HA in any way, but it will not stop etcd being healthy.)
On Wed, Nov 5, 2025 at 8:46 AM Justin Garrison wrote:
You added a control plane node. Are there only 2 control plane nodes in
the cluster? For etcd to become healthy it requires an odd number of
control plane nodes.
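For reference, the quorum arithmetic behind this exchange can be sketched in a few lines. This is only an illustration of etcd's majority rule (quorum is floor(n/2) + 1 voting members); `etcd_quorum` is a hypothetical helper name, not part of Talos or Omni.

```python
def etcd_quorum(members: int) -> tuple[int, int]:
    """Return (quorum, tolerated_failures) for `members` voting etcd members."""
    if members < 1:
        raise ValueError("an etcd cluster needs at least one member")
    quorum = members // 2 + 1          # strict majority of the members
    return quorum, members - quorum    # members that may fail while keeping quorum


if __name__ == "__main__":
    for n in range(1, 6):
        quorum, tolerated = etcd_quorum(n)
        print(f"{n} member(s): quorum={quorum}, tolerated failures={tolerated}")
```

With two members the quorum is two, so the cluster can be healthy while both are up but tolerates no failures, which matches the "not HA in any way" remark.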
-
We use Harbor as an intermediate image registry. Could it be that containerd has changed to pulling by digest instead of by tag? We used to see requests for, for example, pause with tag 3.10, and now it tries to fetch the digest. As you can see, the sha256 request is not allowed and returns a 401:
/v2/registry.k8s.io/pause/manifests/sha256:ee6521f290b2168b6e0935a181d4cff9be1ac3f505666ef0e3c98fae8199917a?ns=registry.k8s.io HTTP/1.1" 401 152 "-" "containerd/v2.1.4"
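To check which manifest requests in a registry access log are by digest and which are by tag, something like the following sketch may help. It assumes the standard registry manifest path layout `/v2/<name>/manifests/<reference>` and relies on the fact that a tag cannot contain a colon while a digest has the form `algorithm:hex`. `classify_manifest_request` is a hypothetical helper, not a Harbor or containerd API.

```python
import re

# Registry manifest endpoint: /v2/<name>/manifests/<reference>
MANIFEST_PATH = re.compile(r"^/v2/(?P<name>.+)/manifests/(?P<reference>[^/]+)$")


def classify_manifest_request(request_target: str) -> tuple[str, str, str]:
    """Return (repository name, reference, "digest" or "tag") for a request target."""
    path = request_target.partition("?")[0]   # drop the query string, e.g. ?ns=...
    match = MANIFEST_PATH.match(path)
    if match is None:
        raise ValueError(f"not a manifest request: {request_target!r}")
    reference = match.group("reference")
    # Tags cannot contain ':'; digests look like "sha256:<hex>".
    kind = "digest" if ":" in reference else "tag"
    return match.group("name"), reference, kind


if __name__ == "__main__":
    target = (
        "/v2/registry.k8s.io/pause/manifests/"
        "sha256:ee6521f290b2168b6e0935a181d4cff9be1ac3f505666ef0e3c98fae8199917a"
        "?ns=registry.k8s.io"
    )
    print(classify_manifest_request(target))
```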
-
It looks like when the mirror is not configured (giving 401 on image pulls), the cluster no longer bootstraps automatically. We see these messages in the log:
[talos] etcd is waiting to join the cluster, if this node is the first node in the cluster, please run
-
All looks well after making a new cluster with a config patch (Talos 1.11.1 with Kubernetes 1.34.1).
I attached one new machine (which joined) and gave it the control plane role. All services are green and running (see image), but the cluster is still not becoming ready. Any ideas?
We see the following weird errors, besides the normal transient ones:
[talos] controller failed {"component": "controller-runtime", "controller": "k8s.ManifestApplyController", "error": "error creating mapping for object /v1/Secret/bootstrap-token-t51urs: Get \"https://127.0.0.1:7445/api?timeout=32s\": EOF"}
05/11/2025 16:49:13
[talos] controller failed {"component": "controller-runtime", "controller": "k8s.NodeApplyController", "error": "1 error(s) occurred:\n\ttimeout"}
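When scanning many of these lines, it can help to pull out just the controller name and the error text. The sketch below assumes each line has the shape shown above (the literal prefix `[talos] controller failed ` followed by a JSON object); `parse_controller_failure` is a hypothetical helper, not a Talos tool.

```python
import json
from typing import Optional

PREFIX = "[talos] controller failed "


def parse_controller_failure(line: str) -> Optional[dict]:
    """Return the JSON payload of a 'controller failed' log line, or None."""
    if not line.startswith(PREFIX):
        return None
    return json.loads(line[len(PREFIX):])


if __name__ == "__main__":
    sample = (
        '[talos] controller failed {"component": "controller-runtime", '
        '"controller": "k8s.NodeApplyController", '
        '"error": "1 error(s) occurred:\\n\\ttimeout"}'
    )
    failure = parse_controller_failure(sample)
    if failure is not None:
        print(failure["controller"], "->", failure["error"])
```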
support (1).zip