New cluster not becoming healthy, all services green and running #1805

peterbosalliandercom · 2025-11-05T15:57:01Z

peterbosalliandercom
Nov 5, 2025

Everything looks fine after creating a new cluster with a config patch (Talos 1.11.1 with Kubernetes 1.34.1).
I attached one new machine, which joined, and gave it the control plane role. All services are green and running (see image), but the cluster still does not become ready. Any ideas?

We see the following weird errors, besides the normal transient ones:

[talos] controller failed {"component": "controller-runtime", "controller": "k8s.ManifestApplyController", "error": "error creating mapping for object /v1/Secret/bootstrap-token-t51urs: Get \"https://127.0.0.1:7445/api?timeout=32s\": EOF"}
05/11/2025 16:49:13
[talos] controller failed {"component": "controller-runtime", "controller": "k8s.NodeApplyController", "error": "1 error(s) occurred:\n\ttimeout"}

support (1).zip

peterbosalliandercom · 2025-11-05T16:12:41Z

peterbosalliandercom
Nov 5, 2025
Author

Also, the bootstrap manifests keep spinning (they never finish loading).

0 replies

Unix4ever · 2025-11-05T16:40:45Z

Unix4ever
Nov 5, 2025
Maintainer

You can try downloading the support bundle using omnictl support --cluster <name>. It's more stable than the UI.

Without the support bundle it's hard to tell what's going on.

1 reply

Unix4ever Nov 5, 2025
Maintainer

I checked the bundle and spotted one issue, though I'm not sure it's related.
The CA certs are added as files, but the current, recommended way is to put them in a separate config document.
See https://docs.siderolabs.com/talos/v1.11/security/certificate-authorities#appending-the-certificate-authority-ca
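For illustration, here is a minimal sketch of what such a separate document might look like, rendered from Python so it can be dropped into a config patch. It assumes the TrustedRootsConfig document kind described on the linked page; the document name custom-ca and the placeholder PEM text are hypothetical and need to be replaced with your own values.

```python
import textwrap

# Placeholder only: substitute the real PEM text of your CA certificate.
PLACEHOLDER_PEM = """-----BEGIN CERTIFICATE-----
(base64 body of your CA certificate)
-----END CERTIFICATE-----"""


def trusted_roots_document(name: str, pem_bundle: str) -> str:
    """Render a TrustedRootsConfig document as YAML text.

    The name is an arbitrary label chosen by the operator (an assumption here).
    """
    # Indent the PEM so it sits inside the YAML block scalar.
    pem = textwrap.indent(pem_bundle.strip(), "    ")
    return (
        "apiVersion: v1alpha1\n"
        "kind: TrustedRootsConfig\n"
        f"name: {name}\n"
        "certificates: |-\n"
        f"{pem}\n"
    )


if __name__ == "__main__":
    print(trusted_roots_document("custom-ca", PLACEHOLDER_PEM))
```

The output is a standalone YAML document that would be applied alongside the main machine config instead of writing the certificate through a files entry.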

rothgar · 2025-11-05T16:44:34Z

rothgar
Nov 5, 2025
Maintainer

You added a control plane node. Are there only 2 control plane nodes in the cluster? For etcd to become healthy it requires an odd number of control plane nodes.

0 replies

steverfrancis · 2025-11-05T17:13:42Z

steverfrancis
Nov 5, 2025
Maintainer

You can certainly create a two control plane node cluster, or add another control plane node to a single control plane, and Omni makes it healthy. (It's not HA in any way, but it will not stop etcd being healthy.)
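The arithmetic behind this can be sketched as follows. It uses the standard majority-quorum rule for Raft-based stores such as etcd (quorum is a strict majority of members); the function names are made up for this example.

```python
def quorum(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def failures_tolerated(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(
            f"{n} member(s): quorum={quorum(n)}, "
            f"tolerates {failures_tolerated(n)} failure(s)"
        )
```

With two members the quorum is two, so the cluster is healthy as long as both are up but tolerates no failures, the same as a single member. An even member count adds no fault tolerance over the odd count below it, which is why odd sizes are usually recommended.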


0 replies

peterbosalliandercom · 2025-11-05T17:58:30Z

peterbosalliandercom
Nov 5, 2025
Author

We use Harbor as an image registry in between. Could it be that containerd has changed to pulling by digest instead of by tag? It used to request, for example, pause with tag 3.10, and now it tries to fetch the digest.

As you can see, the request by sha is not allowed and returns 401:

/v2/registry.k8s.io/pause/manifests/sha256:ee6521f290b2168b6e0935a181d4cff9be1ac3f505666ef0e3c98fae8199917a?ns=registry.k8s.io HTTP/1.1" 401 152 "-" "containerd/v2.1.4"
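To check how many of the failing requests are digest pulls rather than tag pulls, a rough sketch like the one below could be run over the proxy log. The regular expression is an assumption based only on the single log line above, and the tag-based line in the usage note is a made-up example, so both may need adjusting for the real log format.

```python
import re

# Matches the request path, the manifest reference and the HTTP status code.
# Assumption: every line has the same shape as the one quoted in this thread.
PATTERN = re.compile(
    r'/v2/(?P<repo>.+)/manifests/(?P<ref>[^?\s]+)\S*\s+HTTP/[\d.]+"\s+(?P<status>\d{3})'
)


def classify(line: str):
    """Return repository, reference kind (digest or tag) and status, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    ref = match.group("ref")
    kind = "digest" if ref.startswith("sha256:") else "tag"
    return {
        "repository": match.group("repo"),
        "reference": ref,
        "reference_kind": kind,
        "status": int(match.group("status")),
    }


if __name__ == "__main__":
    line = (
        "/v2/registry.k8s.io/pause/manifests/"
        "sha256:ee6521f290b2168b6e0935a181d4cff9be1ac3f505666ef0e3c98fae8199917a"
        '?ns=registry.k8s.io HTTP/1.1" 401 152 "-" "containerd/v2.1.4"'
    )
    print(classify(line))
```

A hypothetical tag-based request such as /v2/registry.k8s.io/pause/manifests/3.10?ns=registry.k8s.io would be classified as a tag, which makes it easy to see whether the 401 responses are limited to digest references.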

0 replies

peterbosalliandercom · 2025-11-06T09:38:57Z

peterbosalliandercom
Nov 6, 2025
Author

It looks like when the mirror is not configured correctly (returning 401 on image pulls), the cluster no longer bootstraps automatically. Even after correcting the mirror to allow pulls, we see this message in the log:

[talos] etcd is waiting to join the cluster, if this node is the first node in the cluster, please run talosctl bootstrap against one of the following IPs

Is there a way to reset it? Destroying and recreating the cluster makes it work again, but we do not want to do that every time, and it should not be necessary.

0 replies
