Unable to access cluster due to certificate expiration with odd time #12116
I have a 4-node Raspberry Pi Talos cluster, and I can no longer access the Talos API due to what appears to be an expired certificate. However, the reported expiration date looks much too old. I think this is different from the issue in #9457, because the config info shows the certificate is still valid for another month. The Kubernetes cluster isn't accessible at all currently. Is there anything I can try to recover access to the Talos API for this cluster?
Replies: 2 comments
I found https://www.reddit.com/r/TalosLinux/comments/1mu0uxp/talosctl_commands_fail_with_tls_verification_on/, which looks like a similar issue. I lost access after all the nodes rebooted suddenly due to a power loss, but I've rebooted the nodes in the past with no issues.
Figured it out. I did what I should have done in the first place and connected a monitor to one of the nodes. The console had a bunch of errors like "time query error with server 192.168.4.1" and "kiss of death received". Apparently the nodes are using my eero router for NTP, which is a little surprising to me, since I think the default in the machine configs is to use Cloudflare and I haven't changed that. I also don't know why the router's NTP server was sending an error, but after I restarted the eero network the time query errors stopped and the talosctl certificate errors went away. Unfortunately, etcd is failing to get healthy now, but at least I have a chance to figure that out now that I can interact with the API again.
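For anyone landing here with the same symptom, here is a minimal sketch of why a wrong clock can surface as an "expired" certificate with an oddly old date. It rests on assumptions that are not confirmed in this thread: that a node with an unsynced clock issues a short-lived certificate stamped with its own (stale) time, and that the client's clock is correct. Many Raspberry Pi setups have no battery-backed clock, so a stale time after a power loss is plausible when NTP fails. All dates, the 24-hour lifetime, and the helper names `issue_window` and `check_validity` are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def issue_window(issuer_clock, lifetime):
    """Return (not_before, not_after) as stamped by the issuer's own clock."""
    return issuer_clock, issuer_clock + lifetime


def check_validity(not_before, not_after, now):
    """Classify `now` against a certificate validity window."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical values, not taken from the cluster in this thread.
client_now = datetime(2025, 9, 1, tzinfo=timezone.utc)    # correct clock
stale_clock = datetime(2020, 1, 1, tzinfo=timezone.utc)   # node without time sync
lifetime = timedelta(hours=24)                            # assumed lifetime

# Certificate issued while the node's clock is stale: the window is years old.
nb, na = issue_window(stale_clock, lifetime)
print(check_validity(nb, na, client_now))  # expired

# Certificate issued after time sync recovers: the window covers the present.
nb, na = issue_window(client_now, lifetime)
print(check_validity(nb, na, client_now))  # valid
```

In the first case the client reports an expiry date near the node's stale clock rather than anything resembling the real certificate lifetime, which would match the "much too old" date described above. Once time sync works again, the window lines up with the real time and the error disappears.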