How does Karpenter handle node unresponsiveness in cases like OOMs? #2186
Unanswered
The9thLime asked this question in Q&A
Replies: 0 comments
Hey. I have a nodepool spec like this
Once, all three nodes in this NodePool became unready and unresponsive because of OOM. They stopped responding, and the pods on them got stuck in a terminating state. Karpenter was not able to terminate those nodes (or their NodeClaims) and was unable to provision replacements. On the AWS console, I could see that those instances were in the terminating state.
I was wondering how Karpenter handles these node-level OOMs, since when something like this happens the nodes become unresponsive and end up in a zombie-like state.