Pods occasionally come up without injected sidecars when AKS cluster is restarted #2901
Comments
Sidecar injection uses a pod mutating webhook.
The webhook defines
That said, I would recommend trying to change the policy to fail. It might have other side effects, but it is worth trying.
When I set
which causes some (but not necessarily all) other deployments to also fail with a similar error message. When I use the additional settings from issue #1765 (excluding the namespace of the OTel operator), the above error messages don't appear for the OTel operator deployment, and the other deployments eventually start as expected with the sidecars after initially generating the above error message. I still don't understand how to configure the helm chart to include these settings.
I just found out that there is an "admission enforcer" in AKS (Azure/AKS#4002) which apparently overwrites whatever I configure for the |
Is this the same as #1329?
Yes, sounds similar. I'm doing sidecar injection instead of auto-instrumentation, but the effect seems to be the same. Plus AKS adds another layer of difficulty with its admission enforcement.
Yeah we had some discussion on that issue, and ultimately came to this conclusion. Would you mind commenting on that issue with what you expect to happen here / how we could help bubble this up better? Any opposition to me closing this issue?
@KarstenWintermann I'm going to close this issue in favor of #1329. Let me know if you have any further issues. Thank you!
Component(s)
No response
What happened?
Description
I have configured an OpenTelemetry collector sidecar and configured several pods for sidecar injection. The sidecars are always injected successfully as expected when I restart individual pods, for example by "kubectl delete pod ...".
However, when the whole AKS cluster is stopped and started (e.g. via "az aks stop ...", "az aks start ..."), occasionally some or all of the pods come up without the sidecars.
There are no relevant Kubernetes events or log messages when the pods come up without sidecars.
I would estimate that about 50% of the time, there are sidecar injections missing after a cluster restart.
Steps to Reproduce
Expected Result
Sidecars are always present in all pods configured for sidecar injection, even after cluster restart
Actual Result
Occasionally, sidecars are not present after cluster restart in the running pods
Kubernetes Version
1.28.5
Operator version
0.98.0
Collector version
0.98.0
Environment information
Environment
AKS 1.28.5
Quarkus 3.6.x
dapr.io 1.13.2
Node image: AKSUbuntu-2204gen2containerd-202402.07.0
Log output
No response
Additional context
I have noticed similar issues with dapr.io's sidecar injection. However, in dapr.io it is possible to configure a watchdog (https://docs.dapr.io/concepts/dapr-services/operator/#injector-watchdog, https://github.com/dapr/dapr/blob/c75c08f6f364620238b67cb2bfd231b3bde57c79/pkg/operator/watchdog.go) which periodically checks for pods which are annotated for sidecar injection and are missing sidecars. If any pods are found, they are deleted so that they have a chance to come up again with injected sidecar. This fixes the same issue in dapr.
I am aware of issue #1765, but that doesn't fix my issue. I have also found no way to configure the changes described there through the operator helm chart.
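To make the watchdog idea above concrete, here is a minimal sketch of the detection step only: given pod objects (as plain dicts, e.g. parsed from "kubectl get pods -A -o json"), it lists the pods that request injection but have no sidecar container. This is not how dapr or the OpenTelemetry operator implements it. The annotation key and the sidecar container name are assumptions on my side and may differ per setup, and the sketch only looks at pod-level annotations (not namespace-level ones). Deleting the reported pods is left out on purpose.

```python
from typing import Any, Dict, List

# Assumption: annotation that requests collector sidecar injection on a pod.
INJECT_ANNOTATION = "sidecar.opentelemetry.io/inject"
# Assumption: name of the container that the operator injects.
SIDECAR_CONTAINER = "otc-container"

Pod = Dict[str, Any]


def wants_sidecar(pod: Pod) -> bool:
    """True if the pod carries the injection annotation with a value other than 'false'."""
    annotations = pod.get("metadata", {}).get("annotations") or {}
    value = str(annotations.get(INJECT_ANNOTATION, "")).strip().lower()
    return value not in ("", "false")


def has_sidecar(pod: Pod) -> bool:
    """True if one of the pod's containers has the assumed sidecar name."""
    containers = pod.get("spec", {}).get("containers") or []
    return any(c.get("name") == SIDECAR_CONTAINER for c in containers)


def pods_missing_sidecar(pods: List[Pod]) -> List[str]:
    """Return 'namespace/name' for every pod that requests a sidecar but has none."""
    missing = []
    for pod in pods:
        if wants_sidecar(pod) and not has_sidecar(pod):
            meta = pod.get("metadata", {})
            missing.append(f"{meta.get('namespace', 'default')}/{meta.get('name', '')}")
    return missing
```

A periodic job could run this against the pod list after "az aks start ..." and delete the reported pods so that they are recreated and pass through the webhook again.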