Describe the bug
While running Vault 1.16.1 on OpenShift 4.12.0 (Kubernetes 1.25.0), the vault-agent-injector pod constantly reports the following error in its log:
2025-10-09T19:02:31.223Z [INFO] handler.auto-tls: Generated CA
Listening on ":8080"...
2025-10-09T19:02:31.227Z [INFO] handler: Starting handler..
2025-10-09T19:02:31.324Z [INFO] handler.certwatcher: Updated certificate bundle received. Updating certs...
2025-10-09T19:02:40.268Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:54762: read tcp 100.65.66.229:8080->10.42.137.137:54762: read: permission denied
2025-10-09T19:02:45.267Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:52396: read tcp 100.65.66.229:8080->10.42.137.137:52396: read: permission denied
2025-10-09T19:02:50.268Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:52408: read tcp 100.65.66.229:8080->10.42.137.137:52408: read: permission denied
2025-10-09T19:02:55.268Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:51538: read tcp 100.65.66.229:8080->10.42.137.137:51538: read: permission denied
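To make the pattern in these errors explicit, here is a minimal Python sketch that parses the four ERROR lines above and reports the distinct source addresses and the spacing between failures. The regular expression and the helper name `parse_handshake_errors` are illustrative assumptions, not part of Vault or vault-k8s.

```python
import re
from datetime import datetime

# The four ERROR lines copied verbatim from the injector log above.
LOG = """\
2025-10-09T19:02:40.268Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:54762: read tcp 100.65.66.229:8080->10.42.137.137:54762: read: permission denied
2025-10-09T19:02:45.267Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:52396: read tcp 100.65.66.229:8080->10.42.137.137:52396: read: permission denied
2025-10-09T19:02:50.268Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:52408: read tcp 100.65.66.229:8080->10.42.137.137:52408: read: permission denied
2025-10-09T19:02:55.268Z [ERROR] handler: http: TLS handshake error from 10.42.137.137:51538: read tcp 100.65.66.229:8080->10.42.137.137:51538: read: permission denied
"""

# Hypothetical pattern: timestamp, then the client IP of the failed handshake.
PATTERN = re.compile(
    r"^(\S+)Z \[ERROR\] handler: http: TLS handshake error from ([\d.]+):\d+:"
)


def parse_handshake_errors(log: str):
    """Return a list of (timestamp, client_ip) for each handshake error line."""
    events = []
    for line in log.splitlines():
        match = PATTERN.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
            events.append((ts, match.group(2)))
    return events


events = parse_handshake_errors(LOG)
sources = {ip for _, ip in events}
# Whole-second gaps between consecutive errors.
intervals = [
    round((later[0] - earlier[0]).total_seconds())
    for earlier, later in zip(events, events[1:])
]

print("sources:", sorted(sources))
print("intervals (s):", intervals)
```

For this excerpt, every failure comes from the single client 10.42.137.137 at a steady 5-second cadence, which suggests one periodic caller rather than random traffic. Which component that caller is cannot be determined from the log alone.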
To Reproduce
Steps to reproduce the behavior:
- Deploy a recent release of Vault on OpenShift
- Observe the health status of the Kubernetes entities
- Observe that the vault-agent-injector pod logs the TLS handshake errors shown above and does not become ready
Expected behavior
- Deploy a recent release of Vault on OpenShift
- Observe the health status of the Kubernetes entities
- All components report a healthy status
Environment:
- Vault Server Version: 1.16.1
- Vault CLI Version: Vault v1.16.1 (6b59867), built 2024-04-03T12:35:53Z
- Server Operating System/Architecture:
cat /etc/*release
Fedora release 37 (Thirty Seven)
NAME="Fedora Linux"
VERSION="37.20230322.3.0 (CoreOS)"
ID=fedora
VERSION_ID=37
VERSION_CODENAME=""
PLATFORM_ID="platform:f37"
PRETTY_NAME="Fedora CoreOS 37.20230322.3.0"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:37"
HOME_URL="https://getfedora.org/coreos/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-coreos/"
SUPPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
BUG_REPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=37
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=37
SUPPORT_END=2023-11-14
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='37.20230322.3.0'
Fedora release 37 (Thirty Seven)
Fedora release 37 (Thirty Seven)
Vault server configuration file(s):
extraconfig-from-values.hcl: |-
  ui = true
  cluster_name = "vault-integrated-storage"
  listener "tcp" {
    address = "[::]:8200"
    cluster_address = "[::]:8201"
    tls_cert_file = "/vault/userconfig/tls-server/tls.crt"
    tls_key_file = "/vault/userconfig/tls-server/tls.key"
  }
  storage "raft" {
    path = "/vault/data"
    retry_join {
      leader_api_addr = "https://vault-0.vault-internal:8200"
      leader_ca_cert_file = "/vault/userconfig/tls-ca/ca.crt"
      leader_client_cert_file = "/vault/userconfig/tls-server/tls.crt"
      leader_client_key_file = "/vault/userconfig/tls-server/tls.key"
    }
    retry_join {
      leader_api_addr = "https://vault-1.vault-internal:8200"
      leader_ca_cert_file = "/vault/userconfig/tls-ca/ca.crt"
      leader_client_cert_file = "/vault/userconfig/tls-server/tls.crt"
      leader_client_key_file = "/vault/userconfig/tls-server/tls.key"
    }
  }
  disable_mlock = true
vault-agent-injector Deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vault-agent-injector
  labels:
    app.kubernetes.io/name: vault-agent-injector
    app.kubernetes.io/instance: vault
    app.kubernetes.io/managed-by: Helm
    component: webhook
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: vault-agent-injector
      app.kubernetes.io/instance: vault
      component: webhook
  template:
    metadata:
      labels:
        app.kubernetes.io/name: vault-agent-injector
        app.kubernetes.io/instance: vault
        component: webhook
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app.kubernetes.io/name: vault-agent-injector
                  app.kubernetes.io/instance: "vault"
                  component: webhook
              topologyKey: kubernetes.io/hostname
      serviceAccountName: "vault-agent-injector"
      hostNetwork: false
      containers:
        - name: sidecar-injector
          image: "hashicorp/vault-k8s:1.7.0"
          imagePullPolicy: "IfNotPresent"
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
          env:
            - name: AGENT_INJECT_LISTEN
              value: :8080
            - name: AGENT_INJECT_LOG_LEVEL
              value: info
            - name: AGENT_INJECT_VAULT_ADDR
              value: https://vault.default.svc:8200
            - name: AGENT_INJECT_VAULT_AUTH_PATH
              value: auth/kubernetes
            - name: AGENT_INJECT_VAULT_IMAGE
              value: "hashicorp/vault:1.20.4"
            - name: AGENT_INJECT_TLS_AUTO
              value: vault-agent-injector-cfg
            - name: AGENT_INJECT_TLS_AUTO_HOSTS
              value: vault-agent-injector-svc,vault-agent-injector-svc.default,vault-agent-injector-svc.default.svc
            - name: AGENT_INJECT_LOG_FORMAT
              value: standard
            - name: AGENT_INJECT_REVOKE_ON_SHUTDOWN
              value: "false"
            - name: AGENT_INJECT_CPU_REQUEST
              value: "250m"
            - name: AGENT_INJECT_CPU_LIMIT
              value: "500m"
            - name: AGENT_INJECT_MEM_REQUEST
              value: "64Mi"
            - name: AGENT_INJECT_MEM_LIMIT
              value: "128Mi"
            - name: AGENT_INJECT_DEFAULT_TEMPLATE
              value: "map"
            - name: AGENT_INJECT_TEMPLATE_CONFIG_EXIT_ON_RETRY_FAILURE
              value: "true"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          args:
            - agent-inject
            - 2>&1
          livenessProbe:
            httpGet:
              path: /health/ready
              port: 8080
              scheme: HTTPS
            failureThreshold: 2
            initialDelaySeconds: 5
            periodSeconds: 2
            successThreshold: 1
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8080
              scheme: HTTPS
            failureThreshold: 2
            initialDelaySeconds: 5
            periodSeconds: 2
            successThreshold: 1
            timeoutSeconds: 5
          startupProbe:
            httpGet:
              path: /health/ready
              port: 8080
              scheme: HTTPS
            failureThreshold: 12
            initialDelaySeconds: 5
            periodSeconds: 5
            successThreshold: 1
            timeoutSeconds: 5
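The probe settings in this manifest can be turned into a rough timing budget. The sketch below is an approximation only: the helper name `startup_budget_seconds` is hypothetical, and the formula ignores probe timeouts and kubelet scheduling jitter. Kubernetes holds back liveness and readiness probes until the startup probe has succeeded, so while the pod is starting, the startup probe's `periodSeconds: 5` is the only probe interval in play. That matches the roughly 5-second spacing of the handshake errors in the log, although this alone does not prove the probes are the failing client.

```python
def startup_budget_seconds(initial_delay: int, period: int, failure_threshold: int) -> int:
    """Approximate seconds before the kubelet gives up on a startup probe.

    Simplified model: the first probe fires after `initial_delay`, then the
    container is allowed `failure_threshold` attempts spaced `period` apart.
    """
    return initial_delay + period * failure_threshold


# Values copied from the startupProbe section of the Deployment above.
startup_probe = {
    "initialDelaySeconds": 5,
    "periodSeconds": 5,
    "failureThreshold": 12,
}

budget = startup_budget_seconds(
    startup_probe["initialDelaySeconds"],
    startup_probe["periodSeconds"],
    startup_probe["failureThreshold"],
)
print(f"approximate startup budget: {budget} s")
```

Under this model the container has about 65 seconds to pass its first startup probe before it is restarted.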
Additional context
What did not help:
- assigning the anyuid SCC to the vault-agent-injector and running under the stock securityContext:
  securityContext:
    runAsNonRoot: true
    runAsGroup: 1000
    runAsUser: 100
    fsGroup: 1000
- assigning the privileged SCC to the vault-agent-injector and updating the deployment, while running under the stock securityContext
- changing the vault-agent-injector port to 18080
What actually did help, but cannot be considered a workaround:
- switching SELinux to Permissive on the node where the pod was scheduled: the pod immediately became ready/healthy.