Vault Agent Injector TLS handshake error after 24 hours #764

@aaron-david-hughes

Description

The Vault Agent Injector starts logging "handler: http: TLS handshake error from <redacted>:41106: remote error: tls: bad certificate" after more than 24 hours without handling an injection request. Injection worked successfully before that. Redeploying the injector clears the error, but that only resets the certificate, so it is not a desirable fix.

Logs:

2025-05-02T15:09:39.611Z [INFO]  handler.auto-tls: Generated CA
2025-05-02T15:09:39.612Z [INFO]  handler: Starting handler..
Listening on ":8080"...
2025-05-02T15:09:39.711Z [INFO]  handler.certwatcher: Updated certificate bundle received. Updating certs...
2025-05-02T15:54:40.072Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T16:34:04.497Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T16:52:35.893Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T16:55:07.757Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T16:56:07.431Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T16:57:17.234Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T16:58:01.125Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T17:54:43.953Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T17:56:45.772Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T17:57:14.472Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T17:58:11.166Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T17:59:37.082Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T19:26:09.929Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-02T20:32:18.246Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T10:45:34.032Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T10:47:04.115Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T10:50:26.099Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T10:53:02.145Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T10:59:28.404Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T11:04:46.738Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T11:24:15.372Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T11:27:50.644Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T11:29:18.786Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T11:37:37.212Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-03T11:39:42.947Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-05T22:13:27.513Z [ERROR] handler: http: TLS handshake error from <redacted>:41106: remote error: tls: bad certificate
2025-05-05T22:17:20.390Z [ERROR] handler: http: TLS handshake error from <redacted>:56012: remote error: tls: bad certificate
2025-05-05T22:18:48.116Z [ERROR] handler: http: TLS handshake error from <redacted>:35240: remote error: tls: bad certificate
2025-05-05T22:20:03.765Z [ERROR] handler: http: TLS handshake error from <redacted>:49792: remote error: tls: bad certificate
2025-05-05T22:21:23.658Z [ERROR] handler: http: TLS handshake error from <redacted>:36806: remote error: tls: bad certificate
2025-05-05T22:25:23.233Z [ERROR] handler: http: TLS handshake error from <redacted>:58518: remote error: tls: bad certificate
2025-05-05T22:27:20.120Z [ERROR] handler: http: TLS handshake error from <redacted>:43968: remote error: tls: bad certificate
2025-05-05T22:32:30.611Z [ERROR] handler: http: TLS handshake error from <redacted>:49864: remote error: tls: bad certificate
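
To make the timeline in the log above explicit, here is a small Python sketch that measures how old the generated CA was, and how long the injector had been idle, when the first handshake error appeared. It uses three lines copied from the log; the function names (`parse_ts`, `timeline`) are illustrative only and not part of vault-k8s.

```python
from datetime import datetime

# Three lines copied from the log above: CA generation, the last successful
# request, and the first TLS handshake error.
LOG_EXCERPT = """\
2025-05-02T15:09:39.611Z [INFO]  handler.auto-tls: Generated CA
2025-05-03T11:39:42.947Z [INFO]  handler: Request received: Method=POST URL=/mutate?timeout=30s
2025-05-05T22:13:27.513Z [ERROR] handler: http: TLS handshake error from <redacted>:41106: remote error: tls: bad certificate
"""


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    fields = line.split()
    if not fields:
        return None
    try:
        return datetime.strptime(fields[0], "%Y-%m-%dT%H:%M:%S.%fZ")
    except ValueError:
        # Lines such as 'Listening on ":8080"...' carry no timestamp.
        return None


def timeline(log_text):
    """Compute the gaps between key events in an injector log."""
    ca_generated = last_ok = first_error = None
    for line in log_text.splitlines():
        ts = parse_ts(line)
        if ts is None:
            continue
        if "Generated CA" in line and ca_generated is None:
            ca_generated = ts
        elif "Request received" in line and first_error is None:
            last_ok = ts  # keeps being overwritten until the first error
        elif "TLS handshake error" in line and first_error is None:
            first_error = ts
    return {
        "ca_to_first_error": first_error - ca_generated,
        "idle_before_first_error": first_error - last_ok,
    }


result = timeline(LOG_EXCERPT)
print("CA age at first error:  ", result["ca_to_first_error"])
print("Idle before first error:", result["idle_before_first_error"])
```

For this log, the CA was a little over 3 days 7 hours old at the first error, and the last successful request was about 2 days 10.5 hours earlier. The logs do not show any certificate update after the initial one at startup.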

To Reproduce

  1. Verify that injection works
  2. Do not deploy anything for over 24 hours
  3. Uninstall the application deployment
  4. Deploy the same application again; no init container or sidecar is injected

Application deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "cache-using-app-{{ .Release.Name }}"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: "{{ .Release.Name }}"
      version: "{{ .Release.Name }}"
  template:
    metadata:
      labels:
        app: "{{ .Release.Name }}"
        version: "{{ .Release.Name }}"
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/role: 'cache-using-app'
        vault.hashicorp.com/agent-inject-file-my-jks: my.jks
        vault.hashicorp.com/secret-volume-path-my-jks: /run/certs
        vault.hashicorp.com/agent-inject-template-my-jks: |
          {{ `
          {{- with secret "kv/certs/my.jks" -}}
            {{- if .Data.cert -}}
              {{ base64Decode .Data.cert }}
            {{- end -}}
          {{- end -}}
          ` }}
        vault.hashicorp.com/agent-inject-file-my-jks-pwd: my.jks.password-env.sh
        vault.hashicorp.com/secret-volume-path-my-jks-pwd: /run/secrets
        vault.hashicorp.com/agent-inject-template-my-jks-pwd: |
          {{ `
          {{- with secret "kv/certs/my.jks" -}}
            {{- if .Data.password -}}
              export MY_JKS_PWD={{ .Data.password }}
            {{- end -}}
          {{- end -}}
          ` }}
    spec:
      serviceAccountName: "cache-using-app"
      containers:
        - name: "{{ .Release.Name }}"
          image: "<internal image>"
          imagePullPolicy: Never
          ports:
            - containerPort: 8443

To install vault injector in this test environment:

helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault --set="injector.enabled=true"

Both deployments are listed below. The more recent one (cache-using-app-cert, 1/1) has no sidecar:

cache-using-app-cert-96855547f-dd6hc            1/1     Running            0               9h
cache-using-app-vars-64fd588bbf-lrwgl           2/2     Running            0               3d15h

Expected behaviour
Deployments made more than 24 hours apart should both be injected. The certificate should be renewed successfully even if the injector has handled no requests for 24 hours.
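
A "remote error: tls: bad certificate" means the calling side (here presumably the Kubernetes API server invoking the webhook) rejected the injector's serving certificate. One way to check whether the trusted CA is actually being rotated is to fingerprint the `caBundle` in the MutatingWebhookConfiguration before and after the 24-hour window. The Python sketch below assumes the JSON comes from `kubectl get mutatingwebhookconfiguration <name> -o json`. The sample document, its webhook name, and the placeholder bundle are made up for illustration; the real object name depends on the Helm release.

```python
import base64
import hashlib
import json


def ca_bundle_fingerprints(webhook_config_json):
    """Map each webhook name to the SHA-256 hex digest of its decoded caBundle."""
    config = json.loads(webhook_config_json)
    fingerprints = {}
    for webhook in config.get("webhooks", []):
        bundle_b64 = webhook.get("clientConfig", {}).get("caBundle", "")
        bundle = base64.b64decode(bundle_b64)  # caBundle is base64-encoded PEM
        fingerprints[webhook["name"]] = hashlib.sha256(bundle).hexdigest()
    return fingerprints


# Hypothetical sample; a real caBundle would hold a PEM-encoded CA certificate.
SAMPLE_CONFIG = json.dumps({
    "kind": "MutatingWebhookConfiguration",
    "webhooks": [
        {
            "name": "example.webhook.name",  # placeholder, not the real name
            "clientConfig": {
                "caBundle": base64.b64encode(b"placeholder-ca-pem").decode("ascii"),
            },
        }
    ],
})

for name, digest in ca_bundle_fingerprints(SAMPLE_CONFIG).items():
    print(name, digest)
```

If the fingerprint is unchanged across the window while the injector has restarted or regenerated its CA, the API server may still be trusting a stale bundle. That would be consistent with the handshake errors above, though it is only one possible cause.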

Environment

  • kubernetes: v1.31.2+k3s1
  • vault-k8s version: 1.6.2

Similar Issues Raised

Labels: bug
