Nomad-owned sockets not accessible under SELinux by default (Task API, CSI) #26852

@tgross

When running Nomad under SELinux in enforcing mode, the Task API socket is not available, even if the appropriate labels are set in the plugin configuration. It appears this is also the case for the CSI plugin socket.

Below I've got some configuration options for enabling it via SELinux policy, but that opens a little bit of a hole. I don't think it's something we can solve in Nomad itself without risking interference with the SELinux policies that cluster admins want. (Unless we just set the socket to be container_t?) So maybe this needs to be something we document in the install/deploy guide? We'd definitely welcome the opinions of folks in the community who have a specific interest here.

example jobspec
job "curl" {

  group "group" {

    task "task" {

      driver = "docker"

      config {
        image   = "curlimages/curl:latest"
        command = "tail"
        args    = ["-f"]
      }

      # this should let you hit the Task API via alloc exec with:
      # curl --unix-socket /secrets/api.sock \
      #      -H "X-Nomad-Token: $NOMAD_TOKEN" \
      #      "http://local/v1/..."
      identity {
        env  = true
        file = true
      }

      resources {
        cpu    = 128
        memory = 256
      }

    }
  }
}

We have the volumes.selinuxlabel configuration for the Docker plugin:

selinuxlabel - Allows the operator to set a SELinux label to the allocation and task local bind-mounts to containers. If used with docker.volumes.enabled set to false, the labels will still be applied to the standard binds in the container.

Client configuration for the plugin:

plugin "docker" {
  config {
    volumes {
      enabled = true
      selinuxlabel = "z"
    }
  }
}

If you alloc exec into this allocation and curl the Task API socket, you get a permission denied error in the container:

$ nomad alloc exec 0c4d805f /bin/sh

~ $ curl -v --unix-socket /secrets/api.sock -H "X-Nomad-Token: $NOMAD_TOKEN" "http://local/v1/vars"
*   Trying /secrets/api.sock:0...
* Immediate connect fail for /secrets/api.sock: Permission denied
* Failed to connect to local port 80 after 0 ms: Could not connect to server
* closing connection #0
curl: (7) Failed to connect to local port 80 after 0 ms: Could not connect to server

And the following in your audit logs:

type=AVC msg=audit(1759168401.396:2106): avc: denied { connectto } for pid=74015 comm="curl" path="/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets/api.sock" scontext=system_u:system_r:container_t:s0:c39,c317 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0
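The fields that matter in that record are the source context, the target context, and the object class. As a minimal sketch, assuming the record is available as a single line of text, they can be pulled out with standard shell tools. The `avc_field` and `context_type` helpers are hypothetical names introduced here for illustration; they are not part of Nomad or of the SELinux tooling.

```shell
# Denial record copied verbatim from the audit log above.
avc_line='type=AVC msg=audit(1759168401.396:2106): avc: denied { connectto } for pid=74015 comm="curl" path="/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets/api.sock" scontext=system_u:system_r:container_t:s0:c39,c317 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0'

# Hypothetical helper: print the value of a KEY=value pair from an AVC record.
# Assumes values contain no spaces, which holds for the record above.
avc_field() {
  printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p" | head -n 1
}

# Hypothetical helper: the SELinux type is the third colon-separated
# component of a context (user:role:type:level).
context_type() {
  printf '%s\n' "$1" | cut -d: -f3
}

source_type=$(context_type "$(avc_field scontext "$avc_line")")
target_type=$(context_type "$(avc_field tcontext "$avc_line")")
target_class=$(avc_field tclass "$avc_line")

echo "source=$source_type target=$target_type class=$target_class"
```

Read this way, the denial is about a container_t process connecting to a unix_stream_socket whose context is unconfined_t.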

You can write to other locations in the secrets dir and the mount's Mode is configured correctly in the Docker container:

$ docker inspect 5b1ddb430132 | jq '.[0].Mounts[2]'
{
  "Type": "bind",
  "Source": "/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets",
  "Destination": "/secrets",
  "Mode": "z",
  "RW": true,
  "Propagation": "rprivate"
}

You can work around this problem by setting task.config.security_opt = ["label=disable"], but that just disables SELinux protection for the container entirely.

Here's what the resulting labels of objects in the secrets directory are:

# ls -lZ task/secrets/
total 8
srw-rw-rw-. 1 root root system_u:object_r:container_file_t:s0   0 Sep 29 13:43 api.sock
-rw-r--r--. 1  101  102 system_u:object_r:container_file_t:s0   8 Sep 29 13:45 foo.txt
-rw-r--r--. 1 root root system_u:object_r:container_file_t:s0 816 Sep 29 13:43 nomad_token

By my understanding of the SELinux policy, the container's process, which runs as container_t, should be able to connect and write to sockets labeled container_t:

$ sesearch --allow -s container_t | grep unix_stream_socket
...
allow container_t container_t:unix_stream_socket { accept append bind connect connectto create getattr getopt ioctl listen lock map read sendto setattr setopt shutdown write };
...
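One thing worth checking is whether that rule actually covers the source and target pair named in the denial. The sketch below compares the two using only text from this issue; the variable names are illustrative. The `sesearch` call at the end uses the `-s`, `-t`, `-c` and `-p` filters from setools and runs only if the tool is installed.

```shell
# Rule line copied from the sesearch output above.
rule='allow container_t container_t:unix_stream_socket { accept append bind connect connectto create getattr getopt ioctl listen lock map read sendto setattr setopt shutdown write };'

# Source and target types taken from the AVC denial in the audit log.
needed_source=container_t
needed_target=unconfined_t

# In an allow rule, field 2 is the source type and field 3 is target:class.
rule_source=$(printf '%s\n' "$rule" | awk '{print $2}')
rule_target=$(printf '%s\n' "$rule" | awk '{print $3}' | cut -d: -f1)

if [ "$rule_source" = "$needed_source" ] && [ "$rule_target" = "$needed_target" ]; then
  verdict="covered"
else
  verdict="not covered"
fi
echo "rule: $rule_source -> $rule_target, denial: $needed_source -> $needed_target ($verdict)"

# On an SELinux host with setools installed, ask the loaded policy directly.
# No output means no matching allow rule exists.
if command -v sesearch >/dev/null 2>&1; then
  sesearch --allow -s container_t -t unconfined_t -c unix_stream_socket -p connectto || true
fi
```

The rule shown permits container_t to container_t, while the denial is container_t to unconfined_t, so this particular rule does not apply to the failing connection.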

We can use audit2allow to generate a SELinux policy file that allows access to the task API socket as follows. First, in a root shell verify that we've got the correct item in the audit log:

root:/tmp# tail -1 /var/log/audit/audit.log | grep container_t
type=AVC msg=audit(1759169470.561:2169): avc:  denied  { connectto } for  pid=78079 comm="curl" path="/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets/api.sock" scontext=system_u:system_r:container_t:s0:c39,c317 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0

Generate a type enforcement policy file (which by my understanding is really only for human review). This shows that the socket was actually of type unconfined_t, which doesn't make a lot of sense to me given the output of ls -lZ above:

root:/tmp# tail -1 /var/log/audit/audit.log | grep container_t | audit2allow -m nomad_task_api > ./nomad_task_api.te

root:/tmp# cat ./nomad_task_api.te

module nomad_task_api 1.0;

require {
        type container_t;
        type unconfined_t;
        class unix_stream_socket connectto;
}
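A possible explanation for the mismatch, offered as an assumption rather than something confirmed in this issue: ls -lZ reports the label of the socket file on disk, while the connectto check is made against the label of the listening socket itself, which is normally inherited from the process that created it. If the Nomad agent runs as unconfined_t, the two labels would differ in exactly this way. The sketch below only puts the two values from this issue side by side.

```shell
# Line for api.sock copied from the ls -lZ output above.
ls_line='srw-rw-rw-. 1 root root system_u:object_r:container_file_t:s0   0 Sep 29 13:43 api.sock'

# Target context copied from the AVC denial.
avc_tcontext='unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023'

# In ls -lZ output the security context is the fifth column;
# the type is the third colon-separated component of a context.
file_type=$(printf '%s\n' "$ls_line" | awk '{print $5}' | cut -d: -f3)
peer_type=$(printf '%s\n' "$avc_tcontext" | cut -d: -f3)

echo "label on the socket file:        $file_type"
echo "label checked for connectto:     $peer_type"

# Assumption to verify on the client host: the domain of the Nomad agent
# process, for example with `ps -eZ | grep nomad`.
```

If the agent's domain turns out to be unconfined_t, that would be consistent with the tcontext in the denial.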

Next, generate the policy package and install it:

root:/tmp# tail -1 /var/log/audit/audit.log | grep container_t | audit2allow -M nomad_task_api
******************** IMPORTANT ***********************
To make this policy package active, execute:

semodule -i nomad_task_api.pp

root:/tmp# semodule -i ./nomad_task_api.pp

At this point our Task API socket works:

~ $ curl -v --unix-socket /secrets/api.sock -H "X-Nomad-Token: $NOMAD_TOKEN" "http://local/v1/vars"
*   Trying /secrets/api.sock:0...
* Established connection to local (/secrets/api.sock port 0) from  port 0
* using HTTP/1.x
> GET /v1/vars HTTP/1.1
> Host: local
> User-Agent: curl/8.16.0
> Accept: */*
> X-Nomad-Token: eyJhbGc[REDACTED]
>
* Request completely sent off
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Origin
< X-Nomad-Index: 1
< X-Nomad-Knownleader: true
< X-Nomad-Lastcontact: 0
< Date: Mon, 29 Sep 2025 18:12:37 GMT
< Content-Length: 2
<
* Connection #0 to host local:80 left intact
[]~ $ ^C
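For anyone who wants to review exactly what gets installed, the module can also be written and compiled by hand instead of through audit2allow -M. This is a sketch under assumptions: the allow line is the rule that the denial above would normally translate to, not output copied from this issue, and the temporary directory is illustrative. Note that the rule lets every container_t process connect to any unix stream socket owned by an unconfined_t process, which is the hole mentioned at the top.

```shell
workdir=$(mktemp -d)

# Type enforcement source. The require block matches the .te file shown
# earlier; the allow rule is an assumption derived from the AVC denial.
cat > "$workdir/nomad_task_api.te" <<'EOF'
module nomad_task_api 1.0;

require {
        type container_t;
        type unconfined_t;
        class unix_stream_socket connectto;
}

# Assumed rule: let container processes connect to unconfined_t sockets.
allow container_t unconfined_t:unix_stream_socket connectto;
EOF

allow_rules=$(grep -c '^allow ' "$workdir/nomad_task_api.te")
echo "allow rules in module: $allow_rules"

# Compile and package only where the SELinux build tools are present.
if command -v checkmodule >/dev/null 2>&1 && command -v semodule_package >/dev/null 2>&1; then
  checkmodule -M -m -o "$workdir/nomad_task_api.mod" "$workdir/nomad_task_api.te" &&
    semodule_package -o "$workdir/nomad_task_api.pp" -m "$workdir/nomad_task_api.mod" ||
    echo "module build failed" >&2
  # As root: install with  semodule -i "$workdir/nomad_task_api.pp"
  #          remove with   semodule -r nomad_task_api
fi
```

Keeping the .te source alongside the deployment makes it easier to audit the rule later and to remove the module once a narrower fix is available.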
