Description
When running Nomad under SELinux in enforcing mode, the Task API socket is not available, even if the appropriate labels are set in the plugin configuration. It appears this is also the case for the CSI plugin socket.
Below I've got some configuration options for enabling it via SELinux policy, but that opens a little bit of a hole. I don't think it's something we can solve in Nomad itself without risking interfering with SELinux policies that cluster admins want (unless we just set the socket to be container_t?). So maybe this needs to be something we document in the install/deploy guide? I think we'd definitely welcome the opinions of folks in the community who have specific interest here.
Example jobspec:
job "curl" {
  group "group" {
    task "task" {
      driver = "docker"

      config {
        image   = "curlimages/curl:latest"
        command = "tail"
        args    = ["-f"]
      }

      # this should let you hit the Task API via alloc exec with:
      # curl --unix-socket /secrets/api.sock \
      #   -H "X-Nomad-Token: $NOMAD_TOKEN" \
      #   "http://local/v1/..."
      identity {
        env  = true
        file = true
      }

      resources {
        cpu    = 128
        memory = 256
      }
    }
  }
}
We have the volumes.selinuxlabel configuration for the Docker plugin:
selinuxlabel - Allows the operator to set a SELinux label to the allocation and task local bind-mounts to containers. If used with docker.volumes.enabled set to false, the labels will still be applied to the standard binds in the container.
Client configuration for the plugin:
plugin "docker" {
  config {
    volumes {
      enabled      = true
      selinuxlabel = "z"
    }
  }
}
If you alloc exec into this allocation and curl the Task API socket, you get a permission denied error in the container:
$ nomad alloc exec 0c4d805f /bin/sh
~ $ curl -v --unix-socket /secrets/api.sock -H "X-Nomad-Token: $NOMAD_TOKEN" "http://local/v1/vars"
* Trying /secrets/api.sock:0...
* Immediate connect fail for /secrets/api.sock: Permission denied
* Failed to connect to local port 80 after 0 ms: Could not connect to server
* closing connection #0
curl: (7) Failed to connect to local port 80 after 0 ms: Could not connect to server
And the following in your audit logs:
type=AVC msg=audit(1759168401.396:2106): avc: denied { connectto } for pid=74015 comm="curl" path="/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets/api.sock" scontext=system_u:system_r:container_t:s0:c39,c317 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0
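The fields that matter in a record like this are the source context, the target context, and the object class. As a rough sketch, they can be pulled out with standard text tools. The helper names `avc_field` and `context_type` are hypothetical and not part of any SELinux tooling, and the parsing assumes no spaces inside field values:

```sh
#!/bin/sh
# Hypothetical helper: print the value of one key=value field from an
# AVC audit record. Assumes field values contain no spaces.
avc_field() {
    # $1 = full audit line, $2 = field name (scontext, tcontext, tclass, ...)
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n "s/^$2=//p" | head -n 1
}

# Hypothetical helper: print the SELinux type, which is the third
# colon-separated field of a user:role:type:level context.
context_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# The denial quoted above, copied verbatim.
line='type=AVC msg=audit(1759168401.396:2106): avc: denied { connectto } for pid=74015 comm="curl" path="/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets/api.sock" scontext=system_u:system_r:container_t:s0:c39,c317 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0'

echo "source type: $(context_type "$(avc_field "$line" scontext)")"
echo "target type: $(context_type "$(avc_field "$line" tcontext)")"
echo "class:       $(avc_field "$line" tclass)"
```

For this record the source type is container_t, the target type is unconfined_t, and the class is unix_stream_socket.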
You can write to other locations in the secrets dir, and the mount's Mode is configured correctly in the Docker container:
$ docker inspect 5b1ddb430132 | jq '.[0].Mounts[2]'
{
  "Type": "bind",
  "Source": "/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets",
  "Destination": "/secrets",
  "Mode": "z",
  "RW": true,
  "Propagation": "rprivate"
}
You can work around this problem by setting task.config.security_opt = ["label=disable"], but that just bypasses SELinux protection entirely.
Here's what the resulting labels of objects in the secrets directory are:
# ls -lZ task/secrets/
total 8
srw-rw-rw-. 1 root root system_u:object_r:container_file_t:s0 0 Sep 29 13:43 api.sock
-rw-r--r--. 1 101 102 system_u:object_r:container_file_t:s0 8 Sep 29 13:45 foo.txt
-rw-r--r--. 1 root root system_u:object_r:container_file_t:s0 816 Sep 29 13:43 nomad_token
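The listing shows the socket file labeled container_file_t, while the denial recorded earlier names unconfined_t as the target. One possible reading, which this issue does not confirm: the connectto check on a unix_stream_socket is made against the label of the listening socket, which comes from the process that created it, rather than against the sock_file on disk. The sketch below only compares the two labels quoted in this issue. The helpers `ls_label` and `context_type` are hypothetical names.

```sh
#!/bin/sh
# Hypothetical helper: print the SELinux context column from one line of
# `ls -lZ` output, i.e. the first field shaped like user:role:type:level.
ls_label() {
    printf '%s\n' "$1" | tr -s ' ' '\n' | grep -E '^[^:]+:[^:]+:[^:]+:' | head -n 1
}

# Hypothetical helper: print the type (third colon-separated field).
context_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Both values are copied from the output quoted in this issue.
ls_line='srw-rw-rw-. 1 root root system_u:object_r:container_file_t:s0 0 Sep 29 13:43 api.sock'
tcontext='unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023'

file_type=$(context_type "$(ls_label "$ls_line")")
peer_type=$(context_type "$tcontext")

echo "sock_file on disk:      $file_type"
echo "connectto target (AVC): $peer_type"
if [ "$file_type" != "$peer_type" ]; then
    echo "labels differ"
fi

# On a live host, the agent's own label could be inspected with:
#   ps -eZ | grep nomad
```

If that reading is right, relabeling the files in the secrets directory would not change the outcome, because the target label would follow the Nomad agent process.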
By my understanding of SELinux config, the container's process, which runs as container_t, should be able to write to sockets that are labeled container_t:
$ sesearch --allow -s container_t | grep unix_stream_socket
...
allow container_t container_t:unix_stream_socket { accept append bind connect connectto create getattr getopt ioctl listen lock map read sendto setattr setopt shutdown write };
...
We can use audit2allow to generate a SELinux policy file that allows access to the Task API socket as follows. First, in a root shell, verify that we've got the correct item in the audit log:
root:/tmp# tail -1 /var/log/audit/audit.log | grep container_t
type=AVC msg=audit(1759169470.561:2169): avc: denied { connectto } for pid=78079 comm="curl" path="/var/nomad/dev/data/alloc/0c4d805f-3edc-47e4-1e4f-a7d5dc918a6b/task/secrets/api.sock" scontext=system_u:system_r:container_t:s0:c39,c317 tcontext=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 tclass=unix_stream_socket permissive=0
Generate a type enforcement policy file (which, by my understanding, is really only for human review). This shows that the socket was actually of type unconfined_t, which doesn't make a lot of sense to me given the output of ls -lZ above:
root:/tmp# tail -1 /var/log/audit/audit.log | grep container_t | audit2allow -m nomad_task_api > ./nomad_task_api.te
root:/tmp# cat ./nomad_task_api.te
module nomad_task_api 1.0;

require {
    type container_t;
    type unconfined_t;
    class unix_stream_socket connectto;
}
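The listing above ends at the require block. A module that actually permits the denied access would also carry an allow rule. The sketch below writes out such a file by hand. The allow line is an assumption derived from the AVC record (container_t connecting to an unconfined_t unix_stream_socket), not a copy of the author's generated file. It then compiles the file only if `checkmodule` and `semodule_package` are present.

```sh
#!/bin/sh
# Write a reviewable type enforcement file into a scratch directory.
workdir=$(mktemp -d)
te="$workdir/nomad_task_api.te"

cat > "$te" <<'EOF'
module nomad_task_api 1.0;

require {
    type container_t;
    type unconfined_t;
    class unix_stream_socket connectto;
}

# Assumed rule, derived from the denial: lets any container_t process
# connect to any unix stream socket owned by an unconfined_t process.
allow container_t unconfined_t:unix_stream_socket connectto;
EOF

# Compile into a policy package only when the SELinux tools are installed.
if command -v checkmodule >/dev/null 2>&1 &&
   command -v semodule_package >/dev/null 2>&1; then
    checkmodule -M -m -o "$workdir/nomad_task_api.mod" "$te" &&
        semodule_package -o "$workdir/nomad_task_api.pp" -m "$workdir/nomad_task_api.mod" &&
        echo "built $workdir/nomad_task_api.pp"
else
    echo "policy tools not found; wrote $te for review only"
fi
```

Written out this way, the breadth of the rule is easy to see: it is not scoped to the Task API socket, which is the "little bit of a hole" mentioned in the description.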
Next, generate the policy package and install it:
root:/tmp# tail -1 /var/log/audit/audit.log | grep container_t | audit2allow -M nomad_task_api
******************** IMPORTANT ***********************
To make this policy package active, execute:
semodule -i nomad_task_api.pp
root:/tmp# semodule -i ./nomad_task_api.pp
At this point our Task API socket works:
~ $ curl -v --unix-socket /secrets/api.sock -H "X-Nomad-Token: $NOMAD_TOKEN" "http://local/v1/vars"
* Trying /secrets/api.sock:0...
* Established connection to local (/secrets/api.sock port 0) from port 0
* using HTTP/1.x
> GET /v1/vars HTTP/1.1
> Host: local
> User-Agent: curl/8.16.0
> Accept: */*
> X-Nomad-Token: eyJhbGc[REDACTED]
>
* Request completely sent off
< HTTP/1.1 200 OK
< Content-Type: application/json
< Vary: Origin
< X-Nomad-Index: 1
< X-Nomad-Knownleader: true
< X-Nomad-Lastcontact: 0
< Date: Mon, 29 Sep 2025 18:12:37 GMT
< Content-Length: 2
<
* Connection #0 to host local:80 left intact
[]~ $ ^C