Description
Contrast version
v1.13.0
Deployment platform
Metal-QEMU-TDX
Issue description + logs
I have raised a similar issue before, but this time the underlying cause seems to be different.
First, I have installed the correct k3s version:
$ k3s --version
k3s version v1.31.5+k3s1 (56ec5dd4)
go version go1.22.10
I am following these instructions to set up the emoji-voting deployment.
$ kubectl describe pods coordinator-0
Name: coordinator-0
Namespace: default
Priority: 0
Runtime Class Name: contrast-cc-metal-qemu-tdx-7362433c
Service Account: coordinator
Node: cocodell/172.16.5.23
Start Time: Fri, 17 Oct 2025 09:13:42 +0000
Labels: app.kubernetes.io/name=coordinator
apps.kubernetes.io/pod-index=0
controller-revision-hash=coordinator-58bff54cbd
statefulset.kubernetes.io/pod-name=coordinator-0
Annotations: contrast.edgeless.systems/pod-role: coordinator
io.katacontainers.config.agent.policy:
H4sIAAAAAAAC/+19+1cjN7Lw7/4rfJ2cs3gX/LZ53M83S4BJOJkBDjCb3TtDfNvuNvRi3J1umxky4X//qvSWWupuG2Ym2UPO7uDWo1QqlUpSVan0TfUgih+S8PpmUd2Y1KudVqdbfR...
Status: Pending
IP:
IPs: <none>
Controlled By: StatefulSet/coordinator
Init Containers:
pvc-holder:
Container ID:
Image: ghcr.io/edgelesssys/contrast/initializer:v1.13.0@sha256:35b0f6f5404a67568ea26987a7eac7f405d5343c86f6a3e671a5193295dafc05
Image ID:
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
sleep inf
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Limits:
memory: 50Mi
Requests:
memory: 50Mi
Liveness: http-get http://:9102/probe/liveness delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:9102/probe/readiness delay=0s timeout=1s period=5s #success=1 #failure=3
Startup: http-get http://:9102/probe/startup delay=1s timeout=1s period=1s #success=1 #failure=60
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-9dsbh (ro)
Conditions:
Type Status
PodReadyToStartContainers False
Initialized False
Ready False
ContainersReady False
PodScheduled True
Volumes:
image-store:
Type: EphemeralVolume (an inline specification for a volume that gets created and deleted with the pod)
StorageClass:
Volume:
Labels: <none>
Annotations: <none>
Capacity:
Access Modes:
VolumeMode: Block
kube-api-access-9dsbh:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 25m default-scheduler 0/1 nodes are available: waiting for ephemeral volume controller to create the persistentvolumeclaim "coordinator-0-image-store". preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
Warning FailedScheduling 25m default-scheduler 0/1 nodes are available: pod has unbound immediate PersistentVolumeClaims. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
Normal Scheduled 25m default-scheduler Successfully assigned default/coordinator-0 to cocodell
Normal SuccessfulAttachVolume 24m attachdetach-controller AttachVolume.Attach succeeded for volume "pvc-5a5956b7-c21a-467b-80fd-aa85767f51ad"
Normal SuccessfulMountVolume 24m kubelet MapVolume.MapPodDevice succeeded for volume "pvc-5a5956b7-c21a-467b-80fd-aa85767f51ad" globalMapPath "/var/lib/kubelet/plugins/kubernetes.io/csi/volumeDevices/pvc-5a5956b7-c21a-467b-80fd-aa85767f51ad/dev"
Normal SuccessfulMountVolume 24m kubelet MapVolume.MapPodDevice succeeded for volume "pvc-5a5956b7-c21a-467b-80fd-aa85767f51ad" volumeMapPath "/var/lib/kubelet/pods/77a8eda8-ea6b-498c-918c-9b91347c1666/volumeDevices/kubernetes.io~csi"
Warning FailedCreatePodSandBox 4m50s (x92 over 24m) kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox runtime: no runtime for "contrast-cc-metal-qemu-tdx-7362433c" is configured
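The `no runtime for "contrast-cc-metal-qemu-tdx-7362433c" is configured` error above means containerd has no runtime entry matching the pod's RuntimeClass handler. A quick way to check is to grep the node's containerd config for the handler name; below is a small sketch of that check as a shell function (`check_runtime` is a hypothetical helper, not part of Contrast or k3s — the k3s config path is the one shown later in this report):

```shell
# check_runtime: report whether a containerd config file declares a given
# runtime handler under a [...runtimes...] section. Hypothetical helper.
check_runtime() {
  cfg="$1"; handler="$2"
  # containerd v2 configs declare handlers as
  # [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.<handler>]
  if grep 'runtimes' "$cfg" | grep -q "$handler"; then
    echo "configured"
  else
    echo "missing"
  fi
}

# Example usage on the node (path taken from the kubectl debug output below):
#   check_runtime /var/lib/rancher/k3s/agent/etc/containerd/config.toml \
#     contrast-cc-metal-qemu-tdx-7362433c
```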
The nodeinstaller is healthy:
$ kubectl describe pods -n kube-system -l app.kubernetes.io/name=contrast-cc-metal-qemu-tdx-7362433c-nodeinstaller
Name: contrast-cc-metal-qemu-tdx-7362433c-nodeinstaller-drx8w
Namespace: kube-system
Priority: 0
Service Account: contrast-cc-metal-qemu-tdx-7362433c-nodeinstaller
Node: cocodell/172.16.5.23
Start Time: Fri, 17 Oct 2025 07:53:33 +0000
Labels: app.kubernetes.io/name=contrast-cc-metal-qemu-tdx-7362433c-nodeinstaller
controller-revision-hash=55975bb54
pod-template-generation=1
Annotations: contrast.edgeless.systems/platform: Metal-QEMU-TDX
contrast.edgeless.systems/pod-role: contrast-node-installer
Status: Running
IP: 10.42.0.30
IPs:
IP: 10.42.0.30
Controlled By: DaemonSet/contrast-cc-metal-qemu-tdx-7362433c-nodeinstaller
Init Containers:
installer:
Container ID: containerd://65f916db6ec26ed246449113e6480b5188b691050530c0e12af10708a2229f73
Image: ghcr.io/edgelesssys/contrast/node-installer-kata:v1.13.0@sha256:9cf97f63bd949c0146a5896366dd1becc9fc8e408f5308483b204379c7989369
Image ID: ghcr.io/edgelesssys/contrast/node-installer-kata@sha256:9cf97f63bd949c0146a5896366dd1becc9fc8e408f5308483b204379c7989369
Port: <none>
Host Port: <none>
Command:
/bin/node-installer
Metal-QEMU-TDX
State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 17 Oct 2025 07:53:49 +0000
Finished: Fri, 17 Oct 2025 07:53:56 +0000
Ready: True
Restart Count: 0
Requests:
memory: 700Mi
Environment: <none>
Mounts:
/host from host-mount (rw)
/target-config from target-config (rw)
/var/run/dbus/system_bus_socket from var-run-dbus-socket (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-z7zdq (ro)
Containers:
pause:
Container ID: containerd://516f0b3af7b786577dc861e925da350093a383fcf78d6b180a8f158e3d7e435e
Image: registry.k8s.io/pause:3.6@sha256:3d380ca8864549e74af4b29c10f9cb0956236dfb01c40ca076fb6c37253234db
Image ID: docker.io/rancher/mirrored-pause@sha256:74c4244427b7312c5b901fe0f67cbc53683d06f4f24c6faee65d4182bf0fa893
Port: <none>
Host Port: <none>
State: Running
Started: Fri, 17 Oct 2025 07:54:02 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-z7zdq (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
host-mount:
Type: HostPath (bare host directory volume)
Path: /
HostPathType: Directory
var-run-dbus-socket:
Type: HostPath (bare host directory volume)
Path: /var/run/dbus/system_bus_socket
HostPathType: Socket
target-config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: contrast-node-installer-target-config
Optional: true
kube-api-access-z7zdq:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events: <none>
TDX is installed correctly:
$ sudo dmesg | grep -i TDX
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.15.0-rc1+ root=UUID=0b8b36f6-5d40-40ed-9205-2b3ed34a3fdd ro ro nohibernate kvm_intel.tdx=on nosoftlockup nmi_watchdog=0 tsc=reliable irqaffinity=0-13 nohz_full=14-27 rcu_nocbs=14-27 intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=0
[ 0.402851] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.15.0-rc1+ root=UUID=0b8b36f6-5d40-40ed-9205-2b3ed34a3fdd ro ro nohibernate kvm_intel.tdx=on nosoftlockup nmi_watchdog=0 tsc=reliable irqaffinity=0-13 nohz_full=14-27 rcu_nocbs=14-27 intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=0
[ 0.969749] virt/tdx: BIOS enabled: private KeyID range [32, 64)
[ 0.969750] virt/tdx: Disable ACPI S3. Turn off TDX in the BIOS to use ACPI S3.
[ 2.079886] virt/tdx: TDX module 1.5.06.00, build number 744, build date 0134d817
[ 2.758032] virt/tdx: 1042424 KB allocated for PAMT
[ 2.758035] virt/tdx: module initialized
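The `module initialized` line above can be turned into a quick host-side sanity check; a minimal sketch (the pattern string is taken from the log lines above, and `tdx_host_ok` is a hypothetical helper that reads a saved dmesg dump):

```shell
# tdx_host_ok: report whether a saved dmesg log shows the TDX module
# initialized. Hypothetical helper; pattern mirrors the log line above.
tdx_host_ok() {
  if grep -q 'virt/tdx: module initialized' "$1"; then
    echo "ok"
  else
    echo "missing"
  fi
}

# Example usage:
#   sudo dmesg > /tmp/dmesg.log && tdx_host_ok /tmp/dmesg.log
```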
But, as you can already see from the dmesg log above, my system does not exactly match the Ubuntu TDX repo setup.
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 24.04.3 LTS
Release: 24.04
Codename: noble
$ uname -a
Linux cocodell 6.15.0-rc1+ #1 SMP PREEMPT_RT Fri Jul 11 13:06:48 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
I have also installed the corresponding QEMU and EDK2 versions, and I am able to launch TDs:
ubuntu@td1:~$ sudo dmesg | grep -i tdx
[ 0.000000] tdx: Guest detected
[ 0.000000] tdx: Attributes: SEPT_VE_DISABLE
[ 0.000000] tdx: TD_CTLS: PENDING_VE_DISABLE ENUM_TOPOLOGY
[ 6.990082] process: using TDX aware idle routine
[ 6.990082] Memory Encryption Features active: Intel TDX
[ 19.044222] systemd[1]: Detected confidential virtualization tdx.
I am not sure whether this is the root cause of my issue. However, I am also seeing that Contrast does not register its runtime class with containerd:
$ kubectl debug node/cocodell -it --image busybox -- cat /host/var/lib/rancher/k3s/agent/etc/containerd/config.toml
Creating debugging pod node-debugger-cocodell-bwgsm with container debugger on node cocodell.
# File generated by k3s. DO NOT EDIT. Use config.toml.tmpl instead.
version = 2
[plugins."io.containerd.internal.v1.opt"]
path = "/var/lib/rancher/k3s/agent/containerd"
[plugins."io.containerd.grpc.v1.cri"]
stream_server_address = "127.0.0.1"
stream_server_port = "10010"
enable_selinux = false
enable_unprivileged_ports = true
enable_unprivileged_icmp = true
device_ownership_from_security_context = false
sandbox_image = "rancher/mirrored-pause:3.6"
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
disable_snapshot_annotations = true
[plugins."io.containerd.grpc.v1.cri".cni]
bin_dir = "/var/lib/rancher/k3s/data/cni"
conf_dir = "/var/lib/rancher/k3s/agent/etc/cni/net.d"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
[plugins."io.containerd.grpc.v1.cri".registry]
config_path = "/var/lib/rancher/k3s/agent/etc/containerd/certs.d"
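Note the header comment in this file: k3s regenerates `config.toml` from `config.toml.tmpl` on startup, so a runtime entry written only to `config.toml` would be lost. For comparison, a registered Contrast handler would presumably show up as an additional runtimes section roughly like the sketch below (the section shape follows the existing `runc` entry above; the `runtime_type` value is an assumption, not taken from this report):

```toml
# Hypothetical shape of the missing entry; the actual runtime_type and any
# path options would be written by the Contrast node-installer.
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes."contrast-cc-metal-qemu-tdx-7362433c"]
  runtime_type = "io.containerd.contrast-cc.v2"
```

Since no such section is present, it may be worth checking whether the node-installer wrote to a config path that k3s does not read, and whether k3s was restarted after the installer ran.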
Could you please help me debug this issue further?
Steps to reproduce the behavior
No response