Description
Is there an existing issue for this?
- I have searched the existing issues
Version
v1.0.1
What happened?
I peered two clusters:
Consumer Cluster: k3s cluster bootstrapped on Hetzner via https://github.com/kube-hetzner/terraform-hcloud-kube-hetzner
Provider Cluster: private k3s cluster running on a Tailscale network, no public IPs
All nodes (from both the consumer and the provider cluster) are also members of the same Tailscale network.
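For context, tailnet membership can be double-checked on any node of either cluster with something like the following (assuming the tailscale CLI is available on the nodes; the ping target is just a placeholder):

# Run on a consumer node and on a provider node: both should list the other cluster's nodes in the same tailnet
tailscale status
# Direct reachability check over Tailscale (substitute the Tailscale IP of a node in the other cluster)
tailscale ping <tailscale-ip-of-peer-node>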
The clusters were peered with:
liqoctl peer \
--remote-kubeconfig "$KUBECONFIG_CONSUMER" \
--gw-server-service-location Consumer \
--gw-server-service-type NodePort \
--gw-server-service-port 51840 \
--gw-server-service-nodeport 32050 \
--gw-client-address 100.118.118.27 \
--gw-client-port 32050
(100.118.118.27 is the Tailscale IP of the control-plane node in the consumer cluster.)
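For what it's worth, basic reachability of the gateway endpoint can be probed from a provider node roughly like this (netcat flags differ between flavours, and WireGuard does not answer empty UDP probes, so this only catches outright rejects; treat it as a sketch):

# From a node in the provider cluster: is the consumer control-plane node reachable over Tailscale?
tailscale ping 100.118.118.27
# Rough UDP probe of the gateway NodePort (only detects ICMP port-unreachable / firewall rejects)
nc -vzu 100.118.118.27 32050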
I can schedule workloads successfully in the consumer cluster and they'll run on the provider cluster:

kubectl --context consumer get pods
NAME READY STATUS RESTARTS AGE
nvidia-smi-test-arm64 0/1 Completed 0 9m41s
kubectl --context provider -n default-sparkling-dust get pods
NAME READY STATUS RESTARTS AGE
nvidia-smi-test-arm64 0/1 Completed 0 9m57s

I can also retrieve the logs from the provider cluster:
kubectl --context provider -n default-sparkling-dust logs -f nvidia-smi-test-arm64
Mon Aug 11 10:51:58 2025
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 540.4.0 Driver Version: 540.4.0 CUDA Version: 12.6 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Orin (nvgpu) N/A | N/A N/A | N/A |
| N/A N/A N/A N/A / N/A | Not Supported | N/A N/A |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+

But when trying to retrieve the logs via the consumer cluster:
kubectl --context consumer logs -f nvidia-smi-test-arm64
Error from server: Get "https://10.42.1.14:10250/containerLogs/default/nvidia-smi-test-arm64/nvidia-smi-container?follow=true": proxy error from 127.0.0.1:6443 while dialing 10.42.1.14:10250, code 502: 502 Bad Gateway

I searched on Slack and there are at least two other people who have the same issue.
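In case it helps with triage, I can run and share the output of checks like the following (I'm grepping for component names rather than assuming exact pod or namespace names, since I'm not sure of them):

# Does 10.42.1.14 correspond to a Liqo virtual-kubelet pod in the consumer cluster?
kubectl --context consumer get pods -A -o wide | grep -i virtual-kubelet
# State of the peering and of the Liqo components
kubectl --context consumer get foreignclusters
kubectl --context consumer get pods -n liqo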
Any help / guidance would be very much appreciated!
How can we reproduce the issue?
Not sure.
Provider or distribution
k3s
CNI version
flannel
Kernel Version
No response
Kubernetes Version
1.31.11+k3s1
Code of Conduct
- I agree to follow this project's Code of Conduct