Shim: Add support for cgroups v2 stats #11472

Open
Champ-Goblem opened this issue Feb 13, 2025 · 2 comments · May be fixed by #11473
Labels
type: bug Something isn't working

Comments

@Champ-Goblem
Contributor

Description

Currently, the shim supports the CRI stats interface but only returns cgroups v1 metrics. This breaks metrics collection for containerd-based workloads on cgroup v2 systems, where containerd expects the shim to return v2 metrics: https://github.com/containerd/containerd/blob/main/core/metrics/cgroups/cgroups.go#L85-L88.

When the two versions do not match, containerd logs an error because it cannot gather metrics for the container:

level=error msg="invalid metric type for 86fae802d3432b33d9eb424efa5a7d1c57dd8fa1eb95eb7d49ebd262a860ea85" error="<nil>"

or

unmarshal stats for 468954f334603fe0e97efbf03f0eacd0143c1fe9d146c93eae4163a4f51f5041" error="can't unmarshal type \"io.containerd.cgroups.v1.Metrics\" to output \"io.containerd.cgroups.v2.Metrics\"

depending on the version of containerd used.
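
To illustrate the mismatch, here is a minimal Go sketch of how a stats producer could pick the metrics type that matches the host's cgroup mode. It is not the shim's or containerd's actual code: the helper names `isUnified` and `metricsTypeFor` are hypothetical, and checking for a `cgroup.controllers` file at the cgroup root is only a simple heuristic for detecting a cgroup v2 (unified) hierarchy. The two type names are taken from the error message above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Metrics type names as they appear in the containerd error message.
const (
	metricsV1 = "io.containerd.cgroups.v1.Metrics"
	metricsV2 = "io.containerd.cgroups.v2.Metrics"
)

// isUnified (hypothetical helper) reports whether root looks like a
// cgroup v2 mount. Assumption: a v2 hierarchy exposes a
// cgroup.controllers file at its root, while a v1 hierarchy does not.
func isUnified(root string) bool {
	_, err := os.Stat(filepath.Join(root, "cgroup.controllers"))
	return err == nil
}

// metricsTypeFor (hypothetical helper) returns the metrics type name
// that a stats response should carry for the given cgroup mode.
func metricsTypeFor(unified bool) string {
	if unified {
		return metricsV2
	}
	return metricsV1
}

func main() {
	unified := isUnified("/sys/fs/cgroup")
	fmt.Printf("cgroup v2: %v, expected metrics type: %s\n",
		unified, metricsTypeFor(unified))
}
```

Running this on the host shows which metrics type a consumer would expect there. Returning the v1 type on a host where the check reports cgroup v2 corresponds to the mismatch described in this issue.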

Steps to reproduce

No response

runsc version

docker version (if using docker)

uname

No response

kubectl (if using Kubernetes)

repo state (if built from source)

No response

runsc debug logs (if available)

@Champ-Goblem Champ-Goblem added the type: bug Something isn't working label Feb 13, 2025
Champ-Goblem added a commit to Champ-Goblem/gvisor that referenced this issue Feb 13, 2025
…inerd metrics in the shim, v2 metrics are only used when runsc is run with --system-cgroup=true.
Containerd requires v2 metrics when the host is run with CGroupsV2.
This issue was noticed when attempting to gather metrics on AL2023 which defaults to CGroupsV2.
Fixes: google#11472

Signed-off-by: Champ-Goblem <[email protected]>
Champ-Goblem added a commit to Champ-Goblem/gvisor that referenced this issue Feb 13, 2025
Add support for v2 containerd metrics in the shim, v2 metrics are only used when runsc is run with --system-cgroup=true.
Containerd requires v2 metrics when the host is run with CGroupsV2.
This issue was noticed when attempting to gather metrics on AL2023 which defaults to CGroupsV2.

Fixes: google#11472
Signed-off-by: Champ-Goblem <[email protected]>
@Champ-Goblem Champ-Goblem linked a pull request Feb 13, 2025 that will close this issue
@Champ-Goblem
Contributor Author

Champ-Goblem commented Feb 13, 2025

Testing this change on AWS AL2022 vs AL2023:

Image

Commands used for testing:

CPU: stress-ng -c 1
Memory: stress-ng --vm 1 --vm-bytes 100M

The difference between runC and gVisor might be a separate issue.

@EtiennePerot
Contributor

cc @milantracy

2 participants