Update cgroups enhancement proposal to support the removal of cgroupv1
Signed-off-by: Sai Ramesh Vanka <[email protected]>
sairameshv committed Feb 5, 2025
1 parent 2890ccc commit f1c6619
Showing 1 changed file with 31 additions and 21 deletions.
52 changes: 31 additions & 21 deletions enhancements/machine-config/mco-cgroupsv2-support.md
@@ -2,6 +2,7 @@
title: Control Group v2 Enablement
authors:
- "@rphillips"
- "@sairameshv"
reviewers:
- "@mrunalp"
- "@kikisdeliveryservice"
@@ -10,11 +11,12 @@ reviewers:
- "@cgwalters"
approvers:
- "@mrunalp"
- "@sinnykumari"
- "@yuqi-zhang"
api-approvers:
- "@deads2k"
- "@sttts"
creation-date: 2021-10-19
last-updated: 2021-10-20
last-updated: 2025-02-05
status: implementable
---

@@ -40,12 +42,10 @@ status: implementable

## Summary

Control Group v2 (cgroup v2) enablement in Kubernetes has progressed to beta
Control Group v2 (cgroup v2) enablement in Kubernetes has progressed to stable
[upstream](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2254-cgroup-v2).
The underlying runtime (cri-o) and supporting subsystems are now ready for
customers to begin their own testing with it. Not all workloads will be
compatible with cgroup v2, so it will *not* be enabled by default within
OpenShift at this time.
cgroup v2 is enabled by default in freshly installed OpenShift clusters from version 4.14 onward.
cgroup v1 has been deprecated since OCP 4.16, and support for it is intended to be removed in OCP 4.19 and later because the corresponding RHEL version does not support it.

Note: This enhancement is focusing on `pure` mode cgroup v2. Mixed mode environments
may behave differently (metrics, vpa, hpa, etc) since cgroup v1 is not
@@ -70,6 +70,8 @@ Some features of cgroup v2 include:

- [ ] Enable cgroup v2 within the OpenShift API
- [ ] Add kernel flags to MCO to enable cgroup v2 on nodes
- [ ] Remove support for configuring cgroup v1 in OCP clusters
- [ ] Block upgrades of clusters using cgroup v1 until they are migrated to cgroup v2

### Non-Goals

@@ -81,12 +83,9 @@ to gather data from.

## Proposal

The option to enable cgroup v2 will have to reside in a centralized location.
The [OpenShift Infrastructure config
object](https://github.com/openshift/api/blob/master/config/v1/types_infrastructure.go#L28)
contains information describing how a cluster functions including cloud config
and platform specification for each cloud. Setting the cgroup mode is an
infrastructure setting.
The option to enable cgroup v2 resides in a centralized location: the [OpenShift Node config
object](https://github.com/openshift/api/blob/master/config/v1/types_node.go).
The MCO sets the `Upgradeable` condition of its ClusterOperator to `False` if a cluster is on cgroup v1.
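
The version threshold behind this gate can be illustrated with a minimal, self-contained sketch. It is not taken from the MCO: the function names `parseMajorMinor` and `upgradeBlocked` are hypothetical, the string `"v1"` is assumed to mirror the value of `configv1.CgroupModeV1`, and the 4.19 cut-off comes from the Summary of this proposal.

```go
package main

import "fmt"

// Assumption: "v1" mirrors the string value of configv1.CgroupModeV1.
const cgroupModeV1 = "v1"

// parseMajorMinor extracts the major and minor numbers from a version
// such as "4.19" or "4.19.3". Hypothetical helper for this sketch only.
func parseMajorMinor(version string) (major, minor int, err error) {
	if _, err = fmt.Sscanf(version, "%d.%d", &major, &minor); err != nil {
		return 0, 0, fmt.Errorf("invalid version %q: %w", version, err)
	}
	return major, minor, nil
}

// upgradeBlocked reports whether an upgrade to targetVersion should be
// blocked because the cluster is still on cgroup v1 and the target
// release (4.19 or later) no longer supports it.
func upgradeBlocked(cgroupMode, targetVersion string) (bool, error) {
	major, minor, err := parseMajorMinor(targetVersion)
	if err != nil {
		return false, err
	}
	targetDropsV1 := major > 4 || (major == 4 && minor >= 19)
	return cgroupMode == cgroupModeV1 && targetDropsV1, nil
}

func main() {
	for _, target := range []string{"4.18.5", "4.19.0"} {
		blocked, err := upgradeBlocked(cgroupModeV1, target)
		fmt.Println(target, blocked, err)
	}
}
```

In the actual operator the gate is expressed through the `Upgradeable` condition rather than an explicit version comparison; the sketch only makes the intended rule concrete.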

### API Extensions

@@ -121,12 +120,11 @@ type NodeSpec struct {

### Operational Aspects of API Extensions

Once the previous API is defined, the MCO will read the configured object and
set the appropriate kernel options (on bootstrap). The MCO will report an error
The MCO reads the configured object and
sets the appropriate kernel options (on bootstrap). The MCO will report an error
if a user tries to modify/add cgroup related kargs within a MachineConfig.
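
A check of this kind could look roughly like the following sketch. It is illustrative only and does not reproduce the MCO's validation: `cgroupKargNames`, `findCgroupKargs`, and `validateUserKargs` are hypothetical names, and the list of argument names treated as cgroup-related is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// cgroupKargNames lists kernel-argument names treated as cgroup-related.
// Assumption: this list is illustrative, not the MCO's actual list.
var cgroupKargNames = []string{
	"systemd.unified_cgroup_hierarchy",
	"cgroup_no_v1",
}

// findCgroupKargs returns the user-supplied kernel arguments whose name
// (the part before "=") matches one of the cgroup-related names.
func findCgroupKargs(kargs []string) []string {
	var found []string
	for _, karg := range kargs {
		name, _, _ := strings.Cut(karg, "=")
		for _, reserved := range cgroupKargNames {
			if name == reserved {
				found = append(found, karg)
				break
			}
		}
	}
	return found
}

// validateUserKargs returns an error if a MachineConfig's kernel
// arguments contain any cgroup-related entry.
func validateUserKargs(kargs []string) error {
	if bad := findCgroupKargs(kargs); len(bad) > 0 {
		return fmt.Errorf("cgroup-related kernel arguments must not be set in a MachineConfig: %s",
			strings.Join(bad, ", "))
	}
	return nil
}

func main() {
	fmt.Println(validateUserKargs([]string{"nosmt", "systemd.unified_cgroup_hierarchy=0"}))
	fmt.Println(validateUserKargs([]string{"nosmt"}))
}
```

Matching on the argument name rather than the full `name=value` string means a user cannot bypass the check by choosing a different value.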


The following kernel command line arguments would be set when `CgroupMode_v2` is enabled:
The MCO also reports an error if a user tries to set `CgroupMode` to `CgroupMode_V1`.
The following kernel command line arguments are present on the machine config pools by default, and also when `CgroupMode_v2` is explicitly enabled:
```yaml
kernelArguments:
- systemd.unified_cgroup_hierarchy=1
@@ -201,19 +199,31 @@ The following jobs will be run against cgroup v2 periodically and with a minimum
### Upgrade / Downgrade Strategy
Downgrading a cluster to an OpenShift version not containing cgroup v2 support
- Downgrading a cluster to an OpenShift version not containing cgroup v2 support
is unsupported.
- Upgrading a cluster that is on cgroup v1 to version 4.19 or later is blocked until the cluster is migrated to cgroup v2.
### Version Skew Strategy
A cluster installed with cgroup v2 will abide by the usual skew upgrade path.
#### Removing a deprecated feature
N/A
cgroup v1 support will be removed from the associated future versions of RHEL, so the `CgroupMode_V1` setting has to be removed from OCP clusters (>= 4.19).

## Implementation History
The following code change in the MCO operator's [`pkg/operator/status.go`](https://github.com/openshift/machine-config-operator/blob/master/pkg/operator/status.go#L265) sets the ClusterOperator's `Upgradeable` condition to `False` if the cluster is found to be using `CgroupMode_V1`:
```go
// Fetch the cluster-scoped Node config object.
configNode, err := optr.configClient.ConfigV1().Nodes().Get(context.Background(), ctrlcommon.ClusterNodeInstanceName, metav1.GetOptions{})
if err != nil {
	return err
}
// Mark the operator as not upgradeable while the cluster is still on cgroup v1.
if configNode.Spec.CgroupMode == configv1.CgroupModeV1 {
	coStatusCondition.Status = configv1.ConditionFalse
	coStatusCondition.Reason = "ClusterOnCgroupV1"
	coStatusCondition.Message = "Cluster is using cgroup v1, consider migrating to cgroup v2 by updating `CgroupMode` in the `nodes.config.openshift.io` object before upgrading"
}
```
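
The fragment above depends on MCO internals (`optr`, `coStatusCondition`) and cannot be run on its own. The same decision can be sketched in a self-contained form for experimentation. Everything below is a stand-in: `CgroupMode`, `Condition`, and `upgradeableCondition` are local, hypothetical types and functions, the `"v1"`/`"v2"` strings are assumed to mirror the `openshift/api` constants, and the `"AsExpected"` default reason is an assumption.

```go
package main

import "fmt"

// CgroupMode is a local stand-in for configv1.CgroupMode.
type CgroupMode string

const (
	// Assumption: these values mirror the openshift/api constants.
	CgroupModeV1 CgroupMode = "v1"
	CgroupModeV2 CgroupMode = "v2"
)

// Condition is a simplified stand-in for a ClusterOperator status condition.
type Condition struct {
	Status  string
	Reason  string
	Message string
}

// upgradeableCondition builds the Upgradeable condition for a given cgroup
// mode: "False" with reason "ClusterOnCgroupV1" on cgroup v1, "True" otherwise.
func upgradeableCondition(mode CgroupMode) Condition {
	cond := Condition{Status: "True", Reason: "AsExpected"}
	if mode == CgroupModeV1 {
		cond.Status = "False"
		cond.Reason = "ClusterOnCgroupV1"
		cond.Message = "Cluster is using cgroup v1, consider migrating to cgroup v2 before upgrading"
	}
	return cond
}

func main() {
	for _, mode := range []CgroupMode{CgroupModeV1, CgroupModeV2} {
		cond := upgradeableCondition(mode)
		fmt.Printf("mode=%s status=%s reason=%s\n", mode, cond.Status, cond.Reason)
	}
}
```

Keeping the decision in a pure function, separate from the API call that fetches the Node config, makes the gating rule easy to unit test without a cluster.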
## Alternatives

## Drawbacks
