OCPNODE-2877: Remove support to configure cgroupsv1 in OCP #2181

Open: sairameshv wants to merge 1 commit into master

sairameshv
Member

@sairameshv sairameshv commented Jan 30, 2025

  • Removing support to configure cgroupsv1 in the OCP clusters.
  • Removed the enum validation of "v1" for the cgroupMode field of the nodes.config.openshift.io object.
  • Also added integration tests to validate the enum removal on the cgroupMode field

Enhancement Proposal Ref: https://github.com/openshift/enhancements/blob/master/enhancements/machine-config/mco-cgroupsv2-support.md

Summary:

  • This PR blocks users from setting cgroupMode to v1.
  • A change will be added to MCO in 4.18 that sets the machine-config cluster operator's Upgradeable=False when cgroupMode is found to be v1 and asks users to update to v2.
  • Every cluster upgrading to 4.19 must first update to the minimum 4.18.z version that contains the above change. This can be enforced through the cincinnati-graph-data repo.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 30, 2025
@openshift-ci-robot

openshift-ci-robot commented Jan 30, 2025

@sairameshv: This pull request references OCPNODE-2877 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.19.0" version, but no target version was set.

In response to this:

According to this, RHEL is going to remove the cgroupsv1 support from RHEL 10 and hence there is a need to remove it from the OCP as well.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Contributor

openshift-ci bot commented Jan 30, 2025

Hello @sairameshv! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci openshift-ci bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Jan 30, 2025
@sairameshv
Member Author

/jira refresh

@openshift-ci-robot

openshift-ci-robot commented Jan 30, 2025

@sairameshv: This pull request references OCPNODE-2877 which is a valid jira issue.

In response to this:

/jira refresh

@sairameshv
Member Author

/hold
Until the updated enhancement proposal for cgroup v1 removal is merged.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 6, 2025
@openshift-ci openshift-ci bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 19, 2025
@openshift-ci-robot

openshift-ci-robot commented Feb 20, 2025

@sairameshv: This pull request references OCPNODE-2877 which is a valid jira issue.

In response to this:

According to this, RHEL is going to remove the cgroupsv1 support from RHEL 10 and hence there is a need to remove it from the OCP as well.

Added a CEL validation to deny the setting of "v1" to the cgroupMode field of nodes.config.openshift.io object

@sairameshv
Member Author

/test verify

@openshift-ci-robot

openshift-ci-robot commented Feb 20, 2025

@sairameshv: This pull request references OCPNODE-2877 which is a valid jira issue.

In response to this:

Removing support to configure cgroupsv1 in the OCP clusters.
Added a CEL validation on the cgroupMode field of the nodes.config.openshift.io object to deny the setting of "v1"

Enhancement Proposal Ref: openshift/enhancements#1751

@sairameshv
Member Author

/retest

@@ -77,6 +77,7 @@ type NodeStatus struct {
}

// +kubebuilder:validation:Enum=v1;v2;""
// +kubebuilder:validation:XValidation:rule="self != \"v1\"",message="cgroups v1 is not supported on openshift anymore"
Contributor

Shouldn't this be == "v1"?
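
For context on the question above: a CEL validation rule states the condition a valid value must satisfy, and the value is rejected when the rule evaluates to false. Under that reading, `self != "v1"` is the form that denies `"v1"`. The following is a minimal Go sketch that only imitates this semantic in plain code. It does not invoke a CEL engine, and the helper names `cgroupModeRule` and `validateCgroupMode` are hypothetical, not part of openshift/api.

```go
package main

import (
	"errors"
	"fmt"
)

// cgroupModeRule mirrors the CEL expression `self != "v1"` from the diff.
// A validation rule describes what a VALID value looks like: the value is
// accepted when the rule evaluates to true.
func cgroupModeRule(self string) bool {
	return self != "v1"
}

// validateCgroupMode is a hypothetical helper imitating how the rule is
// applied: the value is rejected, with the rule's message, when the rule
// evaluates to false.
func validateCgroupMode(mode string) error {
	if !cgroupModeRule(mode) {
		return errors.New("cgroups v1 is not supported on openshift anymore")
	}
	return nil
}

func main() {
	for _, mode := range []string{"", "v1", "v2"} {
		fmt.Printf("cgroupMode=%q -> err=%v\n", mode, validateCgroupMode(mode))
	}
}
```

Writing the rule as `self == "v1"` would invert the behavior in this model: only `"v1"` would be accepted.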

Contributor

@JoelSpeed JoelSpeed left a comment

Please take a look at the tests/readme and add a test to prove this validation ratchets as expected and that any existing invalid value does not prevent future writes
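
The ratcheting behavior requested here can be pictured roughly as follows: a value that fails the new rule is still tolerated on update as long as it is unchanged from what is already stored, so existing objects remain writable. This is a simplified Go model under that assumption, not the API server's implementation; `validateCreate` and `validateUpdate` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

var errV1 = errors.New("cgroups v1 is not supported on openshift anymore")

// validateCreate applies the rule unconditionally, as on object creation.
func validateCreate(mode string) error {
	if mode == "v1" {
		return errV1
	}
	return nil
}

// validateUpdate models ratcheting: if the field is unchanged from the
// stored object, the (possibly invalid) value is tolerated so that other
// fields can still be written. A changed value must satisfy the rule.
func validateUpdate(oldMode, newMode string) error {
	if newMode == oldMode {
		return nil
	}
	return validateCreate(newMode)
}

func main() {
	fmt.Println("create v1:     ", validateCreate("v1"))
	fmt.Println("update v1->v1: ", validateUpdate("v1", "v1"))
	fmt.Println("update v1->v2: ", validateUpdate("v1", "v2"))
	fmt.Println("update v2->v1: ", validateUpdate("v2", "v1"))
}
```

A test that proves ratcheting would cover the same four cases: creation with `"v1"` fails, an update that leaves a stored `"v1"` untouched succeeds, moving from `"v1"` to `"v2"` succeeds, and moving back to `"v1"` fails.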

@openshift-ci-robot

openshift-ci-robot commented Feb 26, 2025

@sairameshv: This pull request references OCPNODE-2877 which is a valid jira issue.

In response to this:

  • Removing support to configure cgroupsv1 in the OCP clusters.
  • Added a CEL validation on the cgroupMode field of the nodes.config.openshift.io object to deny the setting of "v1"
  • Also added integration tests to validate the newly introduced CEL validation on the cgroupMode field

Enhancement Proposal Ref: openshift/enhancements#1751

Member Author

@sairameshv sairameshv left a comment

@JoelSpeed Could you PTAL at this PR?

@openshift-ci openshift-ci bot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 27, 2025
@openshift-ci openshift-ci bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 27, 2025
@sairameshv
Member Author

/retest

@haircommander
Member

/lgtm

thanks!

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Feb 28, 2025
Contributor

openshift-ci bot commented Feb 28, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: haircommander, sairameshv
Once this PR has been reviewed and has the lgtm label, please assign bparees for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sairameshv
Member Author

/retest

@sairameshv sairameshv force-pushed the deprecate_cgroupv1 branch from 611d390 to ef2493d Compare March 3, 2025 15:23
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 3, 2025
Contributor

openshift-ci bot commented Mar 3, 2025

New changes are detected. LGTM label has been removed.

- cgroups v1 support is removed in OCP 4.19, so requests that set the
  `nodes.config` object's `cgroupMode` field to `"v1"` are denied
- Added integration tests to validate the enum removal

Signed-off-by: Sai Ramesh Vanka <[email protected]>
@sairameshv sairameshv force-pushed the deprecate_cgroupv1 branch from ef2493d to daced88 Compare March 4, 2025 14:02
@JoelSpeed
Contributor

Changes LGTM, how do we know this is safe? Can you please explain in the PR description what has been done in 4.18 that makes this a safe change in 4.19?

@openshift-ci-robot

openshift-ci-robot commented Mar 4, 2025

@sairameshv: This pull request references OCPNODE-2877 which is a valid jira issue.

In response to this:

  • Removing support to configure cgroupsv1 in the OCP clusters.
  • Removed the enum validation of "v1" for the cgroupMode field of the nodes.config.openshift.io object.
  • Also added integration tests to validate the enum removal on the cgroupMode field

Enhancement Proposal Ref: openshift/enhancements#1751

@sairameshv
Member Author

> Changes LGTM, how do we know this is safe? Can you please explain in the PR description what has been done in 4.18 that makes this a safe change in 4.19?

As described in the Goals section of the enhancement proposal, the machine-config cluster operator's Upgradeable condition is set to False when a cluster is found to be on CgroupModeV1. We would also make the 4.18.z release containing this change the minimum version required before upgrading to 4.19.
I have updated the description with the above explanation as well.

@JoelSpeed
Contributor

> Also, we would make the 4.18.z cluster containing this change as a minimum cluster before upgrading to 4.19.

Which change do you mean? Is there something in 4.18 that already blocks upgrades if cgroups mode is v1, or is that still work to do?

@sairameshv
Member Author

> > Also, we would make the 4.18.z cluster containing this change as a minimum cluster before upgrading to 4.19.
>
> Which change do you mean? Is there something in 4.18 that already blocks upgrades if cgroups mode is v1, or is that still work to do?

The change still needs to be added

@haircommander
Member

yeah I think we should

/hold

on this until we have the Upgradeable=False condition in MCO and the upgrade edge defined in Cincinnati

@JoelSpeed
Contributor

> The change still needs to be added

Do this first. Once you have that logic in 4.18.z and set the minimum upgrade version in the upgrade graph, I'm happy to then merge this API PR to remove the value from the enum
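
The 4.18.z logic being asked for might look something like the sketch below: derive an `Upgradeable` condition from the configured cgroup mode and report `False` while the cluster is still on v1. This is only an illustration of the idea discussed in this thread, not the MCO's code. The `condition` type, the `upgradeableFor` function, and the reason strings are all assumptions made for the example.

```go
package main

import "fmt"

// condition is a simplified, hypothetical stand-in for a cluster operator
// status condition.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// upgradeableFor returns Upgradeable=False when cgroupMode is "v1" and
// Upgradeable=True otherwise. The reason strings are illustrative only.
func upgradeableFor(cgroupMode string) condition {
	if cgroupMode == "v1" {
		return condition{
			Type:    "Upgradeable",
			Status:  "False",
			Reason:  "CgroupModeV1",
			Message: "cgroupMode is v1; update nodes.config.openshift.io to v2 before upgrading",
		}
	}
	return condition{Type: "Upgradeable", Status: "True", Reason: "AsExpected"}
}

func main() {
	for _, mode := range []string{"", "v1", "v2"} {
		c := upgradeableFor(mode)
		fmt.Printf("cgroupMode=%q -> %s=%s (%s)\n", mode, c.Type, c.Status, c.Reason)
	}
}
```

With a gate like this in 4.18.z and a minimum-version edge in the upgrade graph, no cluster should reach 4.19 with `cgroupMode: v1` still set, which is the safety argument the reviewers are asking for.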

Contributor

openshift-ci bot commented Mar 4, 2025

@sairameshv: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-scos-e2e-aws-ovn | daced88 | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/verify-crd-schema | daced88 | link | true | /test verify-crd-schema |
| ci/prow/e2e-aws-ovn-hypershift | daced88 | link | true | /test e2e-aws-ovn-hypershift |
| ci/prow/e2e-azure | daced88 | link | false | /test e2e-azure |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants