NE-1946: ingress: Add gateway-api-crd-life-cycle-management #1740

Miciah · 2025-01-22T19:57:13Z

No description provided.

Miciah · 2025-01-22T19:57:20Z

openshift-ci · 2025-01-22T19:57:28Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from miciah. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

enhancements/ingress/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

shaneutt

Great foundation, thank you for putting this together Miciah.

@JoelSpeed @knobunc we would appreciate if you would review what's here so far, but note that our intention is to iterate on the TODOs and TBDs in follow up PRs as doing a few separate, smaller and purpose-focused PRs should help reviewing.

shaneutt · 2025-01-22T20:40:56Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+OpenShift's Ingress Operator monitors for the presence of the Gateway API CRDs.
+
+- If they are absent, it installs the appropriate version.
+- If they are present at the expected version, the operator does nothing.


If "do nothing" also implies we will use these CRDs I think we'll need to think on that more because we need to manage their lifecycle, and we would need to know who the owner is and have that consent. This however I can add in one of the follow up iterations, so a TODO here should be good:

Suggested change

- If they are present at the expected version, the operator does nothing.

- If they are present at the expected version: TODO - expand upon what we'll do for preexisting CRDs.

Are you alluding to transferring ownership with Server-Side Apply? Now that you mention it, I think the proposal should be rewritten in terms of who owns it rather than whether it is at the expected version. Should I go ahead and s/at the expected version/with the Ingress Operator as the owner/ now, with a to-do to fill in the SSA-related details?

Yes that sounds great 👍

I rewrote in terms of SSA and added a to-do in https://github.com/openshift/enhancements/compare/99aae59bb1ac50f1ef02e07ff3dcc3385216c451..e34441f0d0685989e6417db70dcf049c68f9fc55.

shaneutt · 2025-01-22T20:41:26Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+OpenShift release.  Specifically, the CRDs SHOULD be the version corresponding
+to the version of Istio in that OSSM version; they MUST be of some version that
+is compatible and that we have tested with that version of Istio.


Suggested change

OpenShift release. Specifically, the CRDs SHOULD be the version corresponding

to the version of Istio in that OSSM version; they MUST be of some version that

is compatible and that we have tested with that version of Istio.

OpenShift release. Specifically, the CRDs SHOULD be the version corresponding

to the version of Istio in that OSSM version; they MUST be of some version that

is compatible and that we have tested with that version of Istio (TODO: expand upon this)

Done in https://github.com/openshift/enhancements/compare/99aae59bb1ac50f1ef02e07ff3dcc3385216c451..e34441f0d0685989e6417db70dcf049c68f9fc55. Is that sufficient? If we need more detail, it might warrant a separate section for how we validate a particular CRD version.

shaneutt · 2025-01-22T20:45:29Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+
+Potential mitigation strategies:
+
+- Handle with an admission webhook.


(just a note that we put handling updates to dead fields being handled by a webhook as a non-goal above)

We have a dilemma when it comes to trying to handle dead fields with a webhook, because our validation would require us to ultimately reference other objects. Something we need to think through, but we can follow up on that as part of the TBD.

I added a cross-reference to the "Admission webhook" section under "Alternatives", and I added a paragraph about consistency issues to that section. I left the TBD to fill in more details. These changes are in https://github.com/openshift/enhancements/compare/99aae59bb1ac50f1ef02e07ff3dcc3385216c451..e34441f0d0685989e6417db70dcf049c68f9fc55.

shaneutt · 2025-01-22T20:48:34Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+Note that Gateway API CRD life-cycle management will be part of the
+[gateway-api-with-cluster-ingress-operator](gateway-api-with-cluster-ingress-operator.md) enhancement, which defines its
+own graduation criteria.


We had a conversation before about potentially packaging the CRDs in at the CVO level, I believe the main problem we had with that was limitations to how we can handle conflict resolution. We should capture somewhere in this doc (alternatives considered, etc) the reasons why we moved away from that for posterity.

I added a heading in https://github.com/openshift/enhancements/compare/99aae59bb1ac50f1ef02e07ff3dcc3385216c451..e34441f0d0685989e6417db70dcf049c68f9fc55. I don't actually understand how using CVO would help, so I haven't written anything out yet. I can go back through Jira comments and meeting notes to try to figure that out, or else feel free to write something up about that alternative.

JoelSpeed · 2025-01-23T13:47:40Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+As a cluster-admin, I want to upgrade my cluster from OpenShift 4.18 (which
+*doesn't* manage the Gateway API CRDs' life-cycle) to OpenShift 4.19 (which
+*does*).


Probably want a case (perhaps this one?) where the story includes having a third party implementation of gateway API already installed in 4.18. This story at the moment sounds perhaps incomplete as I don't see the impact unless there is a gateway API installed on the 4.18 side?

Yeah, I figured the user story could describe two cases: CRDs were not installed on the 4.18 cluster, and they were. I can split it into two user stories if that makes more sense.

Yeah I think it might make sense to have one explicitly call out that there were Gateway API CRDs installed in 4.18

JoelSpeed · 2025-01-23T17:11:55Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+The Ingress Operator's warning will take the form of, at a minimum, setting the
+`Degraded` status condition's status to `True` with a descriptive message on the
+ingress clusteroperator.  


At some point we will need to explain how an admin gets out of this, TODO delete this comment if I find that later

I figure it can be documented under "Workflow Description" or "Support Procedures". I can add a cross-reference here.

That would be helpful, thank you

JoelSpeed · 2025-01-23T17:16:12Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+In this situation, the cluster-admin is expected to verify that workload would
+not be broken by handing life-cycle management of the CRDs over to the Ingress
+Operator:


If they cannot verify this, or I guess, they verify it and determine their workload would be broken, we need to explain that path too

Right, we will probably need to describe how workload could be broken (e.g. third-party controller fails to start because it needs newer CRDs), how to predict or detect it, and how to handle it (even if it comes down to, "Work with Red Hat and the third-party vendor to align on a common CRD version.").

I suspect in reality the issue is that either you end up with dead fields, or the controller starts stripping fields it doesn't understand, or it starts trying to default/set a field that doesn't exist within the current CRD, and then breaks/continuously reconciles because the value isn't ever set, anything else? Explaining these potential issues is probably useful for the tech writer who will have to write the release notes 😅

JoelSpeed · 2025-01-23T17:16:42Z

enhancements/ingress/gateway-api-crd-life-cycle-management.md

+_TBD: Should we describe how to turn off the Ingress Operator so that the
+cluster-admin can override the CRDs, or describe how Server-Side Apply enables
+the cluster-admin to take over the CRDs?_


Sounds like you might need what we are doing for CAPI in this space?

I'm hoping we don't need to implement an explicit override [for GA]. Even if the alternative is something a little complicated (involving SSA) or heavy-handed (such as using a CVO override and scaling down the whole operator), we just need something good enough for now and can defer doing something better.

* enhancements/ingress/gateway-api-crd-life-cycle-management.md: New file. * enhancements/ingress/gateway-api-with-cluster-ingress-operator.md: Add cross-reference to the new file.

openshift-ci · 2025-01-27T14:51:05Z

@Miciah: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot · 2025-01-29T15:49:05Z

@Miciah: This pull request references NE-1946 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.19.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Miciah · 2025-02-14T19:26:04Z

/close
@shaneutt will open a new PR to continue the EP.

openshift-ci · 2025-02-14T19:26:17Z

@Miciah: Closed this PR.

In response to this:

/close
@shaneutt will open a new PR to continue the EP.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

candita · 2025-02-26T23:17:25Z

Continued in
#1756

openshift-ci bot assigned shaneutt Jan 22, 2025

openshift-ci bot requested review from miheer and rfredette January 22, 2025 19:57

Miciah marked this pull request as draft January 22, 2025 20:20

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 22, 2025

shaneutt reviewed Jan 22, 2025

View reviewed changes

Miciah force-pushed the ingress-add-gateway-api-crd-life-cycle-management-enhancement branch from 99aae59 to e34441f Compare January 23, 2025 17:10

JoelSpeed reviewed Jan 23, 2025

View reviewed changes

Miciah force-pushed the ingress-add-gateway-api-crd-life-cycle-management-enhancement branch from e34441f to 42d946a Compare January 24, 2025 20:40

Miciah marked this pull request as ready for review January 24, 2025 20:40

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 24, 2025

openshift-ci bot requested a review from candita January 24, 2025 20:41

Miciah force-pushed the ingress-add-gateway-api-crd-life-cycle-management-enhancement branch from 42d946a to 1ec9f49 Compare January 27, 2025 14:25

ingress: Add gateway-api-crd-life-cycle-management

a5f5545

* enhancements/ingress/gateway-api-crd-life-cycle-management.md: New file. * enhancements/ingress/gateway-api-with-cluster-ingress-operator.md: Add cross-reference to the new file.

Miciah force-pushed the ingress-add-gateway-api-crd-life-cycle-management-enhancement branch from 1ec9f49 to a5f5545 Compare January 27, 2025 14:26

Miciah changed the title ~~ingress: Add gateway-api-crd-life-cycle-management~~ NE-1946: ingress: Add gateway-api-crd-life-cycle-management Jan 29, 2025

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jan 29, 2025

openshift-ci bot closed this Feb 14, 2025

candita mentioned this pull request Feb 26, 2025

NE-1946: ingress: Add CRD Lifecycle Management for Gateway API #1756

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NE-1946: ingress: Add gateway-api-crd-life-cycle-management #1740

NE-1946: ingress: Add gateway-api-crd-life-cycle-management #1740

Miciah commented Jan 22, 2025

Miciah commented Jan 22, 2025

openshift-ci bot commented Jan 22, 2025

shaneutt left a comment •

edited

Loading

shaneutt Jan 22, 2025

Miciah Jan 23, 2025

shaneutt Jan 23, 2025

Miciah Jan 23, 2025

shaneutt Jan 22, 2025

Miciah Jan 23, 2025

shaneutt Jan 22, 2025 •

edited

Loading

Miciah Jan 23, 2025

shaneutt Jan 22, 2025 •

edited

Loading

Miciah Jan 23, 2025

JoelSpeed Jan 23, 2025

Miciah Jan 23, 2025

JoelSpeed Jan 29, 2025

JoelSpeed Jan 23, 2025

Miciah Jan 23, 2025

JoelSpeed Jan 29, 2025

JoelSpeed Jan 23, 2025

Miciah Jan 23, 2025

JoelSpeed Jan 29, 2025

JoelSpeed Jan 23, 2025

Miciah Jan 23, 2025 •

edited

Loading

openshift-ci bot commented Jan 27, 2025

openshift-ci-robot commented Jan 29, 2025 •

edited by openshift-ci bot

Loading

Miciah commented Feb 14, 2025

openshift-ci bot commented Feb 14, 2025

candita commented Feb 26, 2025 •

edited

Loading

	- If they are present at the expected version, the operator does nothing.
	- If they are present at the expected version: TODO - expand upon what we'll do for preexisting CRDs.


		Potential mitigation strategies:

		- Handle with an admission webhook.

NE-1946: ingress: Add gateway-api-crd-life-cycle-management #1740

NE-1946: ingress: Add gateway-api-crd-life-cycle-management #1740

Conversation

Miciah commented Jan 22, 2025

Miciah commented Jan 22, 2025

openshift-ci bot commented Jan 22, 2025

shaneutt left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shaneutt Jan 22, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shaneutt Jan 22, 2025 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Miciah Jan 23, 2025 • edited Loading

Choose a reason for hiding this comment

openshift-ci bot commented Jan 27, 2025

openshift-ci-robot commented Jan 29, 2025 • edited by openshift-ci bot Loading

Miciah commented Feb 14, 2025

openshift-ci bot commented Feb 14, 2025

candita commented Feb 26, 2025 • edited Loading

shaneutt left a comment •

edited

Loading

shaneutt Jan 22, 2025 •

edited

Loading

shaneutt Jan 22, 2025 •

edited

Loading

Miciah Jan 23, 2025 •

edited

Loading

openshift-ci-robot commented Jan 29, 2025 •

edited by openshift-ci bot

Loading

candita commented Feb 26, 2025 •

edited

Loading