-
Notifications
You must be signed in to change notification settings - Fork 489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NE-1946: ingress: Add CRD Lifecycle Management for Gateway API #1756
base: master
Are you sure you want to change the base?
NE-1946: ingress: Add CRD Lifecycle Management for Gateway API #1756
Conversation
@shaneutt: This pull request references NE-1946 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.19.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/cc @knobunc @JoelSpeed @dgn |
@shaneutt: This pull request references NE-1946 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.19.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
7af99b6
to
a4db666
Compare
50fcb98
to
7fa4c25
Compare
e66cab0
to
8db12e8
Compare
* enhancements/ingress/gateway-api-crd-life-cycle-management.md: New file. * enhancements/ingress/gateway-api-with-cluster-ingress-operator.md: Add cross-reference to the new file.
Replaces #1740 |
/assign |
/assign @alebedev87 @alebedev87 thank you for agreeing to lead the review and be the final approver. |
Signed-off-by: Shane Utt <[email protected]>
8db12e8
to
96072c1
Compare
/approve It LGTM too, but I'll let Joel apply that. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: knobunc The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
||
* no other actors managing the CRDs from here on out | ||
* only standard CRDs can be deployed (no experimental) | ||
* the CRDs to be deployed at an exact version/schema |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to discuss one point with regard to the ensuring of the exact version/schema. NE-1952 suggests the usage of crd-schema-checker which is a good tool but 1) it may be not trivial to be used as a module (not binary), 2) it cares about some details we may be indifferent about. After the upgrade (to 4.19) will be done, the gatewayapi-controller will override the contents of 4 GWAPI CRDs by doing a simple go-cmp/cmp.Equal() to compare actual vs desired. So, all unwanted versions will be gone from the CRDs. I'm thinking that the same simple comparison may be enough to detect a potential clash in 4.18.z.
I'm wondering what level of details we were expecting to get from crd-schema-checker and whether we really need them. The desired state of the CRDs is stored in the operator's code so we know the GWAPI bundle version which we can report to the user as "expected" (without providing detailed description of what will change).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @JoelSpeed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it may be not trivial to be used as a module (not binary)
I have experience of using it as a module, it's actually fairly straightforward
it cares about some details we may be indifferent about.
This is configurable, but, there's also changes that could be made that it won't yet catch, it's not feature complete. There are various checks that @everettraven has implemented in his own crd-diff project that you would want (thinks like limit changes).
We are working to try and create an upstream project combining both of these and focusing specifically on the changes that would break APIs, relying on that, if we get there in time would likely be better for the long run
doing a simple go-cmp/cmp.Equal() to compare actual vs desired.
This seems like a reasonable approach, making sure they exactly match is a requirement I could see you imposing over time. The OpenAPI schema for a version is considered as an atomic unit IIRC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it cares about some details we may be indifferent about.
This is configurable, but, there's also changes that could be made that it won't yet catch, it's not feature complete. There are various checks that @everettraven has implemented in his own crd-diff project that you would want (thinks like limit changes).
I see, so basically both tools (crd-shema-checker and crd-diff) seem to target editors of CRDs to help them to not commit harmful changes. They can be used in a verify job in openshift/api
repostiory, for instance.
I think our goal in NE-1952 is sligthly different. We want to know whether GWAPI CRDs currently installed in the cluster match the desired ones. Will the desired version be incompatible or not? Do we really care about this? (I tend to say - no).
We are working to try and create an upstream project combining both of these and focusing specifically on the changes that would break APIs, relying on that, if we get there in time would likely be better for the long run
Does this mean that you would not consider crd-schema-checker as production ready?
The OpenAPI schema for a version is considered as an atomic unit IIRC
I'm not sure I managed to link this to our case. Will this somehow impact our implementation (on 4.18.z we will warn that OCP takes over GWAPI, on 4.19 we will update all GWAPI CRDs to desired definitions)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that you would not consider crd-schema-checker as production ready?
At the moment, I think it has some rough edges and it's probably not what you'd want to use here. We have some work to do in the upstream version (when we get there) to get it to a point where I'd comfortably say integrate it in a wider context (eg there's not get out of jail free card right now)
I'm not sure I managed to link this to our case.
I believe if someone updates any part of the schema, they update the whole schema as far as the API server is concerned. So it makes it hard for additions to be made, unless they are exclusively additive.
A DeepDerivative check would probably suffice for you IIUC, but DeepEqual would be more precise. It depends if you want to allow a schema that is slightly further ahead or not
resources via zstream (and major) releases to add new features and capabilities. | ||
|
||
### Goals | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably the first goal should be:
Establish the owner of Gateway API CRDs as OpenShift Cluster Ingress Operator.
Another goal should be:
Prevent upgrades on OpenShift Clusters that have non-matching Gateway API CRDs already installed.
check. As an extra precaution for users forcing upgrades precipitously, we will | ||
provide an admin gate to both provide an extra warning and gather consent from | ||
the cluster admin to take over the management of the CRDs. This will be | ||
accompanied by detailed documentation on what the admin should check and how to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use the standard "unsupported config override" for exceptions, we don't need to provide any documentation.
> **Note**: Even if they force through, we will take over the CRDs once the | ||
> upgrade completes (or go `Degraded` if that fails for some reason). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If they acknowledge that they are going into "unsupported config override" mode, then we can't and shouldn't take it over.
> What are some important details that didn't come across above in the | ||
> **Proposal**? Go in to as much detail as necessary here. This might be | ||
> a good place to talk about core concepts and how they relate. While it is useful | ||
> to go into the details of the code changes required, it is not necessary to show | ||
> how the code will be rewritten in the enhancement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just add a TBD here for now.
|
||
### Removing a deprecated feature | ||
|
||
N/A. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth mentioning we need to remove the code that currently manages CRDS in Cluster Ingress Operator.
> a good place to talk about core concepts and how they relate. While it is useful | ||
> to go into the details of the code changes required, it is not necessary to show | ||
> how the code will be rewritten in the enhancement. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need some mention of where the CRDs will be stored. Will they be stored in openshift/api or somewhere else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think openshift/api is only for *.openshift.io
APIs.
If CIO is managing the CRDs, presumably they should be part of CIO?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If these are upstream CRDs, then generally they are copied into the relevant operator's repository to be injected into the payload directly, or through operator installation conditionally
Previously, the "GatewayAPI" feature gate managed both the GatewayAPI CRDs and the Gateway Controller. However, with the introduction of Gateway CRD lifecycle management (openshift/enhancements#1756), these responsibilities were separated. A dedicated feature gate now controls the Gateway Controller to distinguish its production readiness from that of the CRDs.
any interference from the first-party implementation. Relatedly, I want to be | ||
able to utilize both the first-party and any third-party solution alongside | ||
each other simultaneously and independently without any interference between the | ||
two. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Goals don't seem to enable this story? Especially, there doesn't seem to be any way for a third-party implementation to protect itself from unknown/dead fields when CIO decides to update the CRDs.
Co-authored-by: Candace Holman <[email protected]> Co-authored-by: Miciah Dashiel Butler Masters <[email protected]>
@shaneutt: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This proposes the mechanisms by which we will deliver Gateway API as a "core-like" API on the platform going forward.
This covers all the work being done in the NE-1898 epic and is a hard requirement for delivering Gateway API support in 4.19.