GEP-713 enhancements #3609
Cons:
#### Target object status
Can we give some examples of this? I think I understand it, but I'm not certain.
Added a simplified example, plus an extension of it for the case including `sectionName`.
Please let me know if that works or if you expected to see a full YAML.
A full yaml would be nice
geps/gep-713/index.md
Outdated
going to affect their object, at apply time, which helps a lot with discoverability.
* **Accepted**: the meta resource passed both syntactic validation by the API server and semantic validation enforced by the controller, such as whether the target objects exist.
* **Enforced**: the meta resource’s spec is guaranteed to be fully enforced, to the extent of what the controller can ensure.
* **Partially enforced**: parts of the meta resource’s spec is guaranteed to be enforced, while other parts are known to have been superseded by other specs, to the extent of what the controller can ensure. The status should include details highlighting which parts of the meta resource are enforced and which parts have been superseded, with the references to all other related meta resources.
As long as this is not a MUST then it's not a problem, but this seems like it could be quite onerous to compute. For example, imagine I have a global policy and then 1000 namespaces, any of which could partially conflict. It's not great to have to 'bubble up' these to the parent.
A concrete example of this in existing Gateway API is `attachedRoutes`, which is similarly complex for implementations to compute (efficiently).
geps/gep-713/index.md
Outdated
## Background and concepts
The merge strategies typically include strategies for dealing with conflicting and/or missing specs, such as for applying default and/or override values on the target resources.
I think it's important to note that sometimes the merge strategy may be specified in the design of the object (that is, it's a defaults policy or something), rather than in a field?
In fact, I tend to think that, if the merge strategy is listed in a field, it should be in the `status`, not the `spec`, since it's relevant info for users of the Policy more than implementers (who will build the merge strategy into code when handling the Policy anyway).
+1, this feels like something that belongs in status.
+1 that merge strategy may be defined in the metaresource, e.g. the API contract is either `only one is allowed per target`, or `multiple are allowed`.
I feel confused about the merge strategy in the `status` instead of the `spec`.
Are we talking about the metaresource's `status` and `spec`? Or the target's?
The merge strategy, if more than one is supported by the metaresource kind, is a choice of the user that declares an instance of the metaresource. How can it be in the status?
The user literally specifies what merge strategy to use when merging that instance of the metaresource. It should be in the `spec`, no?
Added a few lines about merge strategy as a user choice or not, and reflected in the status stanza of the metaresource.
geps/gep-713/index.md
Outdated
**Ana**: _What the hell just happened??_
If multiple meta resources target the same context, this is considered to be a conflict.
We need to define "context" again here, I think. (I'd forgotten the definition by the time I got to this part).
- If multiple meta resources target the same context, this is considered to be a conflict.
+ If multiple meta resources target the same context (that is, multiple instances of the same or similar policies acting on the same hierarchy have an effective target of the same object), this is considered to be a conflict.
"same", yes; "similar", not a good idea IMO. I think the behavior for different kinds of policies should be undefined.
I'm okay with removing "or similar", but I think that if we're going to leave this as undefined in some cases, we need to be specific in the ones where we do need to have opinions:
- For Gateway API Policy objects included in the specification, in the case of intent conflict with some other Policy on Gateway API objects, the Gateway API Policy must take precedence.
- For implementation specific Policy objects that affect the same properties across multiple implementations, it's up to the implementations to define behavior. If they don't then the behavior is, necessarily, undefined and could produce differing outcomes depending on unknown factors.
In other words, this is a terrible idea and users should try not to use multiple Policy objects that affect the same things.
geps/gep-713/index.md
Outdated
**Chihiro**: _At a guess, all the workloads in the `baker` namespace actually
fail a lot, but they seem OK because there are retries across the whole
namespace?_ 🤔
Conflicts must be resolved by applying a defined *merge strategy* (see further definition in the next section), where the meta resource considered higher between two conflicting specs dictates the merge strategy according to which the conflict must be resolved, defaulting to the lower spec (more specific) beating the higher one if not specified otherwise.
- Conflicts must be resolved by applying a defined *merge strategy* (see further definition in the next section), where the meta resource considered higher between two conflicting specs dictates the merge strategy according to which the conflict must be resolved, defaulting to the lower spec (more specific) beating the higher one if not specified otherwise.
+ Conflicts must be resolved by applying a defined *merge strategy* (see further definition in the next section).
+ When resolving conflicts, the meta resource higher in the relevant hierarchy dictates the merge strategy - that is, merge strategy conflict resolution works on a least-specific-wins basis. After that the merge strategy's conflict resolution rules apply.
+ If no merge strategy is specified, then implementations should use more-specific-wins merge strategy by default.
I think this is what you meant here @guicassolato?
Yes, but I happen to find the suggested text more confusing than the original.
"least-specific-wins" and "more-specific-wins" have different subjects in the sentences, and therefore I would phrase it differently to avoid confusion.
A merge strategy is a function that takes as input 2 specs and outputs 1.
One thing is determining the merge strategy. When resolving a conflict posed by 2 metaresources, the least specific metaresource among the two dictates the merge strategy that will be used to solve the conflict, i.e. the function that will take both metaresource specs as input. It's always the least specific metaresource that determines it.
The determined merge strategy can be a merge strategy that resolves to "least-specific-wins" or "more-specific-wins" (and occasionally to things more sophisticated than that, like actual merges).
If the least specific metaresource does not specify a merge strategy, then the merge strategy used to resolve the conflict is "more-specific-wins".
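The two steps described above (first determine the strategy, then apply it) could be sketched roughly as follows. This is only an illustration under assumptions: `Policy`, `resolve`, the `merge_strategy` attribute, and the strategy names used as dictionary keys are hypothetical and are not defined by the GEP.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Spec = Dict[str, object]
# A merge strategy is a function: (less specific spec, more specific spec) -> one spec.
MergeStrategy = Callable[[Spec, Spec], Spec]


def more_specific_wins(less_specific: Spec, more_specific: Spec) -> Spec:
    return dict(more_specific)


def least_specific_wins(less_specific: Spec, more_specific: Spec) -> Spec:
    return dict(less_specific)


# Hypothetical registry; the keys are placeholders, not GEP-defined values.
STRATEGIES: Dict[str, MergeStrategy] = {
    "more-specific-wins": more_specific_wins,
    "least-specific-wins": least_specific_wins,
}


@dataclass
class Policy:
    spec: Spec
    merge_strategy: Optional[str] = None  # None means "not specified"


def resolve(less_specific: Policy, more_specific: Policy) -> Spec:
    # Step 1: the least specific policy alone determines the strategy;
    # the more specific policy's merge_strategy is never consulted.
    name = less_specific.merge_strategy or "more-specific-wins"
    # Step 2: the chosen function receives both specs and returns one.
    return STRATEGIES[name](less_specific.spec, more_specific.spec)
```

Keeping the two steps separate is what avoids the ambiguity in the wording: "who picks the function" is always the least specific policy, while "who wins" depends on the function that was picked.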
Rephrased a bit to break down as suggested but trying to avoid overloading terminology.
Signed-off-by: Guilherme Cassolato <[email protected]>
… semantics rephrased for improved readability Signed-off-by: Guilherme Cassolato <[email protected]>
…'Conflict resolution rules' subsections Signed-off-by: Guilherme Cassolato <[email protected]>
…ted concepts Signed-off-by: Guilherme Cassolato <[email protected]>
…/etc in the spec Signed-off-by: Guilherme Cassolato <[email protected]>
Ref.: kubernetes-sigs#3609 (comment) Signed-off-by: Guilherme Cassolato <[email protected]>
1. Define names and mechanisms for possible merge strategies (so both what e.g. “atomic default” means, but also that “atomic default” is the correct name for that strategy)
2. Define a status mechanism by which the strategy SHOULD be reported, and that a conformant implementation MUST use the names defined in 1 to report strategy.
3. Define what merge strategy is preferred for `defaults`, and define that implementations using the defaults clause SHOULD use that strategy.
4. Define what merge strategy is preferred for `overrides`, and define that implementations using the overrides clause SHOULD use that strategy.
5. Acknowledge that implementations MAY support other strategies, or selecting strategies at runtime, but that those are implementation-specific behaviors.
Signed-off-by: Guilherme Cassolato <[email protected]>
…lementations Signed-off-by: Guilherme Cassolato <[email protected]>
def4f23 to 7c9a6a5
Even though it appears (discreetly) in the template, I believe
- _Where_ it’s applied
- _What_ the resultant policy is saying
In other words:
- When the Policy CRD allows specifying the merge strategy at individual CRs, then `established ⇒ 𝑓`.
This section is confusing. L477 is concise enough, so this can be removed to avoid confusion.
The best outcome is that Ana needs to look only at a specific route to know what
Policy settings are being applied to that Route, and where they come from.
However, some of the other problems below make it very difficult to achieve this.
For example, if two policies are attached at different levels of the hierarchy, e.g. `Gateway` and `HTTPRoute`, by application of the [Conflict resolution rules](#conflict-resolution-rules), the policy attached to the `Gateway` (higher, less specific level) will be considered the _established_ spec, whereas the policy attached to the `HTTPRoute` (lower, more specific level) will be considered the _challenger_ spec. By applying the **Atomic defaults** merge strategy, the effective policy is set equal to the spec proper of the policy attached to the `HTTPRoute`, and the policy attached to the `Gateway` MUST NOT be enforced in the scope of the `HTTPRoute` augmented by the effective policy (although occasionally it might be in the scope of other effective targets, i.e., other HTTPRoutes).
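A minimal sketch of the Gateway/HTTPRoute case, assuming that Atomic defaults means the challenger spec replaces the established spec wholesale, with no field-by-field merging. The function names, route names, and spec fields are hypothetical.

```python
from typing import Dict, Optional

Spec = Dict[str, object]


def atomic_defaults(established: Spec, challenger: Optional[Spec]) -> Spec:
    # Atomic: if a challenger exists it replaces the established spec
    # entirely; otherwise the established spec applies unchanged.
    return dict(challenger) if challenger is not None else dict(established)


def effective_policies(gateway_spec: Spec,
                       route_specs: Dict[str, Optional[Spec]]) -> Dict[str, Spec]:
    # One effective policy per HTTPRoute (the effective target).
    return {route: atomic_defaults(gateway_spec, spec)
            for route, spec in route_specs.items()}


effective = effective_policies(
    {"timeout": "10s", "retries": 3},   # policy attached to the Gateway
    {
        "route-a": {"timeout": "2s"},   # route with its own policy
        "route-b": None,                # route without a policy of its own
    },
)
```

Here `route-a` ends up with only its own policy's spec (no `retries` is inherited from the Gateway policy), while the Gateway policy is still the one enforced for `route-b`.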
How does this impact policies at the same level in the hierarchy? The defaults/overrides model does not gel well when the Creation Timestamp is used to determine the policy precedence. I will argue that Defaults/Overrides are only relevant to determine policy precedence for policies at different levels in the config hierarchy. If you agree, the GEP should explicitly state so.
If you disagree, I would like to understand the relevance of default/override within the same hierarchy.
- The definition of a `strategy` field in the `spec` stanza of the Policy, or equivalently a `mergeType` field.
- The definition of `defaults` and/or `overrides` fields in the `spec` stanza of the policy wrapping the "spec proper" fields.
Please add examples for these here
details of an _arbitrarily defined_ object, that needs to be included in the base
API.
Two known patterns adopted by Policy implementations that support specifying one of multiple merge strategies in the Policy CRs are:
- The definition of a `strategy` field in the `spec` stanza of the Policy, or equivalently a `mergeType` field.
- - The definition of a `strategy` field in the `spec` stanza of the Policy, or equivalently a `mergeType` field.
+ - The definition of a `mergeStrategy` field in the `spec` stanza of the Policy, or equivalently a `mergeType` field.
How does the Cluster Admin know what Policy is applied where, and what the content
of that Policy is?
## End-to-end examples
This is still missing, and would be a lot easier to make sense than the verbose graphs and text
For objects that do not have a `status.Conditions` field available (`Secret` is a good example), that object SHOULD instead have an annotation of `colors.controller.k8s.io/ColorPolicyAffected: true` added.
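As a rough sketch of that fallback, a controller could branch on whether the target kind exposes conditions. Everything besides the annotation key is an assumption for illustration: the `mark_affected` helper, the `supports_conditions` flag, and the condition `type`, `status`, and `reason` values are placeholders, and objects are modeled as plain dictionaries.

```python
from typing import Any, Dict

ANNOTATION = "colors.controller.k8s.io/ColorPolicyAffected"


def mark_affected(obj: Dict[str, Any], supports_conditions: bool) -> Dict[str, Any]:
    """Mark a target object (a plain dict manifest) as affected by a ColorPolicy."""
    if supports_conditions:
        # Assumption: the condition type reuses the annotation key; the
        # status and reason values below are placeholders.
        conditions = obj.setdefault("status", {}).setdefault("conditions", [])
        if not any(c.get("type") == ANNOTATION for c in conditions):
            conditions.append(
                {"type": ANNOTATION, "status": "True", "reason": "PolicyAffected"}
            )
    else:
        # Kinds such as Secret have no status.conditions. Annotation values
        # are strings in Kubernetes, hence "true" rather than a boolean.
        annotations = obj.setdefault("metadata", {}).setdefault("annotations", {})
        annotations[ANNOTATION] = "true"
    return obj


secret = mark_affected({"kind": "Secret", "metadata": {"name": "tls"}}, False)
route = mark_affected({"kind": "HTTPRoute", "metadata": {"name": "r"}}, True)
```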

#### Status needs to be namespaced by implementation
Is this the Target object's status you are referring to?
Please add yaml examples in all status related sections.
In Gateway API's Route Parent status, `parentRef` plus the controller name have been used for this.

For a policy, something similar can be done, namespacing by the reference to the implementation's controller name.
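One way to picture namespacing by controller name is a list of status entries in which each controller only ever writes the entry carrying its own name. The sketch below is an assumption-laden illustration: the `controllerName` key mirrors the field used in Gateway API's route parent status, while the helper, the list layout, and the controller names are hypothetical.

```python
from typing import Any, Dict, List


def upsert_controller_status(entries: List[Dict[str, Any]],
                             controller_name: str,
                             conditions: List[Dict[str, str]]) -> List[Dict[str, Any]]:
    """Set the conditions owned by one controller without touching others'."""
    for entry in entries:
        if entry["controllerName"] == controller_name:
            entry["conditions"] = conditions
            return entries
    # No entry for this controller yet: add one.
    entries.append({"controllerName": controller_name, "conditions": conditions})
    return entries


status: List[Dict[str, Any]] = []
upsert_controller_status(status, "example.com/controller-a",
                         [{"type": "Accepted", "status": "True"}])
upsert_controller_status(status, "example.com/controller-b",
                         [{"type": "Accepted", "status": "False"}])
# A later update from controller-a replaces only its own entry.
upsert_controller_status(status, "example.com/controller-a",
                         [{"type": "Enforced", "status": "True"}])
```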
This needs to be explicit and not leave room for ambiguity. Standardizing on a policy attachment Status API would be extremely beneficial to implementations and users.
#### Creating common data representation patterns

Defining a _common_ pattern for including the details of an _arbitrarily defined_ object, to be included in a library for all possible implementations, is challenging, to say the least.
What details are you referring to? Please include examples.
Gateway API defines two kinds of Direct policies, both for augmenting the behavior of Kubernetes `Service` resources:

| Policy kind | Description | Target kinds | Merge strategies | Policy class |
We should standardize the Status API for existing policies. A Status API column is required to ensure these APIs are actually compliant with the entirety of this proposal.
| **ObservabilityPolicy** | Configure connection behavior between client and NGINX. | HTTPRoute, GRPCRoute | None | Direct |
| **UpstreamSettingsPolicy** | Configure connection behavior between NGINX and backend. | Service | None | Direct |

#### Gloo Gateway
Please also include https://github.com/kgateway-dev/kgateway, which is the next gen version of Gloo
The basic status conditions are:

* **Accepted**: the policy passed both syntactic validation by the API server and semantic validation enforced by the controller, such as whether the target objects exist.
Given that a policy could affect targeted resources differently, is the status going to be partitioned by AncestorRef similar to the existing PolicyStatus API, or be an aggregate as a top level []Conditions field?
What type of PR is this?
/kind gep
What this PR does / why we need it:
Rewriting of GEP-713 (Memorandum) to clarify concepts and incorporate enhancements discussed at #2927.
Which issue(s) this PR fixes:
Related to #713
Does this PR introduce a user-facing change?: