Skip to content

TLSRoute: Require hostnames #3872

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

rostislavbobo
Copy link

@rostislavbobo rostislavbobo commented Jun 20, 2025

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR makes the hostnames field in TLSRoute non-optional. Hostnames is the only available match for TLSRoutes and, per the TLSRoute spec, at least one hostname match is required. If SNI hostname matching is not needed, TCPRoute should be used instead.

Which issue(s) this PR fixes:

Fixes #3871

Does this PR introduce a user-facing change?:

- Adds v1alpha3 experimental TLSRoute with the hostnames field being required.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 20, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rostislavbobo
Once this PR has been reviewed and has the lgtm label, please assign mlavacca for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Copy link
Contributor

Welcome @rostislavbobo!

It looks like this is your first PR to kubernetes-sigs/gateway-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/gateway-api has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link
Contributor

Hi @rostislavbobo. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 20, 2025
@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch from 8a842f0 to 5630d40 Compare June 27, 2025 12:43
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 27, 2025
@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch from 75ab19e to 563794e Compare June 27, 2025 13:01
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 27, 2025
@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch 4 times, most recently from 2cc1661 to bf124db Compare June 27, 2025 14:04
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jun 27, 2025
@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch from bf124db to 1c19f20 Compare June 27, 2025 14:53
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 27, 2025
@rostislavbobo rostislavbobo marked this pull request as ready for review June 27, 2025 16:47
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 27, 2025
@robscott
Copy link
Member

Thanks @rostislavbobo!

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 27, 2025
@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch 2 times, most recently from 1b1ea5d to 3051c77 Compare June 27, 2025 16:58
@youngnick
Copy link
Contributor

I think this is a good idea, but unfortunately, I think we need to add the new version, and move the v1alpha2 source files to mostly be aliases to the new objects (except for this change).

That will have the generator include both v1alpha2 and v1alpha3 into the generated YAML files.

Alternatively, we will need to very clearly document that folks must ensure that the v1alpha2 CRD objects are not removed from the cluster before the actual TLSRoute objects are migrated to v1alpha3, or all the TLSRoute objects will disappear.

@youngnick
Copy link
Contributor

This is because I know that people are using TLSRoute in production. I know they shouldn't be, but they definitely are, and we will screw them over very hard if we are not careful.

@shaneutt
Copy link
Member

I think this is a good idea, but unfortunately, I think we need to add the new version, and move the v1alpha2 source files to mostly be aliases to the new objects (except for this change).
That will have the generator include both v1alpha2 and v1alpha3 into the generated YAML files.

+1

Alternatively, we will need to very clearly document that folks must ensure that the v1alpha2 CRD objects are not removed from the cluster before the actual TLSRoute objects are migrated to v1alpha3, or all the TLSRoute objects will disappear.
This is because I know that people are using TLSRoute in production. I know they shouldn't be, but they definitely are, and we will screw them over very hard if we are not careful.

+1

I agree: as long as the costs are reasonably low, we should at least try to avoid breaking people on the API today.

Normally alpha APIs are fair game for complete disruption at any time, but TLSRoute, UDPRoute and TCPRoute have been in a unique "perma-alpha" for such a long time that I think if it's not a big lift, trying to make this easier for folks is fair.

/cc @Rycieos

@kubernetes-sigs kubernetes-sigs deleted a comment from k8s-ci-robot Jun 30, 2025
@Rycieos
Copy link

Rycieos commented Jun 30, 2025

I know that people are using TLSRoute in production. I know they shouldn't be, but they definitely are

😆 yeah that's me. In my defense, I started using TLSRoutes in non-prod environments 2.5 years ago, hoping that it would graduate to GA along with the rest of Gateway API, at which point I would move it into prod. Obviously that didn't happen, and I recognize that taking that risk is on me.

I appreciate the work done to make such migrations painless, but I definitely understand that this part of the API is still in alpha and that disruptions are expected. I can't ask for more than keeping me in the loop, which the team has done great with so far.

@rostislavbobo
Copy link
Author

I agree: as long as the costs are reasonably low, we should at least try to avoid breaking people on the API today.

Of course, @shaneutt , @youngnick , that's why we're not introducing this breaking change in v1alpha2 and are bumping the version to v1alpha3 instead.

unfortunately, I think we need to add the new version, and move the v1alpha2 source files to mostly be aliases to the new objects (except for this change).
This is because I know that people are using TLSRoute in production.

I hear you, makes sense. Let me alias TLSRoutes v1alpha2.

@rostislavbobo
Copy link
Author

I started using TLSRoutes in non-prod environments 2.5 years ago, hoping that it would graduate to GA along with the rest of Gateway API

This year I'm revamping TLSRoutes and hope this v1alpha3 finally sets it up for GA 🙂
When we're ready, I'll start coordinating this within the community and will keep you in the loop.

@Rycieos , let me know if there's anything else we should include.

@Rycieos
Copy link

Rycieos commented Jun 30, 2025

@Rycieos , let me know if there's anything else we should include.

I have brought up #2111 a lot already, but I think it bears repeating. As @robscott pointed out in the original Slack thread for this PR (which I will quote here since Slack is dying soon),

If I'm understanding correctly, this doesn't require a change to TLSRoute API itself, but more expanding the interaction between a Gateway and TLSRoute to allow for the combination of TLS termination and TLSRoute?
It is absolutely something we should keep in mind though. I think/hope this version rev to v1alpha3 would set us up for GA of TLSRoute, and I'd really like to have the interaction with Gateway well specified before that point.

So IMO, #2111 should come at the same time as v1alpha3, or at the very latest in v1beta1, assuming we don't skip the beta stage for TLSRoute. How that ties in to Gateway resource versions I don't think has been figured out yet.

@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch from 11146d7 to b5c8228 Compare July 1, 2025 16:01
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 1, 2025
TLSRoute: Update config crd

TLSRoute: Update pkg

TLSRoute: Update hack example
@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch 4 times, most recently from 663631a to e71e087 Compare July 1, 2025 16:59
@rostislavbobo
Copy link
Author

I think we need to add the new version, and move the v1alpha2 source files to mostly be aliases to the new objects (except for this change).

@youngnick , v1alpha3 is already based on v1alpha2 (see v1alpha3/backendtlspolicy_types.go, v1alpha3/zz_generated.deepcopy.go). That means we can't reference v1alpha2 from v1alpha3. And I don't think we should change that, as this is the right way to handle versioning and aliasing in the v1alpha3 case.

I've put the new TLSRoute into v1alpha3 and just referenced reused v1alpha2 types (such as TLSRouteStatus and TLSRouteRule).

If we want to add all remaining XRoutes to v1alpha3, we'd need to keep the v1alpha2 XRoutes as they are, and then create new v1alpha3 XRoutes that reference these original v1alpha2 definitions. I personally don't see much value in doing this. What do you think?

@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch 2 times, most recently from 29dbaa9 to fd1db55 Compare July 1, 2025 19:19
@rostislavbobo rostislavbobo force-pushed the tlsroute-hostnames-required branch from fd1db55 to 4eb32a7 Compare July 1, 2025 19:38
@k8s-ci-robot
Copy link
Contributor

@rostislavbobo: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-gateway-api-crds-validation-1 4eb32a7 link true /test pull-gateway-api-crds-validation-1
pull-gateway-api-crds-validation-4 4eb32a7 link true /test pull-gateway-api-crds-validation-4
pull-gateway-api-crds-validation-2 4eb32a7 link true /test pull-gateway-api-crds-validation-2
pull-gateway-api-crds-validation-5 4eb32a7 link true /test pull-gateway-api-crds-validation-5
pull-gateway-api-crds-validation-3 4eb32a7 link true /test pull-gateway-api-crds-validation-3

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@youngnick
Copy link
Contributor

@rostislavbobo Yeah, I think that's fine. The intent here was to end up with the generated YAML having both versions, to save users from a breaking change when they install the v1.4 YAMLs (it's possible to have a TLSRoute saved in v1alpha2 that will not be valid in v1alpha3). Also, we can't even make a conversion webhook for this case, because we can't guess what the hostname should be if it previously was not present.

So, we're going to have to break people who weren't already setting a hostname in the TLSRoute.

But, if we only include v1alpha3 as the storage version in the generated CRD YAML, then the existing v1alpha2 objects will just vanish from the cluster when the definition is updated (since they're no longer a valid version) - even if they were using a hostname already and would translate just fine into v1alpha3.

So I think that the options here are:

  • include v1alpha2 with breaking changes between it and v1alpha3. This will break people who are not using a hostname, in that v1alpha2 objects cannot be cast into v1alpha3 objects. tbh I'm not sure how the apiserver will handle this case.
  • don't include v1alpha2 in the YAMLs. This will mean that everyone who is currently using TLSRoute will need to back up their routes in v1alpha2 before installing v1.4, change them to v1alpha3, making sure that a hostname is being set, then put them back after installing v1.4.

We can test this to find out what would happen in the first case by:

  • setup a cluster that has Gateway API v1.3 installed, including v1alpha2 TLSRoute
  • create some TLSRoute objects, one that does have hostname, and one that does not.
  • install the YAML from this PR
  • See if you can get any of the existing TLSRoute objects. (I don't think you will see them any more after that update).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Require hostnames for TLSRoutes
6 participants