-
Notifications
You must be signed in to change notification settings - Fork 1.7k
#2168 A brand-new WatchKube
provider for Kubernetes service discovery
#2174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WatchKube
discovery provider
0a07926
to
59e55fc
Compare
I've rebased the branch. Don't push please❗
And you will be able to contribute more 😉 |
59e55fc
to
697117c
Compare
Hi @raman-m , from my point of view all required stuff is done for moving to code review. Or am I missing something? |
The issue found at tintoy/dotnet-kube-client#163 was resolved by tintoy/dotnet-kube-client#164 and included in the release v2.5.12. However, this release has not been marked as the latest on the repository's releases page, where v2.5.10 is still listed as the most recent version. Could you please explain why v2.5.10 is considered the latest release? |
If you're confident and have completed the main development phase, can't you press the "Ready for Review" button? |
Have no idea, let's wait for the author's answer to your question. |
Hello, Nikolay! While it's true that the PR seems partially ready for review following the closure of tintoy/dotnet-kube-client#163 and its subsequent release of v2.5.12, I am concerned that reviewing this PR may be premature due to the following dependencies:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Testing the new WatchKube
type in ShouldReturnServicesFromK8s is necessary, but it is not sufficient for completing acceptance testing.
While it is well-tested by unit tests, there are no new acceptance tests for the proposed feature. Please review the scenarios of the current Kube
and PollKube
providers for inspiration. If Not, I will share my ideas later, in 1-2 weeks, after the release of the v23.3.6 patch.
Not quite "relies", but conflicts on rebase will apear, I'll resolve them after #2180 merging.
No problem, I'll be in touch when you get back to this |
b3d145d
to
e6f33a0
Compare
4330bf7
to
4734c9d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me; minor question about the .Catch()
operator, but not a blocker.
Thank you, Adam, for reviewing the code! @ggnaegi Gui, any suggestions? |
Development Complete ✅
Thank you, Nikolay, for professional development and quick feedback! This PR is ready to merge, but before approving it, I will conduct a final review over the next few days until Monday. I/We will write a more advanced section in the K8s doc, as the current version seems too brief to effectively present the feature. Also, I believe the appropriate "WatchKube provider" section could be longer, including real-world code snippets or even a sample. Could you attach some configurations, instructions, or code blocks from your production environment, please? My idea is as follows: at a minimum, we should compare the new provider type with the current ones, Finally, at long last, I am curious about your production environment where this new provider will be utilized. As a team, we are interested in beta-testing after deployment to production, and we will be ready to apply hotfixes if any bugs are found. Regarding delivery... How urgent is this feature for you and your team? Would you like to keep it in the Summer'25 milestone, scheduled for release in September? I believe we need a quicker timeline, and I am ready to include it in the current Spring'25 milestone, which contains 6 (six) features and is set for release in June, in a couple of weeks.
|
Not sure what you mean, it's configured in standard way -
Sounds good, we could start from this:
* - depends on poll interval
Yep, nothing special, we're hosting ocelot in k8s (24.0, .net 9, ubuntu) with several services behind it. We use rolling update
This isn't urgent at all, currently we're using
It's up to you, I'll need up to a week to conduct tests with the new provider once the package is ready. |
Thank you for the preliminary comparison table! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good to me, but let's talk a bit more and then merge this PR.
I'm going to push one commit with my review which will give my approval.
internal const int FailedSubscriptionRetrySeconds = 5; | ||
internal const int FirstResultsFetchingTimeoutSeconds = 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 5? Why 3? Why internal?
How will the user be able to configure this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made internal
for reference from Tests (with InternalsVisibleTo
attribute). Numbers are made up =) Didn't want to make configuration complicated. We can move it to configuration, but I think FailedSubscriptionRetrySeconds
should stay as const
, because KubeClient
already uses retries. Our retry was added for really unexpected errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no objection to the Retry logic However, my concern lies with the unconfigurable numbers. I believe that a configuration design can be deferred for now and introduced at a later stage, in a separate pull request.
An acceptable compromise would be to enable configuration directly in C# code.
Next step 👇
Therefore, I propose converting the internal constants into public static properties, allowing developers to set custom values for advanced user scenarios directly in C# code, given that configuration properties are not implemented now to be defined in ocelot.json
, aka global ServiceDiscoveryProvider section.
The Q ❔
What I consider inefficient and worth discussing further is the Seconds dimension. Why not Milliseconds? As you may know, some deployed environments operate at the millisecond level — 10, 100, or 500 ms — clearly less than one second. For now, I propose decreasing the default values to one second as the minimum for definitions, but I believe that for highly rapid systems operating in milliseconds, one second won't significantly impact response time.
Ultimately, I would suggest that waiting 5 seconds between retries is excessive 😉
Do you have any use cases from your project, Kubernetes with the Ocelot environment in production, based on data from monitoring or traced logs?
2nd Q
The 2nd question: Do both properties depend on each other? It seems the second property of fetching the first result is merely a delay in obtaining the initial result from the Observable layer, correct? This delay does not appear to be related to the retry logic, is that right?
In case they are both dependent, we have to think carefully about their values in seconds, avoiding any disruptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, my proposal is this commit → 061f018
However, I am uncertain about the following:
- Returning an
int
type that represents the value of theTimeSpan.FromSeconds(double)
argument: it has thedouble
parameter - The default values of
5
and3
seconds for the backing fields should be reduced, for instance, they could be set to1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore, I propose converting the internal constants into public static properties, allowing developers to set custom values for advanced user scenarios directly in C# code
reasonable
What I consider inefficient and worth discussing further is the Seconds dimension. Why not Milliseconds?
10 ms for retries could create high pressure on k8s =) 10 ms = 100 rps. 100 rps * downstream services count * ocelot instances.
First result timeout is time that client can wait before first results fetching, and it can vary from couple milliseconds and up to defined timeout. So it's good candidate for configuration. Will try to find some numbers from prod env.
Do you have any use cases from your project, Kubernetes with the Ocelot environment in production, based on data from monitoring or traced logs
No, as I said KubeClient already retries subcription in case of network failures. You can think about retry logic introduced in this PR like "try -> catch -> retry all nested stuff inside KubeClient whatever it was under the hood". This 'whatever' could be bug or anything else in KubeClient logic.
This delay does not appear to be related to the retry logic, is that right?
Right, there is no dependency between properties. First results is important part, as far as discovery provider instantiated during first call.
So, my proposal is this commit.
Agree with your proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI I have assigned def values (both to 1 second) in commit 28a3afe 💡
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
10 ms for retries could create high pressure on k8s =) 10 ms = 100 rps. 100 rps * downstream services count * ocelot instances.
First result timeout is time that client can wait before first results fetching, and it can vary from couple milliseconds and up to defined timeout. So it's good candidate for configuration. Will try to find some numbers from prod env.
It appears this analysis is taking some time, and I am unable to wait any longer: the PR is ready to merge.
If there are further proposals for improvement, we will incorporate them into subsequent PRs, OK?
WatchKube
discovery providerWatchKube
provider for Kubernetes service discovery
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ready for delivery ✅
- Code reviews ✔️✔️
- Unit testing ✔️
- Acceptance testing ✔️
- Updated docs ✔️
@kick2nick Congrats! 🥳 It will be released in version 24.1 |
Closes #2168
Proposed Changes
WatchKube
service discovery provider that utilizes k8s watch api requests for efficient detection of changes.It's proof of concept that needs improvements (see comments) and tests.
Tested locally, it works as expected, but with older versions of
KubeClient
which can be used with newer one after fix tintoy/dotnet-kube-client#163.p.s. sorry for mess with #2173
Had no intention to force downgrade of KubeClient, will resume after fix)