Supporting stretch Kafka cluster with Strimzi #129
Conversation
Hi, thanks for the proposal. Left some initial comments.
Can you please put one sentence per line to make the review easier? You can look at one of the other proposals for an example.
The word "cluster" is overloaded in this context, so we should always pay attention and clarify if we are talking about Kubernetes or Kafka.
Thanks for the proposal. I left some comments.
But TBH, the level of depth it has is nowhere near where it would need to be to approve or reject anything. It is just a super high-level idea that, without the implementation details, cannot be judged correct or wrong. We cannot approve some API changes and then try to figure out how to implement the code around them. It needs to go hand in hand.
It also almost completely ignores the networking part, which is the most complicated part. It needs to cover how the different mechanisms will be supported and handled, as we should be able to integrate into the cloud-native landscape and fit in with the tools already being used in this area. Relying purely on something like Ingress is not enough. So the proposal needs to cover how this will be handled and how we ensure its extensibility.
It would be also nice to cover topics such as:
- How will the installation be handled both on the side clusters as well as on the main Kubernetes cluster
- Testing strategy (how and where will we test this given our resources)
…tion Added details about how to use Submariner for cross cluster communication Contributes to: strimzi#129 Signed-off-by: Aswin A <[email protected]>
Moved sentences to separate lines to help with reviews Signed-off-by: Aswin A <[email protected]>
This stretch cluster proposal has been updated significantly to include details of a prototype. We'd like to request a re-review of the proposal, please. Many thanks!
I know of user interest in Istio, load balancers and node ports, Submariner, and Skupper. To be clear, I'm not talking with them daily, so I have no idea which interest is still valid and who moved to something else. These are things where interest was mentioned in the past that I can still remember. I'm not aware of any interest in Cilium (well, other than you, I guess). But all of these technologies have one thing in common: the interest in stretch clusters among them is very small, basically a single-digit number of users. And there are plenty of other technologies that can be used and that Strimzi users use, for example Linkerd, Calico, Ingress, Gateway API, and so on. Any of those users might want to stretch a Kafka cluster tomorrow, and it will be hard to reject it, because it will suddenly have pretty much the same demand level as the previous implementations. That is why I think it is important to have the pluggability and only basic support in Strimzi, and have all the various niche implementations live outside.
…afka clusters Update proposal to clarify that .cluster.local is not used in advertised.listeners and that .svc is sufficient for intra-cluster DNS resolution. Signed-off-by: Aswin A <[email protected]>
Just to add that, from a business point of view, those who are using Strimzi as part of IBM Event Streams have been asking us for an open solution for stretch clusters, due to the resulting simplification of their applications and operations. Many customers have deployed proprietary implementations (making it harder for us to displace them with Strimzi as an open implementation) and other customers are in the process of building their own custom solutions. This is the reason for the urgency / engagement from Aswin, Mark and others on this topic. Matt Sunley, Principal Product Manager for IBM Event Automation
I really appreciate all the valuable feedback maintainers have provided so far. It's helped sharpen the thinking around how to structure this proposal responsibly and in a way that fits within Strimzi's long-term maintainability model. Let me share some context and thoughts on why I believe we may not need to introduce a formal pluggable interface at this stage, while still keeping the door open for future extensibility.

We started with a clean slate and tested various technologies: Submariner, Istio, Skupper, and later Cilium. Only Submariner and Cilium worked for this particular use case, and what stood out was that both rely on the Kubernetes Multi-Cluster Services (MCS) API. That realization shaped our thinking. We weren't building something specific to Cilium or Submariner; rather, we found that the MCS API acts as a unifying foundation across implementations. It allows services to be exported using `ServiceExport`, and makes them accessible through standardized `*.clusterset.local` DNS names. So, instead of baking technology-specific logic into the operator, we're simply consuming Kubernetes-native abstractions, which gives us portability and neutrality by design. The operator doesn't need to care whether it's Cilium, Submariner, or another technology behind the scenes: as long as it follows MCS, it just works.

What's interesting is that Kubernetes itself becomes the pluggable interface in this model. Users can choose how to expose Kafka brokers and controllers via MCS-compliant solutions like Submariner, Cilium, etc. The operator doesn't need to be aware of the network implementation, only that it can resolve hostnames and connect to remote pods. This makes it very easy for users to try stretch clusters using whatever networking solution they prefer, without pushing any specific technology into the Strimzi core.

We're absolutely not against having a pluggable model down the road. If adoption grows and the need for technology-specific handling becomes clearer, we'll have a solid foundation to build one. But at this point, introducing an abstraction layer for plugins feels premature, and possibly even counterproductive. We'd rather validate the concept with a lean, Kubernetes-aligned implementation first. If multiple users or stacks eventually require custom behavior, we can revisit pluggability based on real-world needs and usage patterns.
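To make the MCS-based flow concrete, here is a minimal sketch of exporting a broker service with a `ServiceExport` resource, assuming an MCS-compliant implementation (e.g. Submariner or Cilium) is already installed; the service name and namespace are illustrative, not taken from the proposal.

```yaml
# Export an existing broker Service to the cluster set.
# The Service "my-cluster-kafka-brokers" in namespace "kafka" is assumed
# to exist already; a ServiceExport must match its name and namespace.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  name: my-cluster-kafka-brokers
  namespace: kafka
```

Once exported, MCS-aware DNS resolves the service across member clusters under a `clusterset.local` name, e.g. `my-cluster-kafka-brokers.kafka.svc.clusterset.local`.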
We’re already aware of the aspects related to …
@sunleym @aswinayyolath please see my responses below ...
Honestly, I'm not sure which exact Confluent feature you are talking about. Cluster Linking? I don't think that is comparable to stretch clusters over multiple Kube clusters. So, probably some other feature? I also wonder where exactly you see the …

But at least in my personal view, Strimzi's main goal is not to have feature parity with Confluent. We do not have the resources for that (but we clearly have other advantages). And as such, we have our own goals with much higher priority, such as project sustainability and stability.

And since you opened the vendor perspective, I think what I suggested fits there very well. Having a proper pluggable mechanism will enable all vendors providing software to decide which plugins they will support, based on which technologies, or to provide their own plugins, open source or proprietary, based on customer demand. As a vendor, the first thing you want to avoid is having to support some networking stack that you do not want to, just because it is baked into the upstream project.
To be honest, I do not know what your path was to get where we are, but until I raised the comments about it, this proposal was written purely around Cilium and not around the MCS API. But as far as my latest understanding goes, the MCS API has two useful implementations, Cilium and Submariner. And we have heard about real interest only for Submariner, because as far as I understand it now, there is no particular interest in Cilium from IBM. Don't get me wrong, it would be great if one day there were one API covering all the various projects we see users interested in. But it is not there, and I do not think I would want to go all-in on the MCS API and bake it into the Strimzi code-base as the only option. So, as far as I'm concerned, having one of the available plugins designed to support the MCS API would be great. But that is it.
Yes, Kubernetes has been becoming more and more pluggable for a long time. And it is one of the inspirations for why I think we should make it pluggable here as well. See, for example, the development around cloud provider plugins, container runtime plugins, storage driver plugins, etc. They all follow more or less what I'm suggesting here.
This is the right motivation, but applied in exactly the wrong direction. You provide the concept outside of the core code base and validate it there. And if it turns out that half of Strimzi users are using the MCS API plugin and it is hugely popular, we can revisit it. Having the pluggable interface in Strimzi, with the plugins living outside the Strimzi code-base, will set clear expectations for Strimzi users:
As a Strimzi maintainer, I absolutely want to avoid the situation where Strimzi includes some code to validate an idea ... lures users into starting to use it ... and then after six months drops it because the validation failed. That is something a feature gate does not protect against. If you clearly state in the feature gate that this might be removed in the future if it fails the validation, most people will wait for it to be finished before using it. So while you might avoid luring someone into a feature you will remove after six months, it will ultimately also set the feature up to fail its validation. And all of that, of course, while having a bunch of code in the code base that has to be removed, and lots of associated costs and effort already burnt through.
Thanks for pointing that out. Just to share the background: after the initial PoC, the revised proposal did include both Submariner and Cilium as viable options for enabling stretch clusters. During community discussions, we agreed to narrow the initial implementation scope and focus on one technology to keep things manageable. At that point, Cilium was selected due to some observed performance advantages in our testing. That said, we always intended to keep the design open to supporting other technologies in the future.
I had another pass and left comments. (I still have to go through the discussion on the main page, which I guess caused more changes to the proposal.)
##### Remote cluster operator configuration

When deploying the operator to remote clusters, the operator must be configured to reconcile only StrimziPodSet resources by setting the existing environment variable:
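As an illustrative sketch of that configuration (the Deployment name and namespace below are assumptions, not taken from the proposal), the restriction could be applied via the operator's Deployment:

```yaml
# Fragment of a hypothetical remote-cluster operator Deployment,
# limiting reconciliation to StrimziPodSet resources only via the
# existing STRIMZI_POD_SET_RECONCILIATION_ONLY environment variable.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: strimzi-cluster-operator
  namespace: strimzi
spec:
  template:
    spec:
      containers:
        - name: strimzi-cluster-operator
          env:
            - name: STRIMZI_POD_SET_RECONCILIATION_ONLY
              value: "true"
```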
I think that `STRIMZI_POD_SET_RECONCILIATION_ONLY` could be left unused. AFAICS, the combination of the `strimzi.io/enable-stretch-cluster` annotation on the `Kafka` custom resource and the `strimzi.io/remote-podset` annotation on the `StrimziPodSet` should be enough to avoid collisions.
For example ...

- If the user creates a `Kafka` CR named `foo` in the central cluster (with the `strimzi.io/enable-stretch-cluster` annotation), we'll have a `StrimziPodSet` (with the `remote-podset` annotation) landing on the remote cluster. This would allow the remote cluster operator (the StrimziPodSet controller) to reconcile it (together with all the "local" `StrimziPodSet`s).
- At the same time, the user can create a `Kafka` CR named `foo` (again!) in the local cluster as well: it doesn't have the `strimzi.io/enable-stretch-cluster` annotation, so it's local; the cluster operator creates the `StrimziPodSet` (without the `strimzi.io/remote-podset` annotation) and is able to reconcile it.

I guess that having two `Kafka` CRs with the same name (one stretched and one local) won't be a problem in terms of advertised addresses and quorum voters, because the stretched ones will take the clusterId into account at the DNS-name level.
This way the cluster operator on the remote cluster can still operate the other operands (bridge, connect, ...).
But at this point, shouldn't we have an annotation similar to `remote-podset` for all the other resources (listed later in the proposal), like `ConfigMap`, `Secret` and so on, to avoid clashes with the corresponding resources of a local Kafka cluster that has the same name as the stretched one?
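A hedged sketch of the annotation scheme discussed in this thread; the annotation names come from the proposal and this review and are not part of released Strimzi, and the resource names are illustrative:

```yaml
# Stretched Kafka CR, created by the user in the central cluster.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: foo
  annotations:
    strimzi.io/enable-stretch-cluster: "true"
---
# StrimziPodSet landing on the remote cluster for the stretched cluster.
# The remote-podset annotation distinguishes it from StrimziPodSets owned
# by a purely local Kafka CR that happens to share the name "foo".
apiVersion: core.strimzi.io/v1beta2
kind: StrimziPodSet
metadata:
  name: foo-broker
  annotations:
    strimzi.io/remote-podset: "true"
```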
- The feature gate will be disabled by default, allowing early adopters and community members to safely test the functionality without affecting production environments.
- After at least two Strimzi releases, and based on user feedback and observed stability, enabling the feature gate by default may be considered.
I agree on having this behind a feature gate, even though enabling a stretch cluster needs some steps and configuration anyway. A FG makes clear to users that it's a beta feature to be tested before maturing. Of course, we would need a timeline, I agree.
095-stretch-cluster.md
### Kafka Connect, Kafka Bridge and MirrorMaker2

This proposal does not cover stretching Kafka Connect, Kafka MirrorMaker 2 or the Kafka Bridge.
These components will be deployed to the central cluster and will function as they do today.
Operators running in remote clusters will not manage KafkaBridge, KafkaConnect, KafkaConnector, or KafkaMirrorMaker2 resources.
As in the same thread, I left a comment about the possibility of avoiding `STRIMZI_POD_SET_RECONCILIATION_ONLY` and having the remote cluster operator operate all the other components deployed locally. For the stretch cluster it would make sense to continue deploying them in the central one, imho.
Thanks for raising this, @ppatierno; you're right, the external bootstrap section definitely deserves a more detailed explanation. I'm currently working on refining that part of the proposal, and Jakub has also shared some valuable questions in this area. I do have a working solution in mind, and I've tested how it behaves in practice, but I want to take a bit more time to properly evaluate the trade-offs and ensure we're proposing the most robust and user-friendly approach. I'll update the proposal soon with a more complete explanation.
As a more general comment on the direction of the proposal ...

During the review, in the past weeks, I raised the pluggability issue with @aswinayyolath. It was mostly related to the fact that the proposal had a distinction between using Cilium (which didn't need any operator code change) vs Submariner (which needed the creation of a …). I think that leveraging a Kubernetes API to abstract the underlying networking would be the best choice; on the other side, I can also see that the documentation is poor (some sections are empty or TBD), which, to be honest, raises doubts about how much the community is still working on it. There are also only a few projects implementing it.

On the other side, I am not sure I get what @scholzj means by pluggability in order to use projects like "Linkerd, Calico, Ingress, Gateway API, and so on" (as mentioned in one of the comments). Of course, they don't implement the MCS API (or it's just my ignorance here), so I would assume what Jakub is referring to is a kind of "Strimzi API for stretched clusters" that someone has to implement via a plugin in order to use their preferred technology. Is my understanding right? But also, if we think this feature does not have big demand, can't we be more opinionated and support the MCS API with the few implementations available? A user who wants a stretch cluster should use one of them. I know it might not be possible for various business reasons, but at the same time, do we really want to design a generic new API so that several users can implement their own, only to discover after months that no one is going to do so?

Also, I am not sure where the comparison with Confluent is coming from, but I agree with Jakub that the scope and goals of Strimzi are different. I can understand that discussing a proposal for such a long time can be frustrating, as I can feel from you when reading about "urgency / engagement ..." for customers that IBM has on its side. But, as an open source maintainer, I take care of the project for the long term, and thinking things through helps (I hope the IBM folks can confirm that the proposal has improved a lot with all the feedback from the community). Even if it means delaying things at the beginning, it pays off in the long run.
Thanks Jakub and Paolo. I do appreciate all the technical collaboration between the IBM and Red Hat folks on this topic, and agree that it's worth taking time to start down the right path. At the same time, we see users of Kafka on Kubernetes with requirements for stretch clusters - it comes up a lot - and they are adopting or already using proprietary solutions (whether or not those really provide a similar capability, perception is all that matters) or even rolling their own in some cases. Many of these users would adopt Strimzi, and that's what I want to see.
@ppatierno The Strimzi API and plugins I'm talking about are a set of Java interfaces as the API and a JAR with the implementation of the interfaces as the plugin, i.e. like our PodSecurityProviders or Kafka connectors, for example.
Well, this looks more like the technical explanation of what to do, which was pretty clear to me. My question was more about ... are you envisaging a custom Strimzi API for plugins (so something we should have in the proposal, which doesn't exist at all) rather than using a Kubernetes API like the MCS one? And my next question was: why don't you see the MCS API (without the pluggability you are requesting) as enough? Because only a few projects implement MCS?
TBH, I think the advantages of having a pluggable interface are pretty obvious. And I think I covered many of them in my comments already anyway. For example:
The MCS API ... I would expect it to be one of the possible implementations. But TBH, for me it is not a Kubernetes API - at least not yet. It is a project worked on by one of the Kubernetes SIGs. I would be happy if it one day helps to standardize things. But while I do not claim to be an expert on it, it does not seem to be there. The obvious question marks are the number of implementations and the maturity of the API (a 4-year-old alpha version?). Why do you think it is the only thing we need to support? So no, for me it is not the obvious choice to hardcode it and dump it on the Strimzi community. And if you want to build this using Kubernetes APIs, the obvious choice would be load balancers, node ports, etc. - but even there I would vote for pluggability over having it hardcoded.

Designing the pluggable interface might initially be more complicated. But if you are in it for the long term, I'm 100% sure it is worth it. As a core community, we also reduce some of the effort on developing, maintaining, and testing the various implementations. It will also lead to a cleaner design, as you cannot just hardcode all the stuff into the codebase but have to think about it a bit more.

If you are against the pluggability, I would also be curious what you would do if someone came next month with a proposal to hardcode something else next to it? I do not think you would have any other choice than to accept it. The pluggability I'm proposing gives a clear path for everyone without dumping the burden on the core community.
…cluster and validate KafkaNodePool deployment targets in stretch cluster setups Signed-off-by: Aswin A <[email protected]>
This proposal describes the design details of a stretch Kafka cluster.
Prototype
A working prototype can be deployed using the steps outlined in a draft README that is being iteratively revised.
Note: The prototype might not always exactly align with this proposal, so please refer to the README documentation when working with the prototype.
POC implementation