
Add recipe on ECK multi-tenancy #8735


Closed
wants to merge 4 commits

Conversation

LolloneS

This PR adds a much-needed recipe on ECK multi-tenancy. While it is nothing definitive, in Services we see many customers interested in the topic. The recipe aims to be a starting point for sparking thoughts and discussions on how best to structure multi-tenant ECK deployments, especially in complex organizations with tens of teams and numerous environments.


Warning

It looks like this PR modifies one or more .asciidoc files. These files are being migrated to Markdown, and any changes merged now will be lost. See the migration guide for details.

@prodsecmachine
Collaborator

prodsecmachine commented Jul 11, 2025

🎉 Snyk checks have passed. No issues have been found so far.

security/snyk check is complete. No issues have been found. (View Details)

license/snyk check is complete. No issues have been found. (View Details)

@botelastic botelastic bot added the triage label Jul 11, 2025
@LolloneS LolloneS added >docs Documentation and removed triage labels Jul 11, 2025
Collaborator

@pebrc pebrc left a comment


Thanks for your contribution, much appreciated! I wonder though if a blog post would be a better place for this? The way we use the recipes folder is for concrete YAML manifests that users can apply (with minimal modification) to test drive/POC certain setups. The YAML part of your write-up is comparatively small and serves more to illustrate the accompanying text.

@@ -73,3 +73,6 @@ Chart.lock
# build
build/dev*
build/eck*

# macOS specific files
.DS_Store
Collaborator


Use your personal global .gitignore for platform-specific files
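
For reference, OS-specific files like .DS_Store are often handled once per machine via a personal, global ignore file rather than per repository; a minimal sketch (the file path is just a common convention):

# point git at a machine-wide ignore file (one-time setup)
git config --global core.excludesFile ~/.gitignore_global
# then list platform-specific entries there
echo ".DS_Store" >> ~/.gitignore_global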

Comment on lines +8 to +34
podTemplate:
  metadata:
    labels:
      elasticsearch.k8s.elastic.co/cluster-name: cluster-a
  spec:
    tolerations: # I "accept" the nodes defined for my team
    - key: "taints.demo.elastic.co/team"
      operator: "Equal"
      value: "team-a"
      effect: "NoSchedule"
    affinity: # I want the nodes defined for my project
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: "labels.demo.elastic.co/team"
              operator: "In"
              values:
              - "team-a"
      podAntiAffinity: # Try to not put me on the same host where other pods for the same ES cluster are running
        preferredDuringSchedulingIgnoredDuringExecution: # or requiredDuring...
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchLabels:
                elasticsearch.k8s.elastic.co/cluster-name: cluster-a
            topologyKey: kubernetes.io/hostname
Collaborator


None of this is actually ECK specific. The use of taints/tolerations and node/pod anti-affinity applies to any workload on Kubernetes.
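
For completeness, the node-side counterpart of these tolerations and affinity rules is plain kubectl and applies to any workload; a sketch using the keys from the manifest above (the node name is hypothetical):

# label the node so the nodeAffinity term matches it (node name is hypothetical)
kubectl label node worker-1 labels.demo.elastic.co/team=team-a
# taint it so only pods tolerating the team taint get scheduled there
kubectl taint node worker-1 taints.demo.elastic.co/team=team-a:NoSchedule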

Comment on lines +52 to +53
- There is no environment in which to test Kubernetes and elastic-operator upgrades, which means each upgrade is going to be fire-and-pray.
- Depending on the implementation, this architecture could become a noisy neighbors party. For instance, a misconfigured development cluster could saturate the underlying host's resources or bandwidth, hence degrading the performance of the pods deployed on the same host.
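
As an aside, the noisy-neighbor risk mentioned above is usually bounded by giving each Elasticsearch nodeSet explicit resource requests and limits in its podTemplate; a minimal sketch, with sizes chosen purely for illustration:

podTemplate:
  spec:
    containers:
    - name: elasticsearch
      resources:
        requests:
          cpu: 2       # values are illustrative only
          memory: 8Gi
        limits:
          cpu: 2
          memory: 8Gi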
Collaborator


Not sure the informal/colloquial tone here ("fire-and-pray", "noisy neighbors party") fits our reference documentation.

The main two options are:

- Both the production and non-production Monitoring clusters live in a single, separate Kubernetes cluster
- Each Monitoring cluster lives in its own Kubernetes cluster
Collaborator


Suggested change
- Each Monitoring cluster lives in its own Kubernetes cluster
- Each monitoring cluster lives in its own Kubernetes cluster
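
For context, whichever option is chosen, an ECK-managed cluster ships metrics and logs to a monitoring cluster through stack monitoring references; a hedged sketch, assuming a monitoring cluster named "monitoring" in an "observability" namespace (both names are hypothetical):

spec:
  monitoring:
    metrics:
      elasticsearchRefs:
      - name: monitoring        # hypothetical monitoring cluster name
        namespace: observability
    logs:
      elasticsearchRefs:
      - name: monitoring
        namespace: observability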


=== Reference architecture #2: one Kubernetes cluster per Elasticsearch deployment

Many Elastic Stack admins opt for a 1:1 mapping between Elasticsearch clusters and Kubernetes clusters, meaning that each Kubernetes cluster is fully dedicated to a single Elasticsearch cluster. This allows for even stronger hard multi-tenancy (assuming this can still be considered multi-tenancy) and does not require configurations such as taints and tolerations, but it requires the capability to run a fleet of Kubernetes clusters, which is a task in its own right. In other words, in this case customers intentionally decide to operate a fleet of Kubernetes clusters and have all the automation needed to manage them. If no such automation is available, the outcome is almost guaranteed to be an unmanageable sprawl of Kubernetes and Elasticsearch clusters.
Collaborator


Is that really the case? Do we have data to back that claim up? I am surprised that "many" admins would choose that approach given the operational overhead of running a dedicated k8s cluster. But I could be wrong.


image::prod-and-non-prod-hard.jpeg[Reference architecture #1: production and non-production with hard multi-tenancy,align="center"]

In this architecture, we use Kubernetes namespaces to achieve "soft" multi-tenancy, and pair that with Kubernetes taints, tolerations, and nodeAffinity to ensure "hard" multi-tenancy, so that a node in the Kubernetes cluster will only host pods for Elasticsearch clusters belonging to the same team. This scenario enforces stricter separation of concerns, but comes at a cost: it is in fact highly unlikely that such a deployment would allow for a similar level of resource and cost efficiency, since it probably requires more nodes to be added to the Kubernetes cluster than strictly necessary, likely ending up with some of them being under-utilized.
Collaborator


One could argue that this is not really hard multi-tenancy, as you would still share the same node pools and the same control plane, i.e. effectively a shared database. So to make this multi-tenancy "harder" one could want dedicated node pools and maybe even virtual clusters per tenant (which, however, might make things a bit complicated for storage provisioning, especially for local storage with Elasticsearch).

Author


So to make this multi-tenancy "harder" one could want dedicated node pools

If you look at the YAML example, we are targeting specific nodes via affinity, taints, and tolerations. In a real-world scenario, those nodes would indeed belong to different nodepools (think dedicated ASGs on AWS). You are right that I should make it clearer :)
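
For illustration, on GKE such a dedicated node pool could be created with the taint and label used in the example manifest; the pool and cluster names below are hypothetical:

# hypothetical pool/cluster names; taint and label match the example manifest
gcloud container node-pools create team-a-pool \
  --cluster my-cluster \
  --node-labels labels.demo.elastic.co/team=team-a \
  --node-taints taints.demo.elastic.co/team=team-a:NoSchedule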

@xeraa

xeraa commented Jul 16, 2025

I wonder though if a blog post would be a better place for this?

Since I thought "recipe" when seeing it first and then we redirected it: This IMO makes for a pretty dry blog post. Maybe it needs a bit more YAML and actual recipes to be a better fit? I didn't see much in the docs about tolerations for example, which I thought would still be a good addition?

@pebrc
Collaborator

pebrc commented Jul 16, 2025

I wonder though if a blog post would be a better place for this?

Since I thought "recipe" when seeing it first and then we redirected it: This IMO makes for a pretty dry blog post. Maybe it needs a bit more YAML and actual recipes to be a better fit? I didn't see much in the docs about tolerations for example, which I thought would still be a good addition?

Yes maybe if we had a complete working example it would make more sense.

@Kushmaro

@xeraa , actually, the team is looking to start blogging much more heavily about ECK to push its marketing forward. I think adding a couple of diagrams to this might make it a good blog post to start with.

@LolloneS
Author

LolloneS commented Aug 3, 2025

I'm closing this PR as this is going to be released as a blog post.

@LolloneS LolloneS closed this Aug 3, 2025