Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make namespace parsing and informers pluggable #626

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

emsixteeen
Copy link
Contributor

This addresses #620 – in the case where the mpi-operator is running in a highly restrictive RBAC environment and the need to run MPIJobs in only a subset of namespaces.

The changes provide:

  1. A configurable namespace parser function, what will either use a default [and simple] parser, or a user-provided one.
  2. Configurable Informer functions that will created either the default SharedInformerFactory for each type of Informer or user-provided ones.

Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign alculquicondor for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow google-oss-prow bot requested a review from zw0610 February 6, 2024 20:26
Copy link
Member

@tenzen-y tenzen-y left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure why this change is needed since the current implementation (without this PR) also watchs only the dedicated namespace.

@emsixteeen
Copy link
Contributor Author

Here's the scenario ...

  • There are N namespaces in the cluster (ns1, ns2, ns3, ...), and the cluster does not allow listing Secrets at the cluster level (which is enforced by some policy agent).

  • Per-namespace RoleBindings can be created to allow other Subjects (such as ServiceAccounts in other namespaces) access to items like Secrets on a per-namespace level (i.e. RoleBinding vs ClusterRoleBinding).

  • The desire is to run a single instance of the mpi-operator (in its own namespace), and have it watch for MPIJobs in only a subset of the N namespaces, e.g. ns1 and ns3. In other words, the goal is to allow MPIJobs to run in some namespaces, but not all (more than one namespace, but not cluster-wide).

  • Listing items like Secrets of the watched namespace(s) is a requirement of the mpi-operator (or it fails to run).

Because of the way Informers work under the hood (Informers implement the underlying watch functionality), you can only select one namespace to watch or all namespaces to watch.

With the current implementation this scenario wouldn't seem to work ...

@tenzen-y
Copy link
Member

Here's the scenario ...

  • There are N namespaces in the cluster (ns1, ns2, ns3, ...), and the cluster does not allow listing Secrets at the cluster level (which is enforced by some policy agent).
  • Per-namespace RoleBindings can be created to allow other Subjects (such as ServiceAccounts in other namespaces) access to items like Secrets on a per-namespace level (i.e. RoleBinding vs ClusterRoleBinding).
  • The desire is to run a single instance of the mpi-operator (in its own namespace), and have it watch for MPIJobs in only a subset of the N namespaces, e.g. ns1 and ns3. In other words, the goal is to allow MPIJobs to run in some namespaces, but not all (more than one namespace, but not cluster-wide).
  • Listing items like Secrets of the watched namespace(s) is a requirement of the mpi-operator (or it fails to run).

Because of the way Informers work under the hood (Informers implement the underlying watch functionality), you can only select one namespace to watch or all namespaces to watch.

With the current implementation this scenario wouldn't seem to work ...

@emsixteeen Uhm, I see. I understand your situation, but as I can see your code, the mpi-operator can only watch a single namespace the same as before, right?:

func DefaultKubeInformer(namespaces []string, kubeClient kubeclientset.Interface) kubeinformers.SharedInformerFactory {
var kubeInformerFactoryOpts []kubeinformers.SharedInformerOption
if namespaces[0] != metav1.NamespaceAll {
kubeInformerFactoryOpts = append(kubeInformerFactoryOpts, kubeinformers.WithNamespace(namespaces[0]))
}

@emsixteeen
Copy link
Contributor Author

@emsixteeen Uhm, I see. I understand your situation, but as I can see your code, the mpi-operator can only watch a single namespace the same as before, right?:

That is correct – the PR does not change any current functionality or implementation. Rather, it provides an extension point to call different InformerFactory functions for each of the Informers.

It does this by replacing the actual call to <SomePackage>.NewSharedInformerFactoryWithOptions(...) function with a function definition. The function implementation makes the actual call to <SomePackage>.NewSharedInformerFactoryWithOptions(...).

The PR provides a default function which behaves the same way as the current implementation, which is what you referenced.

The crux of the changes are where the function definition is defined:

type KubeInformerFunc func(namespaces []string, kubeClient kubeclientset.Interface) kubeinformers.SharedInformerFactory
type MpiJobInformerFunc func(namespaces []string, mpiJobClient mpijobclientset.Interface) informers.SharedInformerFactory
type VolcanoInformerFunc func(namespaces []string, volcanoClient volcanoclient.Interface) volcanoinformers.SharedInformerFactory
type SchedulerPluginsInformerFunc func(namespaces []string, schedClient schedclientset.Interface) schedinformers.SharedInformerFactory
type InformerOptions struct {
KubeInformer KubeInformerFunc
MpiJobInformer MpiJobInformerFunc
VolcanoInformer VolcanoInformerFunc
SchedulerPluginsInformer SchedulerPluginsInformerFunc
}

Here is where / how it's used:

kubeInformerFactory := opt.KubeInformer(namespaces, kubeClient)
mpiJobInformerFactory := opt.MpiJobInformer(namespaces, mpiJobClientSet)

For comparison, the current implementation:

kubeInformerFactory := kubeinformers.NewSharedInformerFactoryWithOptions(kubeClient, 0, kubeInformerFactoryOpts...)
kubeflowInformerFactory := informers.NewSharedInformerFactoryWithOptions(mpiJobClientSet, 0, kubeflowInformerFactoryOpts...)


Considering the scenario, the thought process and rationale for these changes:

Background:

  • Current Informer packages uses Kubernetes codegen tools.
  • Currently, the codegen tools create Informers that watch either one or all namespaces.
  • There's a package xns-informer1 which is built upon Kubernetes codegen tools that creates Informers for multiple namespaces. Perhaps there are others.
  • The Informers created by the xns-informer package are experimental and are usually – but not guaranteed to be – API compatible2.

Options:

  1. Determine that the scenario is not supportable. It's either one namespace or all namespaces.
  2. Bring in the xns-informer package (or another package) as a dependency, and use its codegen tools to create Informer packages.
  3. Some middle ground: keep everything API compatible, while still providing a way to bring in a different InformerFactory.

Thoughts:

  • The first option is the underlying issue being addressed!
  • The second option seems a bit heavy-handed: bringing in a third-party experimental library seems counter-intuitive.
  • The third option seems like a better one: leave all functionality and implementation as is; everything remains API compatible; and extension points are added. If one wants to use experimental libraries, they can just extend the mpi-operator by hooking into it, while leaving everything else intact.

Usability:

In order to make use of the extension points, a new package is needed, which is composed using the mpi-operator3. The upside to this is that one is free to swap InformerFactory implementations. The downside is that these implementations live outside of the mpi-operator project and package.

There's probably plenty more pros and cons which haven't even been touched upon ... Perhaps there's a some better way to go about this ...?

Footnotes

  1. https://github.com/maistra/xns-informer

  2. https://github.com/maistra/xns-informer?tab=readme-ov-file#cross-namespace-informers

  3. Example: https://github.com/emsixteeen/mpi-operator-xns

@tenzen-y
Copy link
Member

@emsixteeen Uhm, I see. I understand your situation, but as I can see your code, the mpi-operator can only watch a single namespace the same as before, right?:

That is correct – the PR does not change any current functionality or implementation. Rather, it provides an extension point to call different InformerFactory functions for each of the Informers.

It does this by replacing the actual call to <SomePackage>.NewSharedInformerFactoryWithOptions(...) function with a function definition. The function implementation makes the actual call to <SomePackage>.NewSharedInformerFactoryWithOptions(...).

The PR provides a default function which behaves the same way as the current implementation, which is what you referenced.

The crux of the changes are where the function definition is defined:

type KubeInformerFunc func(namespaces []string, kubeClient kubeclientset.Interface) kubeinformers.SharedInformerFactory
type MpiJobInformerFunc func(namespaces []string, mpiJobClient mpijobclientset.Interface) informers.SharedInformerFactory
type VolcanoInformerFunc func(namespaces []string, volcanoClient volcanoclient.Interface) volcanoinformers.SharedInformerFactory
type SchedulerPluginsInformerFunc func(namespaces []string, schedClient schedclientset.Interface) schedinformers.SharedInformerFactory
type InformerOptions struct {
KubeInformer KubeInformerFunc
MpiJobInformer MpiJobInformerFunc
VolcanoInformer VolcanoInformerFunc
SchedulerPluginsInformer SchedulerPluginsInformerFunc
}

Here is where / how it's used:

kubeInformerFactory := opt.KubeInformer(namespaces, kubeClient)
mpiJobInformerFactory := opt.MpiJobInformer(namespaces, mpiJobClientSet)

For comparison, the current implementation:

kubeInformerFactory := kubeinformers.NewSharedInformerFactoryWithOptions(kubeClient, 0, kubeInformerFactoryOpts...)
kubeflowInformerFactory := informers.NewSharedInformerFactoryWithOptions(mpiJobClientSet, 0, kubeflowInformerFactoryOpts...)

Considering the scenario, the thought process and rationale for these changes:

Background:

  • Current Informer packages uses Kubernetes codegen tools.
  • Currently, the codegen tools create Informers that watch either one or all namespaces.
  • There's a package xns-informer1 which is built upon Kubernetes codegen tools that creates Informers for multiple namespaces. Perhaps there are others.
  • The Informers created by the xns-informer package are experimental and are usually – but not guaranteed to be – API compatible2.

Options:

  1. Determine that the scenario is not supportable. It's either one namespace or all namespaces.
  2. Bring in the xns-informer package (or another package) as a dependency, and use its codegen tools to create Informer packages.
  3. Some middle ground: keep everything API compatible, while still providing a way to bring in a different InformerFactory.

Thoughts:

  • The first option is the underlying issue being addressed!
  • The second option seems a bit heavy-handed: bringing in a third-party experimental library seems counter-intuitive.
  • The third option seems like a better one: leave all functionality and implementation as is; everything remains API compatible; and extension points are added. If one wants to use experimental libraries, they can just extend the mpi-operator by hooking into it, while leaving everything else intact.

Usability:

In order to make use of the extension points, a new package is needed, which is composed using the mpi-operator3. The upside to this is that one is free to swap InformerFactory implementations. The downside is that these implementations live outside of the mpi-operator project and package.

There's probably plenty more pros and cons which haven't even been touched upon ... Perhaps there's a some better way to go about this ...?

Footnotes

  1. https://github.com/maistra/xns-informer
  2. https://github.com/maistra/xns-informer?tab=readme-ov-file#cross-namespace-informers
  3. Example: https://github.com/emsixteeen/mpi-operator-xns

Thank you for this clarifications.
This means that you want to implement a separate mpi-operator with x-ns informers in external repositories like https://github.com/emsixteeen/mpi-operator-xns.
So, you want to export informer factories in this mpi-operator repository, right?

TBH, I don't like such extension points for external mpi-operator.
I believe that we should make an effort in this upstream mpi-operator.

Also, if it's challenging to make efforts for you in this repository or upstream kubernetes, you can fork this repository, and you can directly improve the forked mpi-operator with x-ns informers.

@alculquicondor WDYT?

@tenzen-y
Copy link
Member

tenzen-y commented Feb 15, 2024

Also, I guess that switching to controller-runtime would be possible to create x-ns informers like this: https://github.com/kubernetes-sigs/controller-runtime/blob/56159419231e985c091ef3e7a8a3dee40ddf1d73/pkg/cache/cache.go#L223-L225

But, I'm not sure that the controller-runtime will really resolve this issue.

@emsixteeen
Copy link
Contributor Author

But, I'm not sure that the controller-runtime will really resolve this issue.

I don't think it will resolve it ... I haven't checked the controller-runtime code too deeply, but I'm assuming that it uses that standard client-go (i.e. generated) Informers.

If it's the case – that it's using the standard generated Informers – then that's the key issue: an Informer will List either one namespace or all namespaces. The filtering is done post-facto – i.e. the Informer will get objects from all namespaces, but the filter will only act upon interested namespace(s).

The catch there is that if RBAC rules don't allow for some item to be listed across all namespaces, the Informer will keep getting a permission error, and the HasSycned calls won't return true.

@emsixteeen
Copy link
Contributor Author

Thank you for this clarifications. This means that you want to implement a separate mpi-operator with x-ns informers in external repositories like https://github.com/emsixteeen/mpi-operator-xns. So, you want to export informer factories in this mpi-operator repository, right?

It's not that I'm looking to implement another operator, inasmuch as be afforded functionality that is not available in the current design. (Noting that the current design is more an artifact of how Kubernetes codegen works, and not directly related to the design of the mpi-operator itself.)

TBH, I don't like such extension points for external mpi-operator. I believe that we should make an effort in this upstream mpi-operator.

Totally understand... The main reason this PR creates extension points was so that there would be no need to bring in (and create dependencies upon) experimental packages (such as xns-informer, or others).

On one hand, creating extension points can become a liability in the long term (needing to support it, side effects, risking breaking other things down the road, etc.) ... On the other hand, I don't know of another way to do this without creating additional dependencies and also obtaining this functionality (barring upstream code changes in how client-go and Kuberentes codegen works). Having these extension points though, creates no change in functionality in the upstream operator. Potential issues / side effects can be introduced if the extension points were used – but that would be the responsibility of the extender. In other words, even with these extension points in place, nothing changes in the upstream operator.

Regardless, I'm completely open to bringing this functionality directly into [this] upstream mpi-operator... What might be some next-best steps, or ideas, to implementing this? Is it to OK using to use Informers that are codegen'd by something other than Kubernetes codgen? (Again, the initial focus was to not bring in additional dependencies that are out of the core Kubernetes ecosystem.)

Here's an early prototype of what bringing in xns-informer generated Informers might look like: https://github.com/emsixteeen/mpi-operator/tree/xns-informers

Also, if it's challenging to make efforts for you in this repository or upstream kubernetes, you can fork this repository, and you can directly improve the forked mpi-operator with x-ns informers.

Given the value the project has delivered, I'd really rather the effort be applied to the upstream project...

@alculquicondor WDYT?

@tenzen-y
Copy link
Member

@emsixteeen I meant that I disagree with both exporting Informer for external projects and using the xns-informer in this project.
If you want to use the features that have xns-informer, I think you should discuss to add the features to kubernetes core features in the upstream Kubernetes.

If you don't want to such a discussion in the kubernetes core project, you can just fork this repository and then you can extend the mpi-operator using the xns-informer.

I think importing not maintainable library (xns-informer) should avoid as much as possible, and we should not increase maintenance costs by exporting functions for the external projects.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants