fix: provisionning limits counter should be init without num executors #1620

Open: heurtematte wants to merge 1 commit into base: master

heurtematte commented Nov 13, 2024

The concurrency limit set in the cloud configuration is not interpreted correctly.

For example, if the limit is set to 2, only one executor can run at a time. In effect, the limit currently behaves as n-1.

This happens because numExecutors is applied twice in the class KubernetesProvisioningLimits:

  • In the initInstance method with node.getNumExecutors()
  • In the register method with param numExecutors
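
To make the reported n-1 effect concrete, here is a minimal, self-contained sketch of a counter that is seeded lazily and then incremented again on registration. It is a toy model, not the plugin's code: the class DoubleCountSketch, its methods, and the assumption that the agent being registered is already visible when the lazy initialization runs are all hypothetical and only illustrate one way the double counting described above could arise.

```java
import java.util.List;

/** Toy model of a concurrency limit counter. NOT the plugin's KubernetesProvisioningLimits. */
class DoubleCountSketch {
    private final int limit;      // configured concurrency limit
    private int count;            // executors currently counted against the limit
    private boolean initialized;  // whether the lazy seeding has already run

    DoubleCountSketch(int limit) {
        this.limit = limit;
    }

    /** Hypothetical stand-in for initInstance: seeds the counter once from known agents. */
    private void initOnce(List<Integer> knownAgentExecutors) {
        if (initialized) {
            return;
        }
        for (int executors : knownAgentExecutors) {
            count += executors;
        }
        initialized = true;
    }

    /** Hypothetical stand-in for register: reserves capacity if the limit allows it. */
    synchronized boolean register(List<Integer> knownAgentExecutors, int numExecutors) {
        initOnce(knownAgentExecutors);
        if (count + numExecutors > limit) {
            return false;
        }
        count += numExecutors;
        return true;
    }

    synchronized int count() {
        return count;
    }

    public static void main(String[] args) {
        DoubleCountSketch limits = new DoubleCountSketch(2);
        // Assumption: the first agent is already in the known-agent list when the lazy
        // seeding runs, so its single executor is counted by the seeding AND by register.
        boolean first = limits.register(List.of(1), 1);
        boolean second = limits.register(List.of(1, 1), 1);
        System.out.println("first=" + first + " second=" + second + " count=" + limits.count());
    }
}
```

Under these assumptions, a limit of 2 is exhausted by a single one-executor agent, which matches the behavior described in this report. Whether the real class follows this exact sequence would need to be confirmed against the plugin source.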

initInstance method:

register method:

heurtematte requested a review from a team as a code owner Nov 13, 2024, 14:01
heurtematte changed the title from "fix: provisionning limits counter should be init wihtout num executors" to "fix: provisionning limits counter should be init without num executors" Nov 13, 2024
Vlatombe (Member) commented

Hello,

I would advise you to write a test that demonstrates the problem you are facing. Your current patch removes the count that occurs at startup, which means that any agent that existed before startup won't be taken into account by the limit logic.
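
The concern about pre-existing agents can be illustrated with a small, self-contained sketch. It is a hypothetical model, not the plugin's implementation: the class StartupSeedSketch and its seedAtStartup flag are invented here purely to compare a counter that is seeded at startup with one that is not.

```java
import java.util.List;

/** Toy comparison of a limit counter with and without startup seeding. Hypothetical names. */
class StartupSeedSketch {
    private final int limit;  // configured concurrency limit
    private int count;        // executors currently counted against the limit

    /**
     * @param preExistingExecutors executor counts of agents that already existed before startup
     * @param seedAtStartup        whether those agents are counted against the limit
     */
    StartupSeedSketch(int limit, List<Integer> preExistingExecutors, boolean seedAtStartup) {
        this.limit = limit;
        if (seedAtStartup) {
            for (int executors : preExistingExecutors) {
                count += executors;
            }
        }
    }

    /** Reserves capacity for a new agent if the limit allows it. */
    synchronized boolean register(int numExecutors) {
        if (count + numExecutors > limit) {
            return false;
        }
        count += numExecutors;
        return true;
    }

    public static void main(String[] args) {
        // Two one-executor agents are assumed to be running already when the controller starts.
        List<Integer> survivors = List.of(1, 1);
        StartupSeedSketch seeded = new StartupSeedSketch(2, survivors, true);
        StartupSeedSketch unseeded = new StartupSeedSketch(2, survivors, false);
        System.out.println("seeded accepts new agent: " + seeded.register(1));
        System.out.println("unseeded accepts new agent: " + unseeded.register(1));
    }
}
```

In this model, the unseeded counter accepts a third agent even though the limit of 2 is already used up by the agents that were running before startup, which is the gap described above. A real test would have to exercise the plugin's own classes rather than this stand-in.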

pieterjanpintens commented

We also see weird behavior with this setting. It seems that the internal counter is not always in sync with reality.
We have a limit of 20 concurrent nodes. Our master claims that 20 nodes are up, while fewer than 10 are actually connected. This blocks jobs, because their executors cannot find nodes to run on.

jglick (Member) commented Jul 18, 2025

"the internal counter is not always in sync with reality"

FWIW I would not advise using this feature of the plugin at all. Use K8s namespace quotas instead.
