[Improvement-17157][Master] Support setting max.concurrent.workflow.instances #17159

Open
wants to merge 11 commits into
base: dev

Conversation

tusaryan
Contributor

@tusaryan tusaryan commented May 1, 2025

Purpose of the pull request

This pull request adds a new configuration option, max.concurrent.workflow.instances, to the master server. It lets administrators limit the number of workflow instances that run concurrently on a single master. This addresses the potential overload that occurs when multiple workflows fail over to a single master, as described in #17157.

close #17157

Brief change log

  • Added a new configuration property master.server-load-protection.max-concurrent-workflow-instances in application.yaml under the server-load-protection section. This property is optional and defaults to 2147483647 (Integer.MAX_VALUE, effectively no limit) if not set. To set a specific limit, for example:
    master.server-load-protection.max-concurrent-workflow-instances=100 # set your desired limit here
    Replace 100 with the integer value you want as the limit.
  • Modified MasterServerLoadProtection.java to read and utilize the new configuration property.
  • Changed from using WorkflowCacheRepository to IWorkflowRepository for better abstraction and testability.
  • Implemented logic in MasterServerLoadProtection to check the current count of running workflow instances against the configured limit.
  • Updated the isOverload method in MasterServerLoadProtection to return true if the concurrent workflow instance count exceeds the configured maximum, marking the master as busy.
  • Updated MasterServerLoadProtectionConfig.java to properly inject IWorkflowRepository into the MasterServerLoadProtection instance.
  • Updated MasterServerLoadProtectionTest.java to include a test case verifying the new functionality where the master is marked as overloaded when the concurrent workflow instance count exceeds the configured limit.
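
The overload check described in the change log could look roughly like the sketch below. It is plain Java with no Spring or DolphinScheduler dependencies. `RunningWorkflowCounter` and `LoadProtectionSketch` are hypothetical stand-ins for `IWorkflowRepository` and `MasterServerLoadProtection`, and the `>=` comparison is an assumption taken from the test description in this PR, not the actual implementation.

```java
// Hypothetical stand-in for the repository abstraction; the real
// IWorkflowRepository API is not shown in this PR description.
interface RunningWorkflowCounter {
    int countRunningWorkflowInstances();
}

// Hypothetical stand-in for MasterServerLoadProtection.
class LoadProtectionSketch {
    private final RunningWorkflowCounter counter;
    private final int maxConcurrentWorkflowInstances;

    LoadProtectionSketch(RunningWorkflowCounter counter, int maxConcurrentWorkflowInstances) {
        this.counter = counter;
        this.maxConcurrentWorkflowInstances = maxConcurrentWorkflowInstances;
    }

    // Returns true when the master should be reported as busy.
    boolean isOverload() {
        int current = counter.countRunningWorkflowInstances();
        // Assumption: reaching the limit already counts as overloaded.
        return current >= maxConcurrentWorkflowInstances;
    }

    public static void main(String[] args) {
        // A lambda acts as a fake repository, which keeps the check easy to test.
        LoadProtectionSketch limited = new LoadProtectionSketch(() -> 120, 100);
        LoadProtectionSketch unlimited = new LoadProtectionSketch(() -> 120, Integer.MAX_VALUE);
        System.out.println("limit=100: " + limited.isOverload());
        System.out.println("default:   " + unlimited.isOverload());
    }
}
```

Depending on an interface rather than a concrete cache class is what allows the count to be faked in a unit test, which matches the stated motivation for switching to IWorkflowRepository.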

Verify this pull request

This change added tests and can be verified as follows:

  • Updated MasterServerLoadProtectionTest.java with a new test method (isOverloadWithMaxConcurrentWorkflowInstances) to specifically test the behavior when the number of running workflow instances reaches or exceeds the configured max.concurrent.workflow.instances limit.
  • Explanation of the test verification:
    • When the workflow count (5) is less than the max limit (10), the master is not overloaded
    • When the workflow count (5) equals the max limit (5), the master is marked as overloaded
    • When the workflow count (5) exceeds the max limit (3), the master is marked as overloaded
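
The three cases above can be checked with a small table-driven sketch. `OverloadBoundaryCheck` and its `isOverload` helper are hypothetical and only mirror the comparison the cases imply; this is not the JUnit test added in the PR.

```java
// Hypothetical helper mirroring the comparison the listed test cases imply.
class OverloadBoundaryCheck {
    static boolean isOverload(int runningCount, int maxAllowed) {
        return runningCount >= maxAllowed;
    }

    public static void main(String[] args) {
        // Each row: {running count, configured limit, expected overload (1 = true, 0 = false)}
        int[][] cases = { {5, 10, 0}, {5, 5, 1}, {5, 3, 1} };
        for (int[] c : cases) {
            boolean expected = c[2] == 1;
            boolean actual = isOverload(c[0], c[1]);
            if (actual != expected) {
                throw new AssertionError("count=" + c[0] + " limit=" + c[1] + " expected " + expected);
            }
            System.out.printf("count=%d limit=%d overloaded=%b%n", c[0], c[1], actual);
        }
    }
}
```

The second row is the one that pins down the boundary: a strict `>` comparison would pass the first and third cases but fail when the count equals the limit.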

fix issue: [Improvement][Master] Support set max.concurrent.workflow.instances in master #17157

Pull Request Notice

If your pull request contains an incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md

@tusaryan tusaryan marked this pull request as draft May 1, 2025 14:50
@tusaryan tusaryan marked this pull request as ready for review May 1, 2025 14:50
@SbloodyS SbloodyS changed the title [Fix-17157][Master] Support setting max.concurrent.workflow.instances [Improvement-17157][Master] Support setting max.concurrent.workflow.instances May 5, 2025
@SbloodyS SbloodyS added the improvement make more easy to user or prompt friendly label May 5, 2025
@SbloodyS SbloodyS added this to the 3.3.1 milestone May 5, 2025

sonarqubecloud bot commented May 5, 2025

Member

@ruanwenjun ruanwenjun left a comment

LGTM, and it's better to expose the activeWorkflowInstanceCount in MasterHeartBeat.

@tusaryan
Contributor Author

tusaryan commented May 7, 2025

Can anyone review the changes again?

@tusaryan tusaryan requested a review from ruanwenjun May 7, 2025 21:29
@PostConstruct
public void init() {
    MasterServerLoadProtection serverLoadProtection = masterConfig.getServerLoadProtection();
    serverLoadProtection.setWorkflowRepository(workflowRepository);
Member

It's better to inject at the constructor.

Member

Is this resolved? Why mark this as resolved?

@tusaryan tusaryan requested a review from ruanwenjun May 8, 2025 08:53
@nielifeng nielifeng requested a review from Copilot May 9, 2025 09:44

@Copilot Copilot AI left a comment

Pull Request Overview

This PR introduces a new configuration option "max.concurrent.workflow.instances" to limit the number of concurrently running workflow instances on a single master. Key changes include adding a new property in the YAML configuration, updating MasterServerLoadProtection to use IWorkflowRepository for counting workflow instances, and incorporating tests to verify overload behavior.

  • Added test cases in MasterServerLoadProtectionTest.java to validate overload conditions.
  • Updated MasterServerLoadProtection.java to enforce the new concurrent workflow limit.
  • Modified MasterServerLoadProtectionConfig.java to inject the IWorkflowRepository dependency.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
dolphinscheduler-master/src/test/java/org/apache/dolphinscheduler/server/master/config/MasterServerLoadProtectionTest.java Added test cases for the overload behavior with the new configuration option.
dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/config/MasterServerLoadProtectionConfig.java Updated to inject IWorkflowRepository into the load protection component.
dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/config/MasterServerLoadProtection.java Integrated the new "max.concurrent.workflow.instances" property checking and logging for overload conditions.
Comments suppressed due to low confidence (1)

dolphinscheduler-master/src/main/java/org/apache/dolphinscheduler/server/master/config/MasterServerLoadProtection.java:52

  • [nitpick] The log message contains a minor typo with 'over then' which could be corrected to 'exceeds' for better clarity.
log.info("OverLoad: the workflow instance count: {} is over then the maxConcurrentWorkflowInstances {}", currentWorkflowInstanceCount, maxConcurrentWorkflowInstances);

@ruanwenjun
Member

Please fix the code style check by running mvn spotless:apply

@tusaryan
Contributor Author

I wanted to let you know that I'm currently traveling. Because of this, I might be a bit delayed in addressing the feedback on this pull request. I will definitely get to it as soon as things settle down for me.

@tusaryan
Contributor Author

Please fix the code style check by running mvn spotless:apply

Code style issues have been fixed by running mvn spotless:apply. Please let me know if there's anything else I can do!

@ruanwenjun ruanwenjun force-pushed the improvement/17157-master-concurrent-limit branch from 0e05e2b to 8b5361f Compare May 16, 2025 10:24
Member

@ruanwenjun ruanwenjun left a comment

LGTM

@ruanwenjun ruanwenjun force-pushed the improvement/17157-master-concurrent-limit branch from bd4d946 to 4fdac71 Compare May 27, 2025 01:27
@Bean
public MasterServerLoadProtection masterServerLoadProtection(
        IWorkflowRepository workflowRepository,
        @Value("${master.server-load-protection.max-concurrent-workflow-instances:2147483647}") int maxConcurrentWorkflowInstances) {
Member

We need to add this config to the docs, and you need to set the other fields in MasterServerLoadProtection, e.g. maxSystemCpuUsagePercentageThresholds, maxJvmCpuUsagePercentageThresholds.

Contributor Author

We need to add this config to the docs, and you need to set the other fields in MasterServerLoadProtection, e.g. maxSystemCpuUsagePercentageThresholds, maxJvmCpuUsagePercentageThresholds.

Thanks for the feedback! I’ll get to it as soon as possible.

Contributor Author

Sorry for the delay. I will update it within 1-2 days.

tusaryan and others added 8 commits June 9, 2025 09:34
- Refactored MasterServerLoadProtection to require IWorkflowRepository and maxConcurrentWorkflowInstances via constructor injection.
- Updated MasterServerLoadProtectionConfig to provide these dependencies using Spring @Bean and @Value with a default of 2147483647.
- Removed direct instantiation of MasterServerLoadProtection in MasterConfig to avoid missing constructor arguments.
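
As a rough illustration of the default used in this commit, the sketch below resolves the property with a fallback of 2147483647 (Integer.MAX_VALUE). It uses `java.util.Properties` instead of Spring so that it runs on its own; `MaxInstancesConfigSketch` and `resolveMaxConcurrentWorkflowInstances` are hypothetical names, and only the property key and the default value come from this PR.

```java
import java.util.Properties;

// Hypothetical helper; approximates the ":2147483647" fallback in the @Value placeholder.
class MaxInstancesConfigSketch {
    static final String KEY = "master.server-load-protection.max-concurrent-workflow-instances";

    static int resolveMaxConcurrentWorkflowInstances(Properties props) {
        // Fall back to Integer.MAX_VALUE (2147483647) when the property is not set.
        String raw = props.getProperty(KEY, String.valueOf(Integer.MAX_VALUE));
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties unset = new Properties();

        Properties configured = new Properties();
        configured.setProperty(KEY, "100");

        System.out.println("unset:      " + resolveMaxConcurrentWorkflowInstances(unset));
        System.out.println("configured: " + resolveMaxConcurrentWorkflowInstances(configured));
    }
}
```

Passing the resolved value through a constructor, as the commit message describes, means the load protection object cannot be created without both the repository and the limit.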
@ruanwenjun ruanwenjun force-pushed the improvement/17157-master-concurrent-limit branch from 4fdac71 to 3f0ec17 Compare June 9, 2025 01:34
Labels
backend improvement make more easy to user or prompt friendly test
Development

Successfully merging this pull request may close these issues.

[Improvement][Master] Support set max.concurrent.workflow.instances in master
4 participants