Skip to content

Conversation

zhuo-zhi
Copy link

@zhuo-zhi zhuo-zhi commented May 17, 2025

Why are the changes needed?

Support Volcano scheduler integration by enabling PodGroup creation for podTask in Propeller. Volcano is a Kubernetes-native batch scheduler optimized for high-performance, AI/ML, and big data workloads.

Before this change, all podTasks associated with a FlyteWorkflow shared a single PodGroup, which was automatically created just once by the Volcano controller when the first podTask was launched. Subsequent podTasks reused this PodGroup, which introduced limitations. Specifically, different podTasks could not be scheduled with different Volcano queues or job priorities, since the shared PodGroup enforced a single queue and priority for all.

This change enables Propeller to create individual Volcano PodGroups for each podTask, allowing them to specify distinct queues and job priorities for scheduling.

What changes were proposed in this pull request?

  1. The plugin manager for the k8s pod plugin creates Volcano PodGroups before creating the corresponding pods for podTasks. The PodGroup will have the same name as the pod, ensuring unique mapping between PodGroup and Pod. It also adds the PodGroup name to the pod’s annotations, ensuring that the Volcano PodGroup controller does not create the PodGroup again.
  2. Each PodGroup inherits the queue and priority from its Pod.
  3. Introduced a feature flag enable-create-pod-group-for-pod in the config to control whether this feature is enabled. The default value is false to ensure backward compatibility.
  4. Regarding PodGroup garbage collection, the PodGroup is deleted by Kubernetes' GC. Since the PodGroup's owner reference points to a FlyteWorkflow, the deletion process follows a cascading effect. When the FlyteWorkflow is deleted, Kubernetes GC removes the associated PodGroup due to the absence of a living owner reference.

How was this patch tested?

Unit tests added
Used internally at Linkedin.

Labels

Please add one or more of the following labels to categorize your PR:

  • added: For new features.
  • changed: For changes in existing functionality.
  • deprecated: For soon-to-be-removed features.
  • removed: For features being removed.
  • fixed: For any bug fixed.
  • security: In case of vulnerabilities

This is important to improve the readability of release notes.

Setup process

Screenshots

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

Docs link

Summary by Bito

This pull request enhances the Volcano scheduler support by allowing the creation of individual PodGroups for each podTask in Propeller, improving scheduling flexibility with distinct queues and job priorities. A new configuration option ensures backward compatibility, and relevant unit tests have been added to validate and verify the new functionality.

Copy link

welcome bot commented May 17, 2025

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

@zhuo-zhi
Copy link
Author

Hi reviewers, could you please help me add the added label to this PR? It seems I don’t have the access. Thank you!

@Sovietaced Sovietaced added the added Merged changes that add new functionality label May 17, 2025
Copy link

codecov bot commented May 17, 2025

Codecov Report

Attention: Patch coverage is 60.00000% with 32 lines in your changes missing coverage. Please review.

Project coverage is 58.50%. Comparing base (8e0416d) to head (18b4387).
Report is 17 commits behind head on master.

Files with missing lines Patch % Lines
...er/pkg/controller/nodes/task/k8s/plugin_manager.go 62.33% 23 Missing and 6 partials ⚠️
flyteplugins/go/tasks/plugins/k8s/pod/plugin.go 0.00% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6457      +/-   ##
==========================================
+ Coverage   58.47%   58.50%   +0.02%     
==========================================
  Files         940      940              
  Lines       71584    71658      +74     
==========================================
+ Hits        41861    41920      +59     
- Misses      26540    26551      +11     
- Partials     3183     3187       +4     
Flag Coverage Δ
unittests-datacatalog 59.03% <ø> (ø)
unittests-flyteadmin 56.22% <ø> (-0.02%) ⬇️
unittests-flytecopilot 30.99% <ø> (ø)
unittests-flytectl 64.72% <ø> (ø)
unittests-flyteidl 76.12% <ø> (ø)
unittests-flyteplugins 61.06% <0.00%> (+0.11%) ⬆️
unittests-flytepropeller 54.81% <62.33%> (+0.02%) ⬆️
unittests-flytestdlib 64.04% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@Sovietaced
Copy link
Member

Without looking to deeply at this I feel that the plugin manager code probably shouldn't be doing exceptional work for a third party Kubernetes scheduler. My intuition is that you would want to create a new plugin for Volcano and configure your Propeller instance to use that as the default plugin.

@zhuo-zhi
Copy link
Author

Without looking to deeply at this I feel that the plugin manager code probably shouldn't be doing exceptional work for a third party Kubernetes scheduler. My intuition is that you would want to create a new plugin for Volcano and configure your Propeller instance to use that as the default plugin.

@Sovietaced Thanks for the review.

All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.

This feature is primarily designed to support the current k8s Pod plugin, and more broadly, any future plugins that launch Pod resources. If I understand correctly, the current code structure assumes each plugin is responsible for managing only one k8s object. This assumption is why a new plugin doesn't work because additional logic is required in the plugin manager to create PodGroups before launching Pods.

For other plugins such as TFJob, PyTorchJob, and MPIJob, their corresponding operators are already integrated with Volcano and can create PodGroups on their own.

Signed-off-by: zhuo-zhi <[email protected]>
}
}

err = backOffHandler.Handle(ctx, func() error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about this case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As before, this error will still be handled by the logic a few lines below.

@Sovietaced
Copy link
Member

All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.

I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano.

I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.

Signed-off-by: zhuo-zhi <[email protected]>
@zhuo-zhi
Copy link
Author

zhuo-zhi commented May 18, 2025

All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.

I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano.

I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.

I see your point about making the plugin manager code more generic. I'm not sure how much effort it would take, or whether there are concrete plans to rearchitect the Flyte plugins to support building multiple resources. In the meantime, would it be possible to include this change in an earlier release so users can opt in earlier if needed? It could also help validate the effectiveness of a potential future rearchitecture. More feedback from the Flyte core maintainers would also be greatly appreciated.

@davidmirror-ops
Copy link
Contributor

davidmirror-ops commented Jun 3, 2025

@zhuo-zhi can we add docs here on how to prepare the K8s cluster for this integration?

@davidmirror-ops davidmirror-ops added the lgtm This PR has been approved by a maintainer label Jun 3, 2025
Signed-off-by: zhuo-zhi <[email protected]>
@zhuo-zhi
Copy link
Author

zhuo-zhi commented Jun 9, 2025

@zhuo-zhi can we add docs here on how to prepare the K8s cluster for this integration?

@davidmirror-ops Thanks for the review. I've created a separate PR to document the preparation steps: unionai/unionai-docs#372

return podRequestedResources
}

// getRequestResource will get the actual request resource.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this module is not the right place to add this.

@fg91
Copy link
Member

fg91 commented Jun 9, 2025

All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.

I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano.

I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.

I agree with everything @Sovietaced said, I also feel we cannot have such specialized logic in the plugin manager itself, even with a feature flag :/

@zhuo-zhi
Copy link
Author

zhuo-zhi commented Jun 11, 2025

All the changes introduced in this PR are gated behind a configuration flag, which is disabled by default. This ensures that the plugin manager will not perform this functionality unless explicitly enabled.

I understand. My main concern is that core flyte propeller internals shouldn't have references to third party elements like volcano. You're using a feature flag to branch the logic in the generic plugin manager code to do something very specific for volcano.
I will wait for more of the Flyte core maintainers to look at this but I'm thinking it might make more sense to rearchitect Flyte to allow plugins to build multiple resources.

I agree with everything @Sovietaced said, I also feel we cannot have such specialized logic in the plugin manager itself, even with a feature flag :/

@fg91 @Sovietaced Thanks for the insights. I understand your concern. I’m wondering if there are any actionable steps you’d suggest I take to help move the PR forward?

Rearchitecting Flyte to support plugins that build multiple resources appears to be a significant change, as it would likely require refactoring all existing plugins along with the plugin manager. If that’s the only viable path for integration, would it be possible for me to get some support on this?

@Sovietaced
Copy link
Member

@fg91 @Sovietaced Thanks for the insights. I understand your concern. I’m wondering if there are any actionable steps you’d suggest I take to help move the PR forward?

I would defer to the Union folks or people that understand the Propeller architecture deeply.

@fg91
Copy link
Member

fg91 commented Jun 12, 2025

I agree that re-architecting the plugin interface to allow for "extra resources" would be a bigger effort. I don't see a clean way to integrate the creation of resources like PodGroup into Flyte without this though.

The adapted plugin interface could in theory look like this:

type Plugin interface {
	// Defines a func to create a query object (typically just object and type meta portions) that's used to query k8s
	// resources.
	BuildIdentityResource(ctx context.Context, taskCtx pluginsCore.TaskExecutionMetadata) (client.Object, error)

	// Defines a func to create the full resource object that will be posted to k8s.
	BuildResource(ctx context.Context, taskCtx pluginsCore.TaskExecutionContext) (client.Object, error)

	// Analyses the k8s resource and reports the status as TaskPhase. This call is expected to be relatively fast,
	// any operations that might take a long time (limits are configured system-wide) should be offloaded to the
	// background.
	GetTaskPhase(ctx context.Context, pluginContext PluginContext, resource client.Object) (pluginsCore.PhaseInfo, error)

	// Properties desired by the plugin
	GetProperties() PluginProperties

        // New
        // Defines a func to create query objects for the extra resources e.g. for aborting
        BuildExtraIdentityResources(...) ([]client.Object, error)
        
        // Defines a func to create the full extra resource objects that will be posted to k8s.
        BuildExtraResources(...) ([]client.Object, error)
}

Which extra resources would be created would depend on the user configuration for the plugin returned by
GetProperties().

Flytepropeller would not do any state monitoring for those extra resources, just create them along the creation of the main task resource. And it would delete them on abort. In theory we could omit the BuildExtraIdentityResources if propeller injected an owner reference to the main task object into the extra resources so they would get cleaned up automatically if the main task object was deleted. That would leave us with the additional BuildExtraResources.

I don't think the existing plugins would create that much work because their respective BuildExtraResources functions would just do nothing initially. But the creation of the resources (and their deletion or injection of owner refs) would have to be implemented in the k8s plugin.

All this is just brainstorming though, something like this would definitely require an RFC (I could help with the process) and it also definitely would need early input from Union folks like @eapolinario @wild-endeavor, @kumare3.


Alternative approach:

If your main goal was not officially integrating Flyte with Volcano but you/your organization being able to use Flyte pod tasks with Volcano, I think a much lower hanging fruit (albeit not an official integration into flyte) would be to handle the PodGroup creation in a (mutating) admission webhook that gets triggered when flytepropeller creates the pods. You could deploy such a service along side Flyte. I've done something like this in the past for other resources.

@kumare3
Copy link
Contributor

kumare3 commented Jun 17, 2025

Cc @troychiu this was the one

@davidmirror-ops davidmirror-ops requested a review from troychiu July 9, 2025 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

added Merged changes that add new functionality lgtm This PR has been approved by a maintainer triage/discuss

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants