feature: add scalePolicy in lws #213
Comments
/assign
/kind api-change
@ahg-g @liurupeng @kerthcet Any ideas or suggestions?
But how would it scale? Wouldn't we also depend on the HPA implementation? If so, I don't think this fits LWS as a plain workload. Some higher-level orchestration tools may need this for easy configuration; if you're interested, take a look at InftyAI/llmaz#3, where we're aiming for simplified configuration.
Right, why do we need this if we have HPA?
Thanks! My thought is to integrate HPA through the API in LWS (just like we use a headless service), so that users wouldn't need to configure HPA separately and could manage it directly from within LWS.
The headless service case is different: it is an implementation detail of enabling pod-to-pod communication, and it is not easy to set up independently because in some cases a service needs to be created per replica. Pod-to-pod communication is important for LWS because it aims to handle distributed inference. LWS is like Deployment: both are APIs for deploying the replicas of a workload. Autoscaling and load balancing are intentionally not baked into deployment APIs, to enable composability. They are features that can be set up in different ways: autoscaling can be done using HPA or KEDA; load balancing can be set up using Service or Gateway API. The deployment API (Deployment or LWS) should not impose a specific way of doing that, and hence it is not in its scope.
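To illustrate the composable setup described above, here is a minimal sketch of a standalone HPA that targets an LWS through its scale subresource. It only builds the manifest as a Python dict and prints it as JSON. The helper name `build_hpa_manifest`, the resource name `my-lws`, the CPU utilization metric, and the replica bounds are all hypothetical; the `leaderworkerset.x-k8s.io/v1` API version is an assumption that should be checked against the installed CRD.

```python
import json


def build_hpa_manifest(lws_name, min_replicas, max_replicas, cpu_utilization):
    """Return an autoscaling/v2 HPA manifest (as a dict) that targets an LWS.

    Hypothetical helper for illustration; it does not talk to a cluster.
    """
    if min_replicas < 1 or max_replicas < min_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{lws_name}-hpa"},
        "spec": {
            # The HPA scales the workload through its scale subresource.
            "scaleTargetRef": {
                # Assumed group/version; verify against the installed CRD.
                "apiVersion": "leaderworkerset.x-k8s.io/v1",
                "kind": "LeaderWorkerSet",
                "name": lws_name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_utilization,
                        },
                    },
                }
            ],
        },
    }


# Hypothetical values: scale "my-lws" between 1 and 4 replicas at 70% CPU.
manifest = build_hpa_manifest("my-lws", 1, 4, 70)
print(json.dumps(manifest, indent=2))
```

Because the autoscaler lives in its own object, the same LWS could instead be scaled by KEDA or another controller without any change to the LWS spec.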
Thank you all for your suggestions. I will close this issue. :)
What would you like to be added:
I want to implement a feature similar to scalePolicy; the API is as follows.
We can implement the scale subresource (it seems to have been implemented now) and embed HPA, so that users can use scalePolicy directly without using HPA separately.
This would involve an API change; I'll contribute a KEP if needed.
Why is this needed:
Completion requirements:
This enhancement requires the following artifacts:
The artifacts should be linked in subsequent comments.