Skip to content

Initial InferencePool Status for Inference Extension EPP Plugin #11230

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jun 13, 2025

Conversation

danehans
Copy link
Contributor

@danehans danehans commented May 16, 2025

Description

Adds support for InferencePool status to the inference extension endpointpicker plugin.

Change Type

/kind new_feature

Changelog

Adds InferencePool status management to Inference Extension endpointpicker (EPP) Plugin.

Fixes #11306

@danehans danehans marked this pull request as draft May 16, 2025 20:13
@github-actions github-actions bot added kind/feature Categorizes issue or PR as related to a new feature. release-note labels May 16, 2025
@danehans danehans force-pushed the epp_plugin_status branch 6 times, most recently from d07adff to eef2050 Compare May 28, 2025 18:46
@danehans danehans marked this pull request as ready for review May 28, 2025 18:55
@danehans danehans requested review from lgadban and yuval-k May 28, 2025 20:42
@danehans
Copy link
Contributor Author

Note that I implemented InferencePool status management in the plugin instead of the controllers package because I plan to remove the inference extension EPP deployer.

@lgadban lgadban linked an issue May 29, 2025 that may be closed by this pull request
@danehans danehans force-pushed the epp_plugin_status branch 3 times, most recently from 6a2b21a to 647f10b Compare June 3, 2025 16:19
@danehans danehans changed the title Adds InferencePool Status to Inference Extension EPP Plugin Initial InferencePool Status to Inference Extension EPP Plugin Jun 3, 2025
@danehans danehans changed the title Initial InferencePool Status to Inference Extension EPP Plugin Initial InferencePool Status for Inference Extension EPP Plugin Jun 3, 2025
@shashankram shashankram self-assigned this Jun 6, 2025
Comment on lines 348 to 352
aggErrs.Write([]byte(prologue))
for _, err := range errs {
aggErrs.Write([]byte(` "`))
aggErrs.Write([]byte(err.Error()))
aggErrs.Write([]byte(`"`))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer errors delimited by a semicolon or \n (errors.Join). Why quote this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't use errors.Join since it creates a new line between errors, which does not render well in status messages. I'll update to remove the quoted error messages and separate each with a semicolon.

@@ -950,6 +950,17 @@ func (h *RoutesIndex) FetchHttp(kctx krt.HandlerContext, ns, n string) *ir.HttpR
return route
}

// ListHTTPRoutesInNamespace returns all HTTPRouteIRs in the given namespace.
func (h *RoutesIndex) ListHTTPRoutesInNamespace(ns string) []ir.HttpRouteIR {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this. You can use FetchHTTPRoutesBySelector and only set the Namespace in the selector

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I need to plumb a krt.HandlerContext throughout the call chain.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am unfamiliar with krt.HandlerContext, but the pattern of passing it all the way through the callstack is used everywhere (Translate entrypoints). FetchHTTPRoutesBySelector uses a namespaced index, so lookups would be O(1) vs O(n) with ListHTTPRoutesInNamespace.

Is it a big change to make?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shashankram I don't expect a large number of HTTPRoutes in the InferencePool's namespace. For the time being, I would like to stick with the current approach and implement this improvement in #11379.

@danehans danehans force-pushed the epp_plugin_status branch from 647f10b to b60c1f5 Compare June 10, 2025 03:59
@danehans danehans requested a review from shashankram June 10, 2025 15:05
@danehans danehans added this pull request to the merge queue Jun 13, 2025
Merged via the queue into kgateway-dev:main with commit 0573119 Jun 13, 2025
20 checks passed
@danehans danehans deleted the epp_plugin_status branch June 13, 2025 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
control-plane-parity kind/feature Categorizes issue or PR as related to a new feature. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add InferencePool Status
3 participants