Initial InferencePool Status for Inference Extension EPP Plugin #11230
Note that I implemented InferencePool status management in the plugin instead of the controllers package because I plan to remove the inference extension EPP deployer.
internal/kgateway/extensions2/plugins/inferenceextension/endpointpicker/status.go
	aggErrs.Write([]byte(prologue))
	for _, err := range errs {
		aggErrs.Write([]byte(` "`))
		aggErrs.Write([]byte(err.Error()))
		aggErrs.Write([]byte(`"`))
	}
I prefer errors delimited by a semicolon or \n (errors.Join). Why quote this?
I didn't use errors.Join since it inserts a newline between errors, which does not render well in status messages. I'll update this to drop the quotes around each error message and separate the messages with semicolons.
@@ -950,6 +950,17 @@ func (h *RoutesIndex) FetchHttp(kctx krt.HandlerContext, ns, n string) *ir.HttpR
	return route
}

// ListHTTPRoutesInNamespace returns all HTTPRouteIRs in the given namespace.
func (h *RoutesIndex) ListHTTPRoutesInNamespace(ns string) []ir.HttpRouteIR {
You don't need this. You can use FetchHTTPRoutesBySelector and only set the Namespace in the selector.
Then I need to plumb a krt.HandlerContext throughout the call chain.
I am unfamiliar with krt.HandlerContext, but the pattern of passing it all the way through the callstack is used everywhere (Translate entrypoints). FetchHTTPRoutesBySelector uses a namespaced index, so lookups would be O(1) vs O(n) with ListHTTPRoutesInNamespace.
Is it a big change to make?
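To make the O(1) vs O(n) point concrete, here is a minimal sketch of the two lookup strategies. The `route` and `routeStore` types are hypothetical stand-ins, not the krt collections or the `RoutesIndex` type from the codebase; the sketch only contrasts scanning every route with reading a precomputed per-namespace bucket.

```go
package main

import "fmt"

// route is a hypothetical stand-in for a route IR object.
type route struct {
	Namespace string
	Name      string
}

// routeStore is a hypothetical container holding both a flat list and a
// namespace index; it does not mirror the real RoutesIndex.
type routeStore struct {
	all  []route
	byNS map[string][]route
}

func newRouteStore(routes []route) *routeStore {
	s := &routeStore{all: routes, byNS: make(map[string][]route)}
	for _, r := range routes {
		s.byNS[r.Namespace] = append(s.byNS[r.Namespace], r)
	}
	return s
}

// listByScan visits every stored route, so its cost grows with the total
// number of routes across all namespaces.
func (s *routeStore) listByScan(ns string) []route {
	var out []route
	for _, r := range s.all {
		if r.Namespace == ns {
			out = append(out, r)
		}
	}
	return out
}

// listByIndex locates the namespace bucket with a single map lookup.
func (s *routeStore) listByIndex(ns string) []route {
	return s.byNS[ns]
}

func main() {
	s := newRouteStore([]route{
		{Namespace: "inference", Name: "llm-route"},
		{Namespace: "default", Name: "web"},
		{Namespace: "inference", Name: "embeddings-route"},
	})
	fmt.Println(len(s.listByScan("inference")), len(s.listByIndex("inference")))
}
```

With only a handful of routes per namespace the difference is small; the index matters mainly when the total route count is large.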
@shashankram I don't expect a large number of HTTPRoutes in the InferencePool's namespace. For the time being, I would like to stick with the current approach and implement this improvement in #11379.
Signed-off-by: Daneyon Hansen <[email protected]>
Description
Adds support for InferencePool status to the inference extension endpointpicker plugin.
Change Type
/kind new_feature
Changelog
Fixes #11306