-
Notifications
You must be signed in to change notification settings - Fork 520
Inference Extension: Adds Initial e2e Tests #11344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
test/kubernetes/e2e/features/inferenceextension/testdata/client.yaml
Outdated
Show resolved
Hide resolved
kind: Service | ||
metadata: | ||
name: vllm-llama3-8b-instruct-epp | ||
namespace: inf-ext-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason we define the namespace, instead of inheriting it from the client that applies the resource? For example https://github.com/kgateway-dev/kgateway/blob/main/test/kubernetes/e2e/features/backends/inputs/backend.yaml#L5 we don't define the ns. I think this allows us to run a feature against multipple different installations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The EPP Deployment (args) and ClusterRoleBinding (subjects namespace) manifests are hard-coded for this namespace. If we allow the client to set the namespace, the EPP will not reach a running state for any other namespace. Kustomize, Go templating, etc, could be used, but I was trying to follow the approach taken by other e2e tests. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair! Is it possible to make the EPP deployment default to the namespace it is installed in? I'm good with what you have defined currently however
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sam-heilbron are you referring to removing namespace: inf-ext-e2e
from the manifests and supplying the ns in the kubectl apply commands? If so, this is possible, but IMHO it can obfuscate the ns limitations of the deployment that I describe above. To truly expose ns config to the user, we need to template the manifests so the ns name can be plumbed into the EPP Deployment (args) and ClusterRoleBinding (subjects namespace) manifests.
test/kubernetes/e2e/features/inferenceextension/testdata/epp.yaml
Outdated
Show resolved
Hide resolved
spec: | ||
containers: | ||
- name: vllm-sim | ||
image: ghcr.io/llm-d/llm-d-inference-sim:v0.1.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is going to be our strategy around updating this? I see that for other tags we rely on latest, though this one seems pinned. Would it be useful to add a comment clarifying if it's intentionally pinnned here, or if we want to keep this on as much the latest as possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vllm-sim currently does not produce a latest
tagged image until llm-d/llm-d-inference-sim#50 is merged. I'll add a TODO with a link to the PR so we update the tag after the upstream PR gets merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I really like the fcomment idea, since it allows any developer to step in and update it. I noticed that the PR you refereced merges, so can we update this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a new upstream issue and updated the ref.
e5a9bcb
to
b08d4fa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! A few small comments
kind: Service | ||
metadata: | ||
name: vllm-llama3-8b-instruct-epp | ||
namespace: inf-ext-e2e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair! Is it possible to make the EPP deployment default to the namespace it is installed in? I'm good with what you have defined currently however
spec: | ||
containers: | ||
- name: vllm-sim | ||
image: ghcr.io/llm-d/llm-d-inference-sim:v0.1.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I really like the fcomment idea, since it allows any developer to step in and update it. I noticed that the PR you refereced merges, so can we update this?
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Adds initial Inference Extension e2e tests.
Description
In support of #10411
Change Type
I'm not sure if new end-to-end (E2E) tests are considered a new feature, so feel free to update as needed.
Changelog
Additional Notes
First step to make kgtw an inference extension conformant implementation (xref).