Inference Extension: Adds Initial e2e Tests #11344

danehans · 2025-06-05T16:17:14Z

Adds initial Inference Extension e2e tests.

Description

In support of #10411

Change Type

I'm not sure if new end-to-end (E2E) tests are considered a new feature, so feel free to update as needed.

/kind new_feature

Changelog

Adds initial InferencePool e2e tests

Additional Notes

This is the first step toward making kgtw a conformant Inference Extension implementation (xref).

test/kubernetes/e2e/features/inferenceextension/testdata/client.yaml

sam-heilbron · 2025-06-05T22:39:55Z

test/kubernetes/e2e/features/inferenceextension/testdata/epp.yaml

+kind: Service
+metadata:
+  name: vllm-llama3-8b-instruct-epp
+  namespace: inf-ext-e2e


Is there a reason we define the namespace instead of inheriting it from the client that applies the resource? For example, in https://github.com/kgateway-dev/kgateway/blob/main/test/kubernetes/e2e/features/backends/inputs/backend.yaml#L5 we don't define the ns. I think this allows us to run a feature against multiple installations.

The EPP Deployment (args) and ClusterRoleBinding (subjects namespace) manifests are hard-coded for this namespace. If we allow the client to set the namespace, the EPP will not reach a running state in any other namespace. Kustomize, Go templating, etc. could be used, but I was trying to follow the approach taken by other e2e tests. WDYT?

That's fair! Is it possible to make the EPP deployment default to the namespace it is installed in? That said, I'm good with what you have defined currently.

@sam-heilbron are you referring to removing namespace: inf-ext-e2e from the manifests and supplying the ns in the kubectl apply commands? If so, this is possible, but IMHO it can obscure the ns limitations of the deployment that I describe above. To truly expose ns config to the user, we would need to template the manifests so the ns name can be plumbed into the EPP Deployment (args) and ClusterRoleBinding (subjects namespace) manifests.

test/kubernetes/e2e/features/inferenceextension/testdata/epp.yaml

test/kubernetes/e2e/file.go

test/kubernetes/e2e/tests/inference_extension_test.go

test/kubernetes/e2e/features/inferenceextension/types.go

sam-heilbron · 2025-06-05T22:52:19Z

test/kubernetes/e2e/features/inferenceextension/testdata/vllm.yaml

+    spec:
+      containers:
+      - name: vllm-sim
+        image: ghcr.io/llm-d/llm-d-inference-sim:v0.1.1


What is going to be our strategy around updating this? I see that for other tags we rely on latest, though this one seems pinned. Would it be useful to add a comment clarifying whether it's intentionally pinned here, or whether we want to keep this as close to latest as possible?

vllm-sim will not produce a latest-tagged image until llm-d/llm-d-inference-sim#50 is merged. I'll add a TODO with a link to the PR so we update the tag after the upstream PR gets merged.

Thanks! I really like the comment idea, since it allows any developer to step in and update it. I noticed that the PR you referenced has merged, so can we update this?

I created a new upstream issue and updated the ref.

test/kubernetes/e2e/tests/manifests/inference-extension-helm.yaml

test/kubernetes/e2e/features/inferenceextension/suite.go

sam-heilbron

LGTM! A few small comments

sam-heilbron · 2025-06-10T14:47:27Z

test/kubernetes/e2e/features/inferenceextension/testdata/epp.yaml

+kind: Service
+metadata:
+  name: vllm-llama3-8b-instruct-epp
+  namespace: inf-ext-e2e


That's fair! Is it possible to make the EPP deployment default to the namespace it is installed in? That said, I'm good with what you have defined currently.

sam-heilbron · 2025-06-10T14:48:16Z

test/kubernetes/e2e/features/inferenceextension/testdata/vllm.yaml

+    spec:
+      containers:
+      - name: vllm-sim
+        image: ghcr.io/llm-d/llm-d-inference-sim:v0.1.1


Thanks! I really like the comment idea, since it allows any developer to step in and update it. I noticed that the PR you referenced has merged, so can we update this?

test/kubernetes/e2e/features/inferenceextension/suite.go

Signed-off-by: Daneyon Hansen <[email protected]>

github-actions bot added the kind/feature (Categorizes issue or PR as related to a new feature) and release-note labels Jun 5, 2025

danehans requested review from sam-heilbron and lgadban June 5, 2025 16:19

lgadban requested a review from npolshakova June 5, 2025 19:04

danehans force-pushed the epp_e2e branch from 0d8f1b5 to ce9a16d Compare June 5, 2025 20:18

sam-heilbron reviewed Jun 5, 2025


danehans force-pushed the epp_e2e branch 2 times, most recently from e5a9bcb to b08d4fa Compare June 6, 2025 18:32

danehans requested a review from sam-heilbron June 6, 2025 21:35

sam-heilbron approved these changes Jun 10, 2025


npolshakova reviewed Jun 10, 2025


test/kubernetes/e2e/features/inferenceextension/suite.go (outdated, resolved)

npolshakova reviewed Jun 10, 2025


test/kubernetes/e2e/features/inferenceextension/suite.go (outdated, resolved)

danehans added 4 commits June 10, 2025 12:52

Inference Extension: Adds Initial e2e Tests

2ea4451

Signed-off-by: Daneyon Hansen <[email protected]>

Updates CI job for infer ext e2e test

88e231a

Signed-off-by: Daneyon Hansen <[email protected]>

Resolve Sam review comments

8740029

Signed-off-by: Daneyon Hansen <[email protected]>

Readds kgtw image registry due to 11359

a8f6fb7

Signed-off-by: Daneyon Hansen <[email protected]>

danehans force-pushed the epp_e2e branch from b08d4fa to a35df66 Compare June 10, 2025 22:52

danehans requested review from npolshakova and sam-heilbron June 10, 2025 22:56

Resolves review feedback

85db682

Signed-off-by: Daneyon Hansen <[email protected]>

danehans force-pushed the epp_e2e branch from a35df66 to 85db682 Compare June 10, 2025 23:28

sam-heilbron approved these changes Jun 11, 2025


danehans added this pull request to the merge queue Jun 11, 2025

Merged via the queue into kgateway-dev:main with commit baed049 Jun 11, 2025
20 checks passed

danehans deleted the epp_e2e branch June 11, 2025 18:57

danehans mentioned this pull request Jun 17, 2025

Inference Extension: Use single shared extproc filter #11414

Open
