Skip to content

Inference Extension: Adds Initial e2e Tests #11344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 11, 2025

Conversation

danehans
Copy link
Contributor

@danehans danehans commented Jun 5, 2025

Adds initial Inference Extension e2e tests.

Description

In support of #10411

Change Type

I'm not sure if new end-to-end (E2E) tests are considered a new feature, so feel free to update as needed.

/kind new_feature

Changelog

Adds initial InferencePool e2e tests

Additional Notes

First step to make kgtw an inference extension conformant implementation (xref).

@github-actions github-actions bot added kind/feature Categorizes issue or PR as related to a new feature. release-note labels Jun 5, 2025
@danehans danehans requested review from sam-heilbron and lgadban June 5, 2025 16:19
@lgadban lgadban requested a review from npolshakova June 5, 2025 19:04
kind: Service
metadata:
name: vllm-llama3-8b-instruct-epp
namespace: inf-ext-e2e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we define the namespace, instead of inheriting it from the client that applies the resource? For example https://github.com/kgateway-dev/kgateway/blob/main/test/kubernetes/e2e/features/backends/inputs/backend.yaml#L5 we don't define the ns. I think this allows us to run a feature against multipple different installations

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The EPP Deployment (args) and ClusterRoleBinding (subjects namespace) manifests are hard-coded for this namespace. If we allow the client to set the namespace, the EPP will not reach a running state for any other namespace. Kustomize, Go templating, etc, could be used, but I was trying to follow the approach taken by other e2e tests. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair! Is it possible to make the EPP deployment default to the namespace it is installed in? I'm good with what you have defined currently however

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sam-heilbron are you referring to removing namespace: inf-ext-e2e from the manifests and supplying the ns in the kubectl apply commands? If so, this is possible, but IMHO it can obfuscate the ns limitations of the deployment that I describe above. To truly expose ns config to the user, we need to template the manifests so the ns name can be plumbed into the EPP Deployment (args) and ClusterRoleBinding (subjects namespace) manifests.

spec:
containers:
- name: vllm-sim
image: ghcr.io/llm-d/llm-d-inference-sim:v0.1.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is going to be our strategy around updating this? I see that for other tags we rely on latest, though this one seems pinned. Would it be useful to add a comment clarifying if it's intentionally pinnned here, or if we want to keep this on as much the latest as possible?

Copy link
Contributor Author

@danehans danehans Jun 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vllm-sim currently does not produce a latest tagged image until llm-d/llm-d-inference-sim#50 is merged. I'll add a TODO with a link to the PR so we update the tag after the upstream PR gets merged.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I really like the fcomment idea, since it allows any developer to step in and update it. I noticed that the PR you refereced merges, so can we update this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I created a new upstream issue and updated the ref.

@danehans danehans force-pushed the epp_e2e branch 2 times, most recently from e5a9bcb to b08d4fa Compare June 6, 2025 18:32
@danehans danehans requested a review from sam-heilbron June 6, 2025 21:35
Copy link
Contributor

@sam-heilbron sam-heilbron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! A few small comments

kind: Service
metadata:
name: vllm-llama3-8b-instruct-epp
namespace: inf-ext-e2e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fair! Is it possible to make the EPP deployment default to the namespace it is installed in? I'm good with what you have defined currently however

spec:
containers:
- name: vllm-sim
image: ghcr.io/llm-d/llm-d-inference-sim:v0.1.1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I really like the fcomment idea, since it allows any developer to step in and update it. I noticed that the PR you refereced merges, so can we update this?

Signed-off-by: Daneyon Hansen <[email protected]>
@danehans danehans added this pull request to the merge queue Jun 11, 2025
Merged via the queue into kgateway-dev:main with commit baed049 Jun 11, 2025
20 checks passed
@danehans danehans deleted the epp_e2e branch June 11, 2025 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants