[CB] use used block ids for dummy batch size 2 #259

yannicks1 · 2025-06-24T08:16:01Z

[CB] use used block ids for dummy batch size 2

solves #258

Signed-off-by: Yannick Schnider <[email protected]>

github-actions · 2025-06-24T08:19:54Z

👋 Hi! Thank you for contributing to vLLM support on Spyre.
Just a reminder: Make sure that your code passes all the linting checks, otherwise your PR won't be able to be merged. To do so, first install the linting requirements, then run format.sh and commit the changes. This can be done with uv directly:

uv sync --frozen --group lint --active --inexact

Or this can be done with pip:

uv pip compile --group lint > requirements-lint.txt
pip install -r requirements-lint.txt
bash format.sh

Now you are good to go 🚀

yannicks1 · 2025-06-24T18:28:11Z

Status: To be tested on AIU Spyre

yannicks1 · 2025-06-26T08:40:05Z

bot:test
TEST_FILE=tests/e2e/test_spyre_cb.py MARKERS="spyre" VLLM_SPYRE_TEST_MODEL_LIST="tiny-granite-3.2-8b"

yannicks1 · 2025-06-26T09:10:23Z

bot:test
TEST_FILE=tests/e2e/test_spyre_cb.py MARKERS="spyre" VLLM_SPYRE_TEST_MODEL_LIST='tiny-granite-3.2-8b'

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 · 2025-06-27T22:21:32Z

bot:test
TEST_FILE=tests/e2e/test_spyre_cb.py MARKERS="spyre"

yannicks1 · 2025-06-30T12:48:30Z

bot:test
TEST_FILE=tests/e2e/test_spyre_cb.py MARKERS="spyre"

yannicks1 · 2025-06-30T16:18:36Z

Good news: I spun up a pod and tested this, and it actually works 🎉
I ran test_spyre_cb.py and cb_spyre_inference.py and validated with aiu-smi that they actually run on Spyre.

ready to review!

prashantgupta24 · 2025-06-30T16:32:37Z

vllm_spyre/v1/worker/spyre_model_runner.py

Wondering if we can add a test for this somehow 🤔

Worth adding a test where we only send 1 request and assert a bunch of stuff? 🤷

@gmarinho2 had something in his PR here - https://github.com/vllm-project/vllm-spyre/pull/252/files, but he was assuming we used dummy_req_ids2blocks, which we don't anymore.

What about an assert on the shape dimension that corresponds to the batch size for the decode path? It might be overkill to write a test for this. Plus, an assert is safer...

Adding only 1 request and asserting scheduler's tkv could be a good start?

I think that should be sufficient. I'm reluctant to add even more tests, as they already take a long time and this can easily be asserted in the code.

Agree, I think the base tests already cover this case.

Ah sorry, I worded that incorrectly. Yeah, we already have test cases that cover the scenario; what I meant was that we could add asserts in the tests themselves to cover this PR's specific scenario where len(cached_requests) == 1. On the other hand, since we only have access to the model_runner object during testing, the only thing we could have asserted was model_runner.model.indices.size, which is already covered by the assert in the code, so I guess it's okay 🤷
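For readers following the thread, here is a rough sketch of the idea being discussed: pad a single-request decode batch up to size 2 by duplicating the existing row (so the dummy entry reuses block ids that are already in use), then assert on the batch dimension in the code itself. This is not the code from this PR. The function name `pad_decode_batch`, the `block_table` argument, and the use of plain Python lists in place of tensors are all assumptions made for illustration.

```python
def pad_decode_batch(block_table, min_batch_size=2):
    """Pad a decode batch to min_batch_size by repeating its first row.

    Hypothetical helper: block_table holds one row of block ids per request.
    Plain lists stand in for the tensors used in the real model runner.
    """
    if not block_table:
        raise ValueError("decode batch must contain at least one request")

    # Copy the rows so the caller's data is not modified.
    padded = [list(row) for row in block_table]

    # Duplicate the first request's row until the minimum batch size is met.
    # The dummy rows reuse block ids that are already in use.
    while len(padded) < min_batch_size:
        padded.append(list(padded[0]))

    # In-code check on the batch dimension, in the spirit of the assert
    # proposed in the review thread.
    assert len(padded) >= min_batch_size, "decode batch size must be >= 2"
    return padded
```

With a single request, `pad_decode_batch([[3, 7]])` yields two identical rows; a batch that already has two or more requests is returned unchanged.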

Signed-off-by: Yannick Schnider <[email protected]>

wallashss

LGTM

yannicks1 · 2025-07-01T12:15:44Z

bot:test
TEST_FILE=tests/e2e/test_spyre_cb.py MARKERS="spyre"

yannicks1 · 2025-07-01T14:35:24Z

ran the only failing test manually on the card and it succeeded. Thus merging this PR!

yannicks1 added 2 commits June 24, 2025 08:07

remove dummy_req_ids2blocks, instead duplicate tensors

321405f

Signed-off-by: Yannick Schnider <[email protected]>

fix format

1f15756

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 self-assigned this Jun 24, 2025

yannicks1 mentioned this pull request Jun 24, 2025

Add testing for dummy request creation #252

Closed

Merge branch 'main' into ysc-remove-dummy-sequence

926d5dd

yannicks1 added 2 commits June 27, 2025 22:03

Merge branch 'main' into ysc-remove-dummy-sequence

63eafcd

Signed-off-by: Yannick Schnider <[email protected]>

solve merge conflict

9291ef5

Signed-off-by: Yannick Schnider <[email protected]>

yannicks1 marked this pull request as ready for review June 30, 2025 16:18

yannicks1 requested review from tdoublep, nikolaospapandreou and sducouedic as code owners June 30, 2025 16:18

yannicks1 requested review from joerunde, wallashss and prashantgupta24 June 30, 2025 16:23

prashantgupta24 reviewed Jun 30, 2025


add assertion for batch size >= 2 for decodes

8b8bbfb

Signed-off-by: Yannick Schnider <[email protected]>

wallashss approved these changes Jul 1, 2025


yannicks1 mentioned this pull request Jul 1, 2025

[CB] Test minimal batch size of 2 in decode mode #204

Closed

yannicks1 merged commit b324b42 into main Jul 1, 2025
16 of 18 checks passed

yannicks1 deleted the ysc-remove-dummy-sequence branch July 1, 2025 14:35
