Description
The default run of serve/tests/test_engine.py first adds 4 requests and then starts the engine.
I expected this to form a single prefill batch containing all 4 requests.
However, the behavior is non-deterministic: sometimes it forms two prefill batches of 2 requests each, and sometimes two prefill batches of 1 and 3 requests.
The debug log does not provide any useful information about why the batch is split.
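
To make the expectation concrete, here is a toy model of what I assumed happens. It is not the engine's actual code or API: `form_prefill_batch` and `max_batch_size` are hypothetical names used only for illustration, and the real scheduler may apply limits that this sketch ignores.

```python
from collections import deque


def form_prefill_batch(pending, max_batch_size):
    """Toy scheduler step: move waiting requests into one prefill batch.

    Hypothetical helper, not part of the real engine.
    """
    batch = []
    while pending and len(batch) < max_batch_size:
        batch.append(pending.popleft())
    return batch


# All four requests are queued before the (toy) engine takes its first step.
pending = deque(["req0", "req1", "req2", "req3"])

# Assumed: the batch size limit is larger than the number of queued requests.
first_batch = form_prefill_batch(pending, max_batch_size=8)

print(len(first_batch))  # 4
```

Under this model the first prefill step always picks up all 4 requests, which is why the 2/2 and 1/3 splits are surprising.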