Pull requests: vllm-project/vllm
[Fix] Remove unused opentelemetry-semantic-conventions-ai dependency
ci/build
documentation
Improvements or additions to documentation
#19313
opened Jun 7, 2025 by
conroy-cheers
[Bugfix][V1] Fix memory profile to allow multiple servers to start on the same card
v1
#19312
opened Jun 7, 2025 by
yeqcharlotte
[Nit][Benchmark]Fix example in benchmark_serving_structured_output.py
ready
ONLY add when PR is ready to merge/full CI is needed
structured-output
#19311
opened Jun 7, 2025 by
draftbk
Update compatible packaging version
ci/build
ready
#19309
opened Jun 7, 2025 by
pramenku
[doc] improve ci doc
ci/build
documentation
#19307
opened Jun 7, 2025 by
reidliu41
3 tasks
Use xla flag to improve the quantized model performance
ready
tpu
Related to Google TPUs
v1
#19303
opened Jun 6, 2025 by
vanbasten23
3 tasks done
[Misc] Change tests/compile to use VLLM_V1 by default
ready
#19302
opened Jun 6, 2025 by
zou3519
Add optional token-level progress bar to LLM.beam_search using tqdm
frontend
#19301
opened Jun 6, 2025 by
NekoMimiUnagi
3 tasks done
[Bugfix] Re-enable use_cudagraph in vLLM v1
ready
#19299
opened Jun 6, 2025 by
zou3519
[Misc] Fixes and Optimizations for DeepEP + DeepGEMM combination.
ready
v1
#19298
opened Jun 6, 2025 by
varun-sundar-rabindranath
[Quantization] Bump compressed-tensors version
ci/build
ready
#19295
opened Jun 6, 2025 by
kylesayrs
[V1] Add API docs for EncoderCacheManager
ready
v1
#19294
opened Jun 6, 2025 by
russellb
[TPU] support fp8 kv cache quantization
tpu
v1
#19292
opened Jun 6, 2025 by
yaochengji
[Metrics] Compute and log the serving FLOPs
documentation
#19290
opened Jun 6, 2025 by
sysradium
[Misc] Add documentation update reminder to PR template
ci/build
#19289
opened Jun 6, 2025 by
Isotr0py
1 of 3 tasks
[Frontend] Remove unreachable code from llm.py
frontend
#19288
opened Jun 6, 2025 by
KsuParkhamchuk
[CI/Build] Improve Llama GGUF test robustness
ready
#19287
opened Jun 6, 2025 by
Isotr0py
1 of 3 tasks
[Core] Update error message for Whisper + num-scheduler-steps > 1
ready
#19286
opened Jun 6, 2025 by
russellb
[Bugfix]: Fix TypeError: 'float' object cannot be interpreted as an integer
ready
v1
#19283
opened Jun 6, 2025 by
chaunceyjiang
Convert kv_transfer_config from dict to KVTransferConfig to fix #19259
frontend
#19262
opened Jun 6, 2025 by
maobaolong