Pull requests: vllm-project/vllm

Pull requests list

[Fix] Remove unused opentelemetry-semantic-conventions-ai dependency
Labels: ci/build, documentation
#19313 opened Jun 7, 2025 by conroy-cheers
[Nit][Benchmark]Fix example in benchmark_serving_structured_output.py
Labels: ready, structured-output
#19311 opened Jun 7, 2025 by draftbk
[Misc]: refactor: ParallelConfig init func
#19310 opened Jun 7, 2025 by googs1025
3 tasks
Update compatible packaging version
Labels: ci/build, ready
#19309 opened Jun 7, 2025 by pramenku
[doc] improve ci doc
Labels: ci/build, documentation
#19307 opened Jun 7, 2025 by reidliu41
3 tasks
Use xla flag to improve the quantized model performance
Labels: ready, tpu, v1
#19303 opened Jun 6, 2025 by vanbasten23
3 tasks done
[Misc] Change tests/compile to use VLLM_V1 by default
Labels: ready
#19302 opened Jun 6, 2025 by zou3519
[Bugfix] Re-enable use_cudagraph in vLLM v1
Labels: ready
#19299 opened Jun 6, 2025 by zou3519
[Misc] Fixes and Optimizations for DeepEP + DeepGEMM combination.
Labels: ready, v1
#19298 opened Jun 6, 2025 by varun-sundar-rabindranath
[CI] Update FlashInfer to 0.2.6
Labels: ci/build
#19297 opened Jun 6, 2025 by mgoin
[Quantization] Bump compressed-tensors version
Labels: ci/build, ready
#19295 opened Jun 6, 2025 by kylesayrs
[V1] Add API docs for EncoderCacheManager
Labels: ready, v1
#19294 opened Jun 6, 2025 by russellb
[TPU] support fp8 kv cache quantization
Labels: tpu, v1
#19292 opened Jun 6, 2025 by yaochengji
[Metrics] Compute and log the serving FLOPs
Labels: documentation
#19290 opened Jun 6, 2025 by sysradium
[Misc] Add documentation update reminder to PR template
Labels: ci/build
#19289 opened Jun 6, 2025 by Isotr0py
1 of 3 tasks
[CI/Build] Improve Llama GGUF test robustness
Labels: ready
#19287 opened Jun 6, 2025 by Isotr0py
1 of 3 tasks
[Core] Update error message for Whisper + num-scheduler-steps > 1
Labels: ready
#19286 opened Jun 6, 2025 by russellb
[Bugfix]: Fix TypeError: 'float' object cannot be interpreted as an integer
Labels: ready, v1
#19283 opened Jun 6, 2025 by chaunceyjiang
[V1][Kernel] Flashinfer HND KV cache layout
Labels: v1
#19280 opened Jun 6, 2025 by NickLucche
[Frontend] optimize beam_search code
#19267 opened Jun 6, 2025 by zhanggzh
Fix TorchAOConfig skip layers
#19265 opened Jun 6, 2025 by mobicham