Pull requests: vllm-project/vllm
[Frontend] Avoid startup error log for models without chat template
ready
ONLY add when PR is ready to merge/full CI is needed
#37040
opened Mar 14, 2026 by
DarkLight1337
5 tasks
fix: sync delta_token_ids with delta_text during stop-sequence buffering
ci/build
cpu
Related to CPU backends
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
frontend
gpt-oss
Related to GPT-OSS models
kv-connector
llama
Related to Llama models
multi-modality
Related to multi-modality (#4194)
needs-rebase
nvidia
performance
Performance-related issues
qwen
Related to Qwen models
rocm
Related to AMD ROCm
speculative-decoding
structured-output
tool-calling
tpu
Related to Google TPUs
v1
#37039
opened Mar 14, 2026 by
gambletan
3 tasks
fix: resolve kv_cache_dtype='auto' from checkpoint kv_cache_quant_algo
#37034
opened Mar 14, 2026 by
alvinttang
3 tasks
fix: correct FP8 error message to reference compute capability
#37033
opened Mar 14, 2026 by
alvinttang
3 tasks
[Hardware] Replace memory related torch.cuda APIs
nvidia
performance
Performance-related issues
v1
#37031
opened Mar 14, 2026 by
jikunshang
5 tasks
[Hardware][XPU][ROCm] Align memory usage with cuda on xpu/rocm
nvidia
rocm
Related to AMD ROCm
#37029
opened Mar 14, 2026 by
jikunshang
5 tasks
[WIP][Model Runner V2] Support Streaming Inputs
v1
#37028
opened Mar 14, 2026 by
santiramos27
Draft
5 tasks
[CI] Fix flaky tool_use chat completion tests with deterministic seed
tool-calling
#37027
opened Mar 14, 2026 by
sfeng33
[UX]: Fix unclean shutdown from ctrl-c with AR Fusion
nvidia
#37026
opened Mar 14, 2026 by
siewcapital
Enable in-process engine core for AsyncLLM.
v1
#37021
opened Mar 13, 2026 by
wang2yn84
5 tasks
[CI][Bugfix] Fix incorrect status handling with set -e in CI shell scripts
bug
Something isn't working
ci/build
#37020
opened Mar 13, 2026 by
gkapetanakis
3 tasks done
[CI] Split V1 Others into 3 separate jobs
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#37016
opened Mar 13, 2026 by
khluu
3 tasks
[CI] Shard Multi-Modal Models (Standard) into 4 parallel jobs
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#37014
opened Mar 13, 2026 by
khluu
2 tasks
[Spec Decode] Update extract_hidden_states to use deferred kv_connector clear
kv-connector
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
v1
#37013
opened Mar 13, 2026 by
fynnsu
3 of 5 tasks
fix: return HTTP 413 when request exceeds max context length
frontend
#37011
opened Mar 13, 2026 by
Chase-Xuu
[Bugfix] Fix FusedMoE weight loading with padded hidden dimensions
bug
Something isn't working
#37010
opened Mar 13, 2026 by
SandishKumarHN
3 of 4 tasks
[ROCm] issue management - request information for bug issues on ROCm
bug
Something isn't working
ci/build
rocm
Related to AMD ROCm
#37009
opened Mar 13, 2026 by
hongxiayang
5 tasks
[Core][Feature] Observation Plugin for Intercepting & Routing on Activations
documentation
Improvements or additions to documentation
needs-rebase
v1
#37002
opened Mar 13, 2026 by
DDDDarrenWB
Draft
5 tasks
Enable loading of fused expert weights in the Transformers modelling backend
ready
ONLY add when PR is ready to merge/full CI is needed
#36997
opened Mar 13, 2026 by
hmellor