Pull requests: vllm-project/vllm
Pull requests list
[V1][Spec Decode] Always use argmax for sampling draft tokens
v1
#16899
opened Apr 21, 2025 by
WoosukKwon
Draft
house keeping - fixing docker build warning
ci/build
#16893
opened Apr 20, 2025 by
wyattearp
Added support for HermesToolParser for models without special tokens
frontend
#16890
opened Apr 20, 2025 by
minpeter
Restore buffers when wake up from level 2 sleep (#16564)
v1
#16889
opened Apr 20, 2025 by
fingertap
[doc] install required python3-dev apt package
documentation
Improvements or additions to documentation
#16888
opened Apr 20, 2025 by
davidxia
[Bugfix] Fix Qwen2.5-Omni M-RoPE position ids generation
#16878
opened Apr 19, 2025 by
imkero
[TPU][V1] Implicitly adjust page size when there's SMEM OOM
ready
ONLY add when PR is ready to merge/full CI is needed
tpu
Related to Google TPUs
v1
#16871
opened Apr 19, 2025 by
yaochengji
Support loading transformers models with named parameters
#16868
opened Apr 18, 2025 by
wuisawesome
[WIP][Attention] FA3 decode perf improvement - single mma warp group support for head dim 128
ci/build
v1
#16864
opened Apr 18, 2025 by
LucasWilkinson
Draft
[Bugfix]: fix issue with n>1 sampling on v1 requests overriding each other
bug
Something isn't working
v1
#16863
opened Apr 18, 2025 by
jeffrey-dot-li
[Kernel] Add expert_map support to Cutlass FP8 MOE
#16861
opened Apr 18, 2025 by
varun-sundar-rabindranath
Add default local directory LoRA resolver plugin.
documentation
Improvements or additions to documentation
#16855
opened Apr 18, 2025 by
jberkhahn
[Bugfix] Fix moe weight losing all extra attrs after `process_weights_after_loading`.
#16854
opened Apr 18, 2025 by
charlifu
[Model][Frontend] Adding timeseries modality support and Qwen2.5-ChatTS model support
frontend
multi-modality
Related to multi-modality (#4194)
#16852
opened Apr 18, 2025 by
chemeris
[Kernel] some optimizations for dense marlin and moe marlin
ci/build
#16850
opened Apr 18, 2025 by
jinzhen-lin