Pull requests: vllm-project/vllm
[V1][Core][1/n] Logging and Metrics
ready
ONLY add when PR is ready to merge/full CI is needed
#11962
opened Jan 11, 2025 by
robertgshaw2-neuralmagic
[V1] Move more control of kv cache initialization from model_executor to EngineCore
#11960
opened Jan 11, 2025 by
heheda12345
[Misc] Add helpers to get pipeline rank & world size
#11946
opened Jan 10, 2025 by
ethnzhng
[Ignore] Test
documentation
Improvements or additions to documentation
#11944
opened Jan 10, 2025 by
mgoin
Organise installation documentation into categories and tabs
documentation
Improvements or additions to documentation
#11935
opened Jan 10, 2025 by
hmellor
[Doc] links Tensorizer example
documentation
Improvements or additions to documentation
#11918
opened Jan 10, 2025 by
guspan-tanadi
[Doc] Correct the spelling of GitHub
documentation
Improvements or additions to documentation
#11915
opened Jan 10, 2025 by
Yaminyam
[V1] APC + prompt logprobs unsupported (PR 2/N for v1 sample and prompt logprobs support)
#11910
opened Jan 10, 2025 by
afeldman-nm
[FP8][Kernel] Dynamic kv cache scaling factors computation
documentation
Improvements or additions to documentation
#11906
opened Jan 9, 2025 by
gshtras
[Bugfix] support to run partially 2:4 model with CompressedTensors24 scheme
#11889
opened Jan 9, 2025 by
jiangjiadi
Add `device` as parameter to TP and rotary_embedding functions
#11888
opened Jan 9, 2025 by
chunyuan-w
Draft
[CI] Add auto update workflow for Dockerfile graph
ci/build
#11879
opened Jan 9, 2025 by
WineChord