Pull requests: vllm-project/vllm
[WIP][NOT READY] Refactor CLI Args for a better modular integration
frontend
needs-rebase
#20206
opened Jun 28, 2025 by
kouroshHakha
Draft
4 tasks
[BUGFIX][DEEPSEEK][MODEL_LOAD] fix w13, w2 weight not initialized assert
ready
ONLY add when PR is ready to merge/full CI is needed
#20202
opened Jun 27, 2025 by
xuechendi
1 of 4 tasks
Validate @config in pre-commit instead of dynamically
#20200
opened Jun 27, 2025 by
lionelvillard
Draft
4 tasks
[Do not merge] Add out of place layernorm
performance
Performance-related issues
#20197
opened Jun 27, 2025 by
charlifu
[CI][Intel Gaudi][vllm-Plugin]Add CI for hpu-plugin-v1-test
ci/build
documentation
Improvements or additions to documentation
#20196
opened Jun 27, 2025 by
xuechendi
3 tasks
[Refactor] Create a function util and cache the results for has_deepgemm, has_deepep, has_pplx
#20187
opened Jun 27, 2025 by
yewentao256
[Frontend] Generalize v1/audio/transcriptions endpoint
frontend
#20179
opened Jun 27, 2025 by
NickLucche
[UT][intel GPU] use current_platform instead of device hardcode in v1 tests
rocm
Related to AMD ROCm
v1
#20169
opened Jun 27, 2025 by
Liangliang-Ma
[Bugfix] Fix Maverick correctness by filling zero to cache space in cutlass_moe
#20167
opened Jun 27, 2025 by
minosfuture
[Bugfix] Fix topk_ids indices_type for CUTLASS w8a8 FP8 MoE
#20166
opened Jun 27, 2025 by
minosfuture
[Feature]: Implement check_health for V1
v1
#20164
opened Jun 27, 2025 by
limbaniharsh
1 of 3 tasks
[Feature] Enable triton scaled mm for NVIDIA GPUs with ahead-of-time autotuning
performance
Performance-related issues
#20163
opened Jun 27, 2025 by
gau-nernst
Draft
3 of 4 tasks
[Tests] Update online DP tests to verify that requests are balanced
v1
#20157
opened Jun 27, 2025 by
njhill
[CLI] Improve CLI arg parsing for -O/--compilation-config
#20156
opened Jun 26, 2025 by
ProExpertProg
Draft
4 tasks done
[Feature] Add async tensor parallelism for scaled mm
#20155
opened Jun 26, 2025 by
cascade812
Add pynccl all-gatherv and reducescatterv
#20154
opened Jun 26, 2025 by
trevor-m
3 of 4 tasks
[CI] Temporarily Remove DP test for Distributed Tests (4 GPUs)
ci/build
#20153
opened Jun 26, 2025 by
yewentao256