Releases · runpod-workers/worker-vllm
v2.11.1
v2.11.0
v2.10.0
- feat: CSE-853 vLLM template params
- fix(config): update allowed CUDA versions in hub and tests config
- fix: remove space from gpuIds (see the sketch after this release's notes)
- feat: bump transformers to allow Qwen3-VL
New Contributors
- @eugene-runpod made their first contribution in #230
- @wwydmanski made their first contribution in #225
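
The gpuIds fix in v2.10.0 is a small input-normalization change: a stray space in a comma-separated GPU ID list can break exact-match lookups downstream. A minimal sketch of that kind of cleanup, assuming gpuIds arrives as a comma-separated string; the helper name and example IDs are hypothetical, not taken from the worker's source:

```python
def normalize_gpu_ids(raw: str) -> str:
    """Strip whitespace from a comma-separated GPU ID list.

    Hypothetical helper illustrating the v2.10.0 fix:
    "AMPERE_24, ADA_24" -> "AMPERE_24,ADA_24".
    """
    return ",".join(part.strip() for part in raw.split(",") if part.strip())


assert normalize_gpu_ids("AMPERE_24, ADA_24") == "AMPERE_24,ADA_24"
```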
v2.9.6
- fix: also allow CUDA 12.8 & 12.9
v2.9.5
- chore: update vLLM to 0.11.0
- fix: max concurrency = 30 instead of 300 (see the sketch below)
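
The v2.9.5 concurrency fix caps how many requests one worker handles at a time. A minimal sketch of how a RunPod serverless worker can enforce such a cap through the runpod Python SDK's concurrency_modifier hook; the MAX_CONCURRENCY variable name and the stand-in handler are assumptions, not the worker's actual code:

```python
import os

import runpod


async def handler(job):
    # Stand-in handler; the real worker runs vLLM inference here.
    return {"echo": job["input"]}


def concurrency_modifier(current_concurrency: int) -> int:
    # Cap in-flight jobs per worker; 30 mirrors the v2.9.5 fix.
    # (MAX_CONCURRENCY as the env var name is an assumption.)
    return int(os.environ.get("MAX_CONCURRENCY", "30"))


runpod.serverless.start({
    "handler": handler,
    "concurrency_modifier": concurrency_modifier,
})
```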
v2.9.4
- removed `HF_TOKEN` again
v2.9.3
- fix: added back the `HF_TOKEN`
v2.9.2
- docs: add reasoning parser
- fix: remove "access token" as this is handled by the platform
v2.9.1
v2.9.0
- feat: prepare worker-vllm for the hub
- fix: allow "None" as value & parse the value of RAW_OPENAI_OUTPUT correctly