Releases: runpod-workers/worker-vllm

v2.11.1

24 Nov 15:42
6f2381a

  • chore(deps): update runpod to latest version

v2.11.0

17 Nov 18:38
3851d53

  • add ENABLE_EXPERT_PARALLEL engine arg for MoE models
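The bullet above adds an `ENABLE_EXPERT_PARALLEL` engine argument for Mixture-of-Experts models. A minimal sketch of how such an environment variable might be mapped onto a vLLM engine argument; the target argument name `enable_expert_parallel` and the helper `build_engine_args` are assumptions for illustration, not taken from this worker's source:

```python
# Sketch: map the ENABLE_EXPERT_PARALLEL env var onto an engine-args dict.
# The argument name "enable_expert_parallel" is an assumption, not from
# this repository's code.
def build_engine_args(env: dict) -> dict:
    args = {}
    # Only set the flag when explicitly enabled, so the engine default applies otherwise.
    if env.get("ENABLE_EXPERT_PARALLEL", "").strip().lower() in ("1", "true", "yes"):
        args["enable_expert_parallel"] = True
    return args
```

Expert parallelism shards the MoE expert layers across GPUs, so a flag like this would typically only matter for multi-GPU MoE deployments.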

v2.10.0

14 Nov 16:25
c896438

  • feat: cse 853 vllm template params
  • fix(config): update allowed cuda versions in hub and tests config
  • fix: remove space from gpuIds
  • feat: bump transformers to allow Qwen3-VL

v2.9.6

24 Oct 16:49
6337a66

  • fix: also allow CUDA 12.8 & 12.9

v2.9.5

22 Oct 20:57
66e1b16

  • chore: update vllm to 0.11.0
  • fix: max concurrency = 30 instead of 300

v2.9.4

24 Sep 16:31

  • removed HF_TOKEN again

v2.9.3

23 Sep 17:25
33d88df

  • fix: added back the HF_TOKEN

v2.9.2

19 Sep 19:01

  • docs: add reasoning parser
  • fix: remove "access token" as this is handled by the platform

v2.9.1

01 Sep 14:48
a0fe1df

  • feat: better hub support & concise README for the main repo (#215)

v2.9.0

28 Aug 08:10
5f0fc69

  • feat: prepare worker-vllm for the hub
  • fix: allow "None" as value & parse the value of RAW_OPENAI_OUTPUT correctly
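The second bullet fixes parsing of `RAW_OPENAI_OUTPUT` and allows the literal string `"None"` as a value. A minimal sketch of such a parser, assuming (this helper is illustrative, not the worker's actual code) that `"None"` and the empty string should fall back to the default rather than being treated as truthy:

```python
# Sketch: parse a boolean env var, treating "None" and "" as unset.
# The helper name and default are assumptions for illustration only.
def parse_bool_env(env: dict, name: str, default: bool = True) -> bool:
    raw = env.get(name)
    if raw is None or raw.strip().lower() in ("", "none"):
        return default
    return raw.strip().lower() in ("1", "true", "yes")
```

Treating `"None"` as unset matters because naive parsing (e.g. `bool(os.environ.get(name))`) would count any non-empty string, including `"None"` or `"false"`, as true.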