Releases: vllm-project/semantic-router

v0.2.0 - Athena

10 Mar 09:09
7fc8070

v0 2

Installation

curl -fsSL https://vllm-semantic-router.com/install.sh | bash

What's Changed

See the full changelog below.

  • [Bugfix] Fix getRecommendedModel to return real model names from config by @henschwartz in #933
  • [CI] test(response-api): add E2E translation tests for Response API by @tao12345666333 in #881
  • [Doc] Fix installation typo by @shernshiou in #965
  • [CI] ci(e2e): add Response API K8s integration test workflow by @tao12345666333 in #882
  • fix: initialize batched embedding model for qwen3 semantic cache by @liavweiss in #954
  • fix(dashboard): resolve config update permission denied error by @henschwartz in #962
  • docs: fix index display error by @yuluo-yx in #959
  • [CI/Build][Dashboard] Fix OpenShift config map paths and dashboard build by @nerdalert in #951
  • test(dashboard): add file persistence tests for config updates by @henschwartz in #963
  • [Doc] website versions multilingual for release v0.1 by @samzong in #925
  • feat(vllm-sr): add dashboard support to Python CLI by @yehuditkerido in #960
  • fix(dashboard): return 404 instead of 500 when tools_db.json not found by @henschwartz in #972
  • [Bugfix] Fix PII mapping path to match deployed model directory by @nerdalert in #970
  • feat: re-enable EmbeddingGemma-300m support by @liavweiss in #816
  • feat [llm-katan] Support running llm-katan on XPU Platform by @chaojun-zhang in #969
  • [Misc] 🔧 chore(pre-commit): show file count instead of list by @samzong in #968
  • [Misc] 🔧 chore(make): add deps.mk for func-e and unified external binary tool management by @samzong in #886
  • Fix(CI): Resolve The E2E Testing by @Xunzhuo in #1013
  • feat(dashboard): add Python CLI config format support by @yehuditkerido in #1019
  • feat(cli): add Podman container runtime support by @Xunzhuo in #982
  • [Misc]: remove legacy Kustomize manifests in favor of Helm chart by @abdallahsamabd in #978
  • [Feat] implement Signal Parallelism in EvaluateAllSignals by @GOavi101 in #1023
  • [Feat][CI/Build] Add OpenShift simulator backend for vSR by @nerdalert in #981
  • [Feat] Add multi-model routing demo by @anushapant in #1025
  • [Feat]: Enable CLI-based Helm deployment with smart defaults by @abdallahsamabd in #957
  • Optimize: Speed-Up Signal Extraction by @Xunzhuo in #1027
  • fix: set file descriptor limits to 65536 to resolve Envoy initialization failure by @liavweiss in #1026
  • [Bugfix] Fix keyword matching inconsistency in e2e tests by @srini-abhiram in #828
  • [Misc] fix: show unique model count in logs instead of alias count by @samzong in #1029
  • [Feat] align Classification API server with signal-driven architecture by @GOavi101 in #1032
  • Refactor: Layout and UI for Dashboard by @Xunzhuo in #1034
  • feat(dashboard): refactor sidebar to align with Python CLI config by @yehuditkerido in #1035
  • Dashboard: Re-Organize the Layout and Functionality by @Xunzhuo in #1036
  • Dashboard: Refactor Data Management by @Xunzhuo in #1038
  • fix(dashboard): read Router Config settings from router-defaults.yaml by @yehuditkerido in #1037
  • Dashboard: Refactor Reasoning Management by @Xunzhuo in #1041
  • Dashboard: Refactor Landing Page by @Xunzhuo in #1043
  • Dashboard: Update Landing Page UX by @Xunzhuo in #1044
  • Preload candidate embeddings once at startup and optionally use HNSW … by @szedan-rh in #1039
  • [Misc] fix: replace fake domain endpoints with 127.0.0.1:1 by @samzong in #1055
  • Docs: Update Dashboard and Release related Docs by @Xunzhuo in #1056
  • [Build] Fix candle-flash-attn dependency to use git source for CUTLASS submodule by @carlory in #1049
  • [UX] add signal CRUD for dashboard by @ppppqp in #1063
  • remove cmd/vsr CLI and its dependencies by @carlory in #1061
  • Update the documentation of installation llm-d to use helm chart by @szedan-rh in #1064
  • [Misc] chore: fix models gitignore pattern and markdown linter exclusion by @samzong in #1058
  • [Misc] chore: update testdata outputs with feedback_detector defaults by @samzong in #1057
  • Add a local-up-router.sh script for dockerless environment by @carlory in #1059
  • Update istio documentation how to and helm chart values by @szedan-rh in #1053
  • feat(test): More ExtProc test coverage (Issue #44) by @uestcergs7 in #974
  • feat: Lifetimes should be associated with individual cache entries by @fatelei in #1012
  • [CI] feat(response-api): add conversation chaining E2E tests and instructions inheritance by @tao12345666333 in #1031
  • Fix the models missing and wrong model reference and the tests by @szedan-rh in #1068
  • [Docs] docs/enrich footer and contributing section by @samzong in #1065
  • chore(website): optimize navbar icons and remove deprecated pages by @samzong in #1073
  • [Feat][Router]: Implement Router Replay plugin for debugging routing decisions by @R3hankhan123 in #1075
  • feature: add more signals - language signal by @liavweiss in #1052
  • [Feat] add advanced tool filtering with confidence‑gated category filter and rerank by @samzong in #1033
  • [UX][Quality] Refactor and Decision CRUD in dashboard by @ppppqp in #1072
  • fix(classification): enforce confidence threshold for domain signal matching by @asaadbalum in #1074
  • [UX] Virtualize decision config form by @ppppqp in #1078
  • [Misc]💄 style(homepage): adjust hero flex and add spacing after badge icon by @samzong in #1077
  • Feat: Introducing Looper for Implementing Collective Algorithms by @Xunzhuo in #1054
  • [Doc:zh] add more signals - language signal by @windsonsea in #1080
  • fix: enable response api profile e2e test and fix the configuration by @liavweiss in #1090
  • [Doc:en/zh] Fix format issues in overview/signal-driven-decisions.md by @windsonsea in #1087
  • feat(openshift): add auto model config and routing example by @nerdalert in #1079
  • [UX]: Improve UX for unconfigured Grafana and Jaeger services by @uestcergs7 in #1071
  • Docs: Sync Missing Blog from vLLM by @Xunzhuo in #1096
  • feat(selection): implement advanced model selection methods by @asaadbalum in #1089
  • [CLI][Bugfix] Fix init --force behavior and default model naming by @niuguy in #1108
  • [Feat][Memory]: Add Redis storage backend f...

v0.1.0 - Iris

05 Jan 05:59
8ad0c46

iris-1

Release Blog: https://blog.vllm.ai/2026/01/05/vllm-sr-iris.html

What's Changed

  • feat: support auto-enable reasoning mode based on intention by @Xunzhuo in #1
  • fix: remove no needed todo and verify CI by @Xunzhuo in #2
  • project: add bench and site owners by @Xunzhuo in #4
  • project: add code of conduct by @Xunzhuo in #5
  • chore: unify docker images by @Xunzhuo in #6
  • fix: use the correct go test file name. by @yafengio in #7
  • ci: disable notify action for now by @Xunzhuo in #10
  • docs: semantic cache stale types and implementation by @gluonfield in #9
  • chore: rm readthedocs as it's deprecated by @Xunzhuo in #12
  • Removed redundant / from code img by @tao12345666333 in #13
  • chore: Update CONTRIBUTING.md by @cryo-zd in #17
  • chore: add DCO requirement in CONTRIBUTING.md by @cryo-zd in #18
  • fix(cache): cleanup expired cache entries during update operations by @QIN2DIM in #16
  • chore(logging): unify the logging method by @ZeroZ-lab in #19
  • fix: make reasoning effort configurable by @OneZero-Y in #21
  • docs: add vsr star history diagram by @Xunzhuo in #26
  • docs: add repo link in CONTRIBUTING.md by @cryo-zd in #27
  • project: add acknowledgements to huggingface-candle by @Xunzhuo in #28
  • chore: replace fmt.Printf with log.Printf for logging by @cryo-zd in #29
  • doc: update workflow to create config.yaml by @rootfs in #30
  • feat: implement batch classification API by @OneZero-Y in #24
  • chore: 1) install rust if not present 2) expose bench params in env var by @rootfs in #54
  • feat: Add comprehensive monitoring metrics for batch classification API by @OneZero-Y in #58
  • docs: add pre-commit requirement code quality checks to contributing by @OneZero-Y in #60
  • feat: reasoning model controller by @tao12345666333 in #56
  • test: add unit tests for getModelFamilyAndTemplateParam by @tao12345666333 in #63
  • docs: add reasoning model metrics by @tao12345666333 in #64
  • feat: add test framework for classifier with dependency injection by @aeft in #57
  • project: add vllm semantic router v0.1 roadmap by @Xunzhuo in #22
  • test: add unit test around ttft pkg by @yuluo-yx in #68
  • feat: code polish on classifier by @yuluo-yx in #67
  • feat: robust model name filter for DeepSeek by @tao12345666333 in #69
  • fix: correct candle-binding replace path in go.mod files by @aeft in #65
  • project: add blog section by @Xunzhuo in #70
  • chore: only run the workflow notify-owners on vllm-project/semantic-router by @liangyuanpeng in #72
  • feat(observability): structured JSON logs and event fields by @tao12345666333 in #66
  • chore: Normalize comment punctuation to use English period by @cryo-zd in #79
  • chore: Use (*OpenAIRouter)(nil) for interface compliance check by @cryo-zd in #77
  • pricing: add currency label and change the metric name to llm_model_cost_total by @tao12345666333 in #80
  • test: add go vet to CI by @cryo-zd in #81
  • feat(logging): adopt zap as unified logging library by @tao12345666333 in #83
  • docs: add python install setups in install-local by @yuluo-yx in #78
  • feat(config): watch config file and hot-reload router without restart by @tao12345666333 in #84
  • chore: remove GPU and model params in config. Backend and model aware optimization will be handled in the control plane by @rootfs in #93
  • chore: add go mod tidy check by @Xunzhuo in #99
  • fix: startup config for docker-compose by @liangyuanpeng in #73
  • fix: don't set reasoning effort for non-reasoning models by @rootfs in #97
  • chore: add github action badge in README by @yuluo-yx in #102
  • refactor: use slices.Contains for readability and consistency by @cryo-zd in #104
  • test: add more test cases and refactor SelectBestModelForCategory/SelectBestModelFromList/InitializeJailbreakClassifier for testability by @aeft in #101
  • docs: add github action badge for docs index by @yuluo-yx in #103
  • feat: add milvus persistent storage support by @rootfs in #105
  • Slight readme changes by @LysandreJik in #25
  • refactor: move classifier model init to classifier.go and unify the classifier model init logic by @aeft in #113
  • docs: add eslint check for docs website by @yuluo-yx in #114
  • Refactor: use worker pool for batch classification concurrency by @cryo-zd in #115
  • feat: add comprehensive unit tests for entropy-based routing. Tests c… by @rootfs in #112
  • docs: reasoning quickstart by @tao12345666333 in #110
  • o11y: Add TTFT and TPOT histograms for SLOs by @tao12345666333 in #126
  • docs: add markdown lint check and fix md lint style by @yuluo-yx in #117
  • Feature Enhancement: Batch Inference Support in candle-binding by @OneZero-Y in #71
  • infra: add yaml lint check and fix yaml style by @yuluo-yx in #131
  • perf: enable concurrent classification via Arc+clone by @cryo-zd in #127
  • feat: implement dataset-agnostic router reasoning benchmark by @rootfs in #125
  • o11y: Add request error counters by @tao12345666333 in #132
  • logging: unify stdlib log usage to pkg/observability (zap) by @tao12345666333 in #134
  • fix: add comments for readability by @JaredforReal in #135
  • docs(installation): update Go version requirement and add test tip for model downloads by @samzong in #146
  • docs: reorder the quickstart pages by @Xunzhuo in #143
  • project: add ack for kubernetes by @Xunzhuo in #141
  • docs: sync blog from official vLLM by @Xunzhuo in #142
  • infra: refactor makefile by @yuluo-yx in #149
  • infra: update Dockerfile.extproc by @yuluo-yx in #158
  • fix: use request id to locate the correct cache entry to update by @aeft in #154
  • feat: add codespell check and tidy linter check config files by @yuluo-yx in #159
  • fix: miss copy tools dir in dockerfile by @lengrongfu in #161
  • metrics: Add request-level token histograms by @tao12345666333 in #157
  • docs: add repo URL in docker/README.md by @cryo-zd in https://github.com/vllm-proje...