Blog: Inside vLLM’s New Offloading Connector #149

orozery · 2025-12-29T12:16:57Z

No description provided.

chatgpt-codex-connector · 2025-12-29T12:17:00Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

esmeetu · 2026-01-07T12:33:26Z

_posts/2025-12-29-offloading-connector.md

+* CPU block size 16 tokens  
+* De/Tokenization disabled
+
+Our benchmark code can be found [here](https://github.com/orozery/playground/blob/kv-offloading-blog-dec-2025/kvcache/kv_offload_benchmark.py).


Maybe we can add this benchmark script, with small changes, to https://github.com/vllm-project/vllm/tree/main/benchmarks.

Good idea! I think this script is useful and will get more exposure there.
However, it will take some time to get it in there, and I think publishing the blog post now is a higher priority.

esmeetu · 2026-01-07T12:40:04Z

Thanks for the great contribution! Along with the comments above, it might be helpful to add a few more offloading connector use cases at the end of the blog.
Adding a Slack channel could also help the community collaborate and iterate on this over time.

orozery · 2026-01-07T17:14:32Z

> Thanks for the great contribution! Along with the comments above, it might be helpful to add a few more offloading connector use cases at the end of the blog. Adding a Slack channel could also help the community collaborate and iterate on this over time.

Thanks for reviewing!
I don't have any more use cases to elaborate on at this point.
I've added a section at the end to encourage community discussion (including a Slack channel).

esmeetu · 2026-01-08T00:05:10Z

LGTM! Thanks @orozery

Signed-off-by: Or Ozeri <[email protected]>

mgoin · 2026-01-06T16:52:39Z

_posts/2025-12-29-offloading-connector.md

@@ -0,0 +1,234 @@
+---
+layout: post
+title: "Inside vLLM’s New Offloading Connector: Smarter Memory Transfer for Maximizing Inference Throughput"


I feel like the title should say "KV Offloading", since otherwise it isn't clear whether we are talking about model offloading.

Agreed. This applies to the blog name as well.
@orozery Could we update this in another PR before we promote it on social media?

vercel bot deployed to Preview December 29, 2025 12:17 View deployment

orozery force-pushed the offloading-connector-dec-2025 branch from 002e08e to 07b2642 Compare January 6, 2026 15:03

vercel bot deployed to Preview January 6, 2026 15:04 View deployment

esmeetu approved these changes Jan 7, 2026

orozery force-pushed the offloading-connector-dec-2025 branch from 07b2642 to da93e1f Compare January 7, 2026 13:05

vercel bot deployed to Preview January 7, 2026 13:06 View deployment

orozery force-pushed the offloading-connector-dec-2025 branch from da93e1f to fe7d325 Compare January 7, 2026 17:09

vercel bot deployed to Preview January 7, 2026 17:09 View deployment

Blog: Inside vLLM’s New Offloading Connector

9059fb9

Signed-off-by: Or Ozeri <[email protected]>

esmeetu force-pushed the offloading-connector-dec-2025 branch from fe7d325 to 9059fb9 Compare January 8, 2026 00:06

vercel bot deployed to Preview January 8, 2026 00:06 View deployment

esmeetu merged commit 7c10446 into vllm-project:main Jan 8, 2026
2 checks passed

mgoin reviewed Jan 8, 2026
