Outlines's cache not reusable across vllm startup #1130

Open
@Lap1n

Describe the issue as clearly as possible:

When using vllm with outlines on a VM, the diskcache-based caching does not appear to work correctly. Every time the server starts up, it seems unable to reuse the previously computed FSM cache.

One way to fix this issue is to serialize the cache key object as a string.
The changes can be found in this PR that I submitted.
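
To illustrate the idea of a string cache key, here is a minimal sketch. It is not the code from the PR: `stable_cache_key` and the `build_fsm` name are hypothetical, and the sketch assumes the arguments are JSON-serializable or have a `repr()` that does not change between runs.

```python
import hashlib
import json


def stable_cache_key(func_name: str, *args, **kwargs) -> str:
    """Build a deterministic string key from a function name and its arguments.

    Hypothetical helper for illustration only; not part of outlines or vllm.
    """
    payload = json.dumps(
        {"func": func_name, "args": args, "kwargs": kwargs},
        sort_keys=True,  # keyword order must not change the key
        # Fallback for non-JSON types. Note that the default object repr()
        # embeds a memory address, so such objects need a custom __repr__.
        default=repr,
    )
    # Hash the canonical text so the key has a fixed length.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


key_a = stable_cache_key("build_fsm", r"\d{3}-\d{4}", vocab_size=32000)
key_b = stable_cache_key("build_fsm", r"\d{3}-\d{4}", vocab_size=32000)
key_c = stable_cache_key("build_fsm", r"\d{3}-\d{4}", vocab_size=50000)
```

Because the key is derived only from the textual form of the inputs, two separate server runs that receive the same inputs would compute the same key.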

Steps/code to reproduce the bug:

- Start the vllm server
- Send a request
- FSM computation happens
- Stop and relaunch the server
- Send a request
- FSM computation still happens
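
The restart in the steps above can be approximated without vllm by evaluating a candidate key in two fresh Python interpreters. This is only a sketch of one possible cause, not a confirmed diagnosis: it assumes the key depends on something process-specific, using Python's per-process string hash randomization as the example. `key_in_fresh_process` is a hypothetical helper.

```python
import os
import subprocess
import sys


def key_in_fresh_process(expr: str, hash_seed: str) -> str:
    """Evaluate `expr` in a new interpreter, as if the server had restarted."""
    # PYTHONHASHSEED is pinned to a different value per launch so the
    # comparison is reproducible; by default it is random for each process.
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    result = subprocess.run(
        [sys.executable, "-c", f"print({expr})"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# A key based on the built-in hash() of a string changes between launches.
unstable = [key_in_fresh_process("hash('regex')", seed) for seed in ("1", "2")]

# A key based on a content digest is the same in every launch.
digest_expr = "__import__('hashlib').sha256(b'regex').hexdigest()"
stable = [key_in_fresh_process(digest_expr, seed) for seed in ("1", "2")]
```

If a cache key behaves like `unstable` here, entries written by one server run cannot be found by the next one, which would match the symptom described.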

Expected result:

- Start the vllm server
- Send a request
- FSM computation happens
- Stop and relaunch the server
- Send a request
- FSM computation does not happen, as the result is already in the cache

Error message:

No response

Outlines/Python version information:

Latest from main.

Context for the issue:

No response

Labels

vLLM (Things involving vLLM support)