28 changes: 27 additions & 1 deletion serverless/endpoints/model-caching.mdx
@@ -59,6 +59,32 @@ flowchart TD
```
</div>

## Where models are stored

Cached models are stored in a Runpod-managed Docker volume mounted at `/runpod-volume/huggingface-cache/hub/`. The model cache is automatically managed and persists across requests on the same worker.

<Note>
Although cached models use the same mount path as network volumes (`/runpod-volume/`), a model loads significantly faster from the cache than the same model does from a network volume.
</Note>

## Accessing cached models in your application

Models are cached on your workers under `/runpod-volume/huggingface-cache/hub/`, following Hugging Face cache conventions. Each model gets a directory whose name starts with `models--`, with the forward slashes (`/`) in the original model name replaced by double dashes (`--`). Inside it, a `snapshots` directory contains a subdirectory named after the version hash.

The path structure follows this pattern:

```
/runpod-volume/huggingface-cache/hub/models--HF_ORGANIZATION--MODEL_NAME/snapshots/VERSION_HASH/
```

For example, the model `gensyn/qwen2.5-0.5b-instruct` would be stored at:

```
/runpod-volume/huggingface-cache/hub/models--gensyn--qwen2.5-0.5b-instruct/snapshots/317b7eb96312eda0c431d1dab1af958a308cb35e/
```
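The mapping from model name to cache path can be sketched in a few lines of Python. This is a minimal illustration, not a Runpod API: the helper names `model_cache_dir` and `find_snapshot` are hypothetical, and picking the most recently modified snapshot when several exist is an assumption of this sketch.

```python
from pathlib import Path
from typing import Optional

# Cache root as documented above.
CACHE_ROOT = Path("/runpod-volume/huggingface-cache/hub")


def model_cache_dir(model_id: str, cache_root: Path = CACHE_ROOT) -> Path:
    """Map a model name such as 'org/name' to its cache directory."""
    # Prefix with 'models--' and replace each '/' with '--'.
    return Path(cache_root) / ("models--" + model_id.replace("/", "--"))


def find_snapshot(model_id: str, cache_root: Path = CACHE_ROOT) -> Optional[Path]:
    """Return a snapshot directory for the model, or None if it is not cached."""
    snapshots = model_cache_dir(model_id, cache_root) / "snapshots"
    if not snapshots.is_dir():
        return None
    # Assumption: if several snapshots exist, use the most recently modified one.
    candidates = [p for p in snapshots.iterdir() if p.is_dir()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

Calling `find_snapshot("gensyn/qwen2.5-0.5b-instruct")` on a worker would return the `snapshots/VERSION_HASH` directory if the model is cached, which can then be passed to whatever loader your application uses.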

If your application requires an explicit model path, configure it to scan `/runpod-volume/huggingface-cache/hub/` for the model's directory.
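One way to scan the cache directory is sketched below. It assumes the directory layout described in this section; the function name `list_cached_models` is hypothetical, and recovering the model name by splitting on the first `--` is an assumption that may not hold for every model name.

```python
from pathlib import Path
from typing import Dict, List

# Cache root as documented above.
CACHE_ROOT = Path("/runpod-volume/huggingface-cache/hub")


def list_cached_models(cache_root: Path = CACHE_ROOT) -> Dict[str, List[Path]]:
    """Return a mapping of model name to its snapshot directories."""
    found: Dict[str, List[Path]] = {}
    root = Path(cache_root)
    if not root.is_dir():
        return found
    for entry in sorted(root.glob("models--*")):
        snapshots = entry / "snapshots"
        if not snapshots.is_dir():
            continue
        # Strip the 'models--' prefix, then split organization from model name.
        rest = entry.name[len("models--"):]
        org, sep, name = rest.partition("--")
        # Assumption: a name without '--' has no organization part.
        model_id = f"{org}/{name}" if sep else rest
        found[model_id] = sorted(p for p in snapshots.iterdir() if p.is_dir())
    return found
```

The function returns an empty dictionary when the cache directory does not exist, so it can be called safely on workers that have no cached model configured.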

## Enabling cached models

<Frame alt="Cached model setting">
@@ -85,4 +111,4 @@ Follow these steps to select and add a cached model to your Serverless endpoint:
</Step>
</Steps>

You can add a cached model to an existing endpoint by selecting **Manage → Edit Endpoint** on the endpoint details page and updating the **Model (optional)** field.