runpod · promptless · Nov 4, 2025 · Nov 18, 2025 · Nov 18, 2025 · Nov 18, 2025
diff --git a/serverless/endpoints/model-caching.mdx b/serverless/endpoints/model-caching.mdx
@@ -59,6 +59,32 @@ flowchart TD
 ```
 </div>
 
+## Where models are stored
+
+Cached models are stored in a Runpod-managed Docker volume mounted at `/runpod-volume/huggingface-cache/hub/`. The model cache is automatically managed and persists across requests on the same worker.
+
+<Note>
+Cached models use the same mount path as network volumes (`/runpod-volume/`), but a model loaded from the cache loads significantly faster than the same model loaded from a network volume.
+</Note>
+
+## Accessing cached models in your application
+
+Models are cached on your workers at `/runpod-volume/huggingface-cache/hub/` following Hugging Face cache conventions. Each model directory is named `models--` followed by the model name, with forward slashes (`/`) replaced by double dashes (`--`). The model files sit under `snapshots/` in a subdirectory named after the version hash.
+
+The path structure follows this pattern:
+
+```
+/runpod-volume/huggingface-cache/hub/models--HF_ORGANIZATION--MODEL_NAME/snapshots/VERSION_HASH/
+```
+
+For example, the model `gensyn/qwen2.5-0.5b-instruct` would be stored at:
+
+```
+/runpod-volume/huggingface-cache/hub/models--gensyn--qwen2.5-0.5b-instruct/snapshots/317b7eb96312eda0c431d1dab1af958a308cb35e/
+```
+
+If your application needs an explicit model path, configure it to scan `/runpod-volume/huggingface-cache/hub/` for models.
+
 ## Enabling cached models
 
 <Frame alt="Cached model setting">
@@ -85,4 +111,4 @@ Follow these steps to select and add a cached model to your Serverless endpoint:
   </Step>
 </Steps>
 
-You can add a cached model to an existing endpoint by selecting **Manage → Edit Endpoint** in the endpoint details page and updating the **Model (optional)** field.
+You can add a cached model to an existing endpoint by selecting **Manage → Edit Endpoint** in the endpoint details page and updating the **Model (optional)** field.
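
The path pattern documented in the new "Accessing cached models in your application" section can be resolved programmatically. The following is a minimal sketch, not part of the change itself: the helper names `model_cache_dir` and `find_snapshot` are hypothetical, and it assumes the cache layout is exactly as described in the diff (`models--ORG--NAME/snapshots/VERSION_HASH/`). It uses only the Python standard library.

```python
from pathlib import Path
from typing import Optional

# Mount path as documented in the diff above.
CACHE_ROOT = Path("/runpod-volume/huggingface-cache/hub")


def model_cache_dir(model_id: str, cache_root: Path = CACHE_ROOT) -> Path:
    """Map 'org/name' to '<cache_root>/models--org--name' (hypothetical helper)."""
    return cache_root / ("models--" + model_id.replace("/", "--"))


def find_snapshot(model_id: str, cache_root: Path = CACHE_ROOT) -> Optional[Path]:
    """Return a snapshot directory for the model, or None if it is not cached."""
    snapshots = model_cache_dir(model_id, cache_root) / "snapshots"
    if not snapshots.is_dir():
        return None
    # Assumption: usually one cached version; if there are several,
    # this simply picks the first directory in name order.
    candidates = sorted(p for p in snapshots.iterdir() if p.is_dir())
    return candidates[0] if candidates else None
```

A handler could call `find_snapshot("gensyn/qwen2.5-0.5b-instruct")` at startup and fall back to another loading path when it returns `None`. Scanning `snapshots/` avoids hardcoding the version hash, which may differ from the example hash shown in the documentation.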