[Doc] add example page and readme (#241)

Siddhant-Ray · web-flow · commit 9d6d1e0a8087 · 2025-03-06T12:51:49.000-06:00
* add example page and readme

Signed-off-by: Siddhant Ray <siddhant.r98@gmail.com>

* fix linting

Signed-off-by: Siddhant Ray <siddhant.r98@gmail.com>

---------

Signed-off-by: Siddhant Ray <siddhant.r98@gmail.com>
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,30 @@
+
+# Build documentation locally
+
+## Install Prerequisites
+
+```bash
+pip install -r requirements-docs.txt
+```
+
+## Build docs
+
+First, clean any previous build:
+
+```bash
+make clean
+```
+
+Build the HTML:
+
+```bash
+make html
+```
+
+Serve the documentation locally:
+
+```bash
+python -m http.server 8000 -d build/html/
+```
+
+### Launch your browser and open localhost:8000
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -19,16 +19,17 @@
 author = "vLLM Production Stack Team"
 
 extensions = [
+    "sphinx_copybutton",
     "sphinx.ext.napoleon",
     "sphinx.ext.linkcode",
     "sphinx.ext.intersphinx",
-    "sphinx_copybutton",
     "sphinx.ext.autodoc",
     "sphinx.ext.autosummary",
-    "myst_parser",
+    # "myst_parser",
     "sphinxarg.ext",
-    "sphinx_design",
-    "sphinx_togglebutton",
+    # "sphinx_design",
+    # "sphinx_togglebutton",
+    "sphinx_click",
 ]
 
 # -- General configuration ---------------------------------------------------
@@ -39,6 +40,9 @@
 templates_path = ["_templates"]
 exclude_patterns = []
 
+copybutton_prompt_text = r"\$ "
+copybutton_prompt_is_regexp = True
+
 
 class MockedClassDocumenter(autodoc.ClassDocumenter):
     """Remove note about base class when a class is
diff --git a/docs/source/getting_started/examples.rst b/docs/source/getting_started/examples.rst
@@ -3,4 +3,168 @@
 Minimal Example
 ===============
 
-Add simple tutorial here.
+Introduction
+------------
+
+This is a minimal working example of the vLLM Production Stack, using a single vLLM instance that serves the ``facebook/opt-125m`` model.
+The goal is a working vLLM deployment in a Kubernetes environment with GPU support.
+
+Prerequisites
+-------------
+
+- A Kubernetes environment with GPU support. If you do not have one, follow the `install-kubernetes-env <https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md>`_ guide.
+- Helm installed. See the `install-helm.sh <https://github.com/vllm-project/production-stack/blob/main/utils/install-helm.sh>`_ script.
+- kubectl installed. See the `install-kubectl.sh <https://github.com/vllm-project/production-stack/blob/main/utils/install-kubectl.sh>`_ script.
+- The project repository cloned: `vLLM Production Stack repository <https://github.com/vllm-project/production-stack>`_.
+- Basic familiarity with Kubernetes and Helm.
+
+Steps to follow
+---------------
+
+1. Deploy vLLM Instance
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+1.1 Use existing configuration
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The vLLM Production Stack repository provides a predefined configuration file, `values-01-minimal-example.yaml <https://github.com/vllm-project/production-stack/blob/main/tutorials/assets/values-01-minimal-example.yaml>`_.
+The file contains the following:
+
+.. code-block:: yaml
+
+    servingEngineSpec:
+      runtimeClassName: ""
+      modelSpec:
+      - name: "opt125m"
+        repository: "vllm/vllm-openai"
+        tag: "latest"
+        modelURL: "facebook/opt-125m"
+
+        replicaCount: 1
+
+        requestCPU: 6
+        requestMemory: "16Gi"
+        requestGPU: 1
+
+
+1.2 Deploy the stack
+^^^^^^^^^^^^^^^^^^^^
+
+Deploy the Helm chart using the predefined configuration file:
+
+.. code-block:: bash
+
+    sudo helm repo add vllm https://vllm-project.github.io/production-stack
+    sudo helm install vllm vllm/vllm-stack -f tutorials/assets/values-01-minimal-example.yaml
+
+
+2. Validate Installation
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+2.1 Monitor Deployment Status
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Monitor the deployment status using:
+
+.. code-block:: bash
+
+    sudo kubectl get pods
+
+
+Expected output:
+
+.. code-block:: console
+
+    NAME                                           READY   STATUS    RESTARTS   AGE
+    vllm-deployment-router-859d8fb668-2x2b7        1/1     Running   0          2m38s
+    vllm-opt125m-deployment-vllm-84dfc9bd7-vb9bs   1/1     Running   0          2m38s
+
+.. note::
+
+    It may take some time for the containers to download the Docker images and LLM weights.
+
+3. Send a Query to the Stack
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+3.1 Forward the Service Port
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Expose the ``vllm-router-service`` port to the host machine:
+
+.. code-block:: bash
+
+    sudo kubectl port-forward svc/vllm-router-service 30080:80
+
+
+3.2 Query the OpenAI-Compatible API to list the available models
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Test the stack's OpenAI-compatible API by querying the available models:
+
+.. code-block:: bash
+
+    curl -o- http://localhost:30080/models
+
+
+Expected output:
+
+.. code-block:: json
+
+    {
+      "object": "list",
+      "data": [
+        {
+          "id": "facebook/opt-125m",
+          "object": "model",
+          "created": 1737428424,
+          "owned_by": "vllm",
+          "root": null
+        }
+      ]
+    }
+
+
+
+3.3 Query the OpenAI Completion Endpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Send a request to the OpenAI-compatible ``/completions`` endpoint to generate a completion for a prompt:
+
+.. code-block:: bash
+
+    curl -X POST http://localhost:30080/completions \
+      -H "Content-Type: application/json" \
+      -d '{
+        "model": "facebook/opt-125m",
+        "prompt": "Once upon a time,",
+        "max_tokens": 10
+      }'
+
+
+Expected output:
+
+.. code-block:: json
+
+    {
+      "id": "completion-id",
+      "object": "text_completion",
+      "created": 1737428424,
+      "model": "facebook/opt-125m",
+      "choices": [
+        {
+          "text": " there was a brave knight who...",
+          "index": 0,
+          "finish_reason": "length"
+        }
+      ]
+    }
+
+
+4. Uninstall
+~~~~~~~~~~~~
+
+To remove the deployment, run:
+
+.. code-block:: bash
+
+    sudo helm uninstall vllm
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -52,8 +52,8 @@ Documentation
    :caption: Getting Started
 
    getting_started/installation
-   getting_started/troubleshooting
    getting_started/examples
+   getting_started/troubleshooting
 
 .. toctree::
    :maxdepth: 1