Commit 9d6d1e0

[Doc] add example page and readme (#241)
* add example page and readme

Signed-off-by: Siddhant Ray <[email protected]>

* fix linting

Signed-off-by: Siddhant Ray <[email protected]>

---------

Signed-off-by: Siddhant Ray <[email protected]>
1 parent a315284 commit 9d6d1e0

4 files changed: +204 -6 lines changed

docs/README.md

+30
@@ -0,0 +1,30 @@
+
+# Build documentation locally
+
+## Install Prerequisites
+
+```bash
+pip install -r requirements-docs.txt
+```
+
+## Build docs
+
+First run
+
+```bash
+make clean
+```
+
+To build HTML
+
+```bash
+make html
+```
+
+Serve documentation page locally
+
+```bash
+python -m http.server 8000 -d build/html/
+```
+
+### Launch your browser and open localhost:8000
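
The serve step in the README above can also be driven from Python, for example when a script needs to choose the port itself. The following is a minimal sketch, assuming the docs were already built into `build/html`; the helper name `make_docs_server` is hypothetical and not part of the repository.

```python
import functools
import http.server
import pathlib


def make_docs_server(html_dir="build/html", port=8000):
    """Return an HTTP server that serves the built docs from html_dir.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    root = pathlib.Path(html_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{root} not found; run 'make html' first")
    # SimpleHTTPRequestHandler accepts a 'directory' keyword (Python 3.7+),
    # which plays the role of the '-d' flag of 'python -m http.server'.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(root)
    )
    # Bind to loopback only; pass port=0 to let the OS pick a free port.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_docs_server().serve_forever()` from the `docs` directory should then make the pages available at localhost:8000. Unlike the command-line form, this sketch listens on the loopback interface only.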

docs/source/conf.py

+8-4
@@ -19,16 +19,17 @@
 author = "vLLM Production Stack Team"

 extensions = [
+    "sphinx_copybutton",
     "sphinx.ext.napoleon",
     "sphinx.ext.linkcode",
     "sphinx.ext.intersphinx",
-    "sphinx_copybutton",
     "sphinx.ext.autodoc",
     "sphinx.ext.autosummary",
-    "myst_parser",
+    # "myst_parser",
     "sphinxarg.ext",
-    "sphinx_design",
-    "sphinx_togglebutton",
+    # "sphinx_design",
+    # "sphinx_togglebutton",
+    "sphinx_click",
 ]

 # -- General configuration ---------------------------------------------------
@@ -39,6 +40,9 @@
 templates_path = ["_templates"]
 exclude_patterns = []

+copybutton_prompt_text = r"\$ "
+copybutton_prompt_is_regexp = True
+

 class MockedClassDocumenter(autodoc.ClassDocumenter):
     """Remove note about base class when a class is
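
The two `copybutton_*` settings added above tell sphinx-copybutton to treat a leading `$ ` as a shell prompt. The extension does this work in the browser, so the Python below is only an illustration of the intended effect, not the extension's code. The function name `copied_text` is hypothetical, and the rule that only prompted lines are copied once any prompt is found is stated here as an assumption about the extension's default behaviour.

```python
import re

# Same values as the two settings added to conf.py in this commit.
copybutton_prompt_text = r"\$ "
copybutton_prompt_is_regexp = True


def copied_text(block):
    """Approximate the text that ends up on the clipboard (illustration only)."""
    if copybutton_prompt_is_regexp:
        pattern = copybutton_prompt_text
    else:
        pattern = re.escape(copybutton_prompt_text)
    # Only a prompt at the very start of a line counts.
    prompt = re.compile(r"^(?:" + pattern + r")")
    prompted = [
        prompt.sub("", line, count=1)
        for line in block.splitlines()
        if prompt.match(line)
    ]
    # Assumed behaviour: if any line carries a prompt, keep only those lines
    # with the prompt removed; otherwise return the block unchanged.
    return "\n".join(prompted) if prompted else block
```

The backslash in `r"\$ "` matters because `$` is a regular-expression metacharacter: with `copybutton_prompt_is_regexp = True`, an unescaped `$ ` would mean "end of line followed by a space" and would never match a prompt.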

docs/source/getting_started/examples.rst

+165-1
@@ -3,4 +3,168 @@
 Minimal Example
 ===============

-Add simple tutorial here.
+Introduction
+------------
+
+This is a minimal working example of the vLLM Production Stack using one vLLM instance with the ``facebook/opt-125m`` model.
+The goal is to have a working deployment of vLLM on a Kubernetes environment with GPU.
+
+Prerequisites
+-------------
+
+- A Kubernetes environment with GPU support. If not set up, follow the `install-kubernetes-env <https://github.com/vllm-project/production-stack/blob/main/tutorials/00-install-kubernetes-env.md>`_ guide.
+- Helm installed. Refer to the `install-helm.sh <https://github.com/vllm-project/production-stack/blob/main/utils/install-helm.sh>`_ script for instructions.
+- kubectl should be installed. Refer to the `install-kubectl.sh <https://github.com/vllm-project/production-stack/blob/main/utils/install-kubectl.sh>`_ script for instructions.
+- The project repository cloned: `vLLM Production Stack repository <https://github.com/vllm-project/production-stack>`_.
+- Basic familiarity with Kubernetes and Helm.
+
+Steps to follow
+---------------
+
+1. Deploy vLLM Instance
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+1.1 Use existing configuration
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The vLLM Production Stack repository provides a predefined configuration file, `values-01-minimal-example.yaml`, located `here <https://github.com/vllm-project/production-stack/blob/main/tutorials/assets/values-01-minimal-example.yaml>`_.
+This file contains the following content:
+
+.. code-block:: yaml
+
+   servingEngineSpec:
+     runtimeClassName: ""
+     modelSpec:
+     - name: "opt125m"
+       repository: "vllm/vllm-openai"
+       tag: "latest"
+       modelURL: "facebook/opt-125m"
+
+       replicaCount: 1
+
+       requestCPU: 6
+       requestMemory: "16Gi"
+       requestGPU: 1
+
+
+1.2 Deploy the stack
+^^^^^^^^^^^^^^^^^^^^
+
+Deploy the Helm chart using the predefined configuration file:
+
+.. code-block:: bash
+
+   sudo helm repo add vllm https://vllm-project.github.io/production-stack
+   sudo helm install vllm vllm/vllm-stack -f tutorials/assets/values-01-minimal-example.yaml
+
+
+2. Validate Installation
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+2.1 Monitor Deployment Status
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Monitor the deployment status using:
+
+.. code-block:: bash
+
+   sudo kubectl get pods
+
+
+Expected output:
+
+.. code-block:: console
+
+   NAME                                           READY   STATUS    RESTARTS   AGE
+   vllm-deployment-router-859d8fb668-2x2b7        1/1     Running   0          2m38s
+   vllm-opt125m-deployment-vllm-84dfc9bd7-vb9bs   1/1     Running   0          2m38s
+
+.. note::
+
+   It may take some time for the containers to download the Docker images and LLM weights.
+
+3. Send a Query to the Stack
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+3.1 Forward the Service Port
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Expose the `vllm-router-service` port to the host machine:
+
+.. code-block:: bash
+
+   sudo kubectl port-forward svc/vllm-router-service 30080:80
+
+
+3.2 Query the OpenAI-Compatible API to list the available models
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Test the stack's OpenAI-compatible API by querying the available models:
+
+.. code-block:: bash
+
+   curl -o- http://localhost:30080/models
+
+
+Expected output:
+
+.. code-block:: json
+
+   {
+     "object": "list",
+     "data": [
+       {
+         "id": "facebook/opt-125m",
+         "object": "model",
+         "created": 1737428424,
+         "owned_by": "vllm",
+         "root": null
+       }
+     ]
+   }
+
+
+
+3.3 Query the OpenAI Completion Endpoint
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Send a query to the OpenAI `/completion` endpoint to generate a completion for a prompt:
+
+.. code-block:: bash
+
+   curl -X POST http://localhost:30080/completions \
+     -H "Content-Type: application/json" \
+     -d '{
+       "model": "facebook/opt-125m",
+       "prompt": "Once upon a time,",
+       "max_tokens": 10
+     }'
+
+
+Expected output:
+
+.. code-block:: json
+
+   {
+     "id": "completion-id",
+     "object": "text_completion",
+     "created": 1737428424,
+     "model": "facebook/opt-125m",
+     "choices": [
+       {
+         "text": " there was a brave knight who...",
+         "index": 0,
+         "finish_reason": "length"
+       }
+     ]
+   }
+
+
+4. Uninstall
+~~~~~~~~~~~~
+
+To remove the deployment, run:
+
+.. code-block:: bash
+
+   sudo helm uninstall vllm
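
The two curl calls in steps 3.2 and 3.3 of the page above can be reproduced from Python with only the standard library. This is a rough sketch, assuming the port-forward from step 3.1 is still running on port 30080; the function names `build_models_request`, `build_completion_request`, and `send` are hypothetical helpers and not part of the production stack.

```python
import json
import urllib.request

# Host port chosen in the port-forward step (30080 -> service port 80).
BASE_URL = "http://localhost:30080"


def build_models_request(base_url=BASE_URL):
    """GET request equivalent to 'curl -o- <base_url>/models'."""
    return urllib.request.Request(f"{base_url}/models", method="GET")


def build_completion_request(prompt, model="facebook/opt-125m",
                             max_tokens=10, base_url=BASE_URL):
    """POST request carrying the same JSON body as the curl call in step 3.3."""
    body = json.dumps(
        {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example use once the stack is reachable:
#   models = send(build_models_request())
#   print([entry["id"] for entry in models["data"]])
#   reply = send(build_completion_request("Once upon a time,"))
#   print(reply["choices"][0]["text"])
```

Building the request separately from sending it keeps the payload easy to inspect before anything touches the cluster.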

docs/source/index.rst

+1-1
@@ -52,8 +52,8 @@ Documentation
    :caption: Getting Started

    getting_started/installation
-   getting_started/troubleshooting
    getting_started/examples
+   getting_started/troubleshooting

 .. toctree::
    :maxdepth: 1
