Commit f2577c5
reorg CB demo
1 parent 31352e7
File tree: 4 files changed (+31, -50)

demos/continuous_batching/README.md (+14, -48)
@@ -5,6 +5,10 @@ That makes it easy to use and efficient especially on on Intel® Xeon® processo
 
 > **Note:** This demo was tested on Intel® Xeon® processors Gen4 and Gen5 and Intel dGPU ARC and Flex models on Ubuntu22/24 and RedHat8/9.
 
+## Prerequisites
+- **For Linux users**: Installed Docker Engine
+- **For Windows users**: Installed OVMS binary package according to the [baremetal deployment guide](../../docs/deploying_server_baremetal.md)
+
 ## Model preparation
 > **Note** Python 3.9 or higher is need for that step
 Here, the original Pytorch LLM model and the tokenizer will be converted to IR format and optionally quantized.
@@ -46,9 +50,7 @@ models
 └── tokenizer.json
 ```
 
-The default configuration of the `LLMExecutor` should work in most cases but the parameters can be tuned inside the `node_options` section in the `graph.pbtxt` file.
-Note that the `models_path` parameter in the graph file can be an absolute path or relative to the `base_path` from `config.json`.
-Check the [LLM calculator documentation](../../docs/llm/reference.md) to learn about configuration options.
+The default configuration should work in most cases, but the parameters can be tuned via `export_model.py` script arguments. Run the script with the `--help` argument to check available parameters and see the [LLM calculator documentation](../../docs/llm/reference.md) to learn more about configuration options.
 
 
 ## Deploying with Docker
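For illustration, a minimal sketch of the export step the new paragraph describes. Only the `text_generation` task and the `--source_model` flag are confirmed by the hunk headers in this diff; the model name and the remaining flags are assumptions to be verified against `--help`:

```bash
# Sketch only: the model name is a placeholder, and --model_repository_path /
# --config_file_path are assumed flag names; run with --help to confirm.
python demos/common/export_models/export_model.py text_generation \
    --source_model meta-llama/Meta-Llama-3-8B-Instruct \
    --model_repository_path models \
    --config_file_path models/config.json
```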
@@ -70,62 +72,26 @@ python demos/common/export_models/export_model.py text_generation --source_model
 docker run -d --rm -p 8000:8000 --device /dev/dri --group-add=$(stat -c "%g" /dev/dri/render* | head -n 1) -v $(pwd)/models:/workspace:ro openvino/model_server:latest-gpu --rest_port 8000 --config_path /workspace/config.json
 ```
 
-### Build Image From Source (Linux Host)
-
-In case you want to try out features that have not been released yet, you can build the image from source code yourself.
-```bash
-git clone https://github.com/openvinotoolkit/model_server.git
-cd model_server
-make release_image GPU=1
-```
-It will create an image called `openvino/model_server:latest`.
-> **Note:** This operation might take 40min or more depending on your build host.
-> **Note:** `GPU` parameter in image build command is needed to include dependencies for GPU device.
-> **Note:** The public image from the last release might be not compatible with models exported using the the latest export script. Check the [demo version from the last release](https://github.com/openvinotoolkit/model_server/tree/releases/2024/4/demos/continuous_batching) to use the public docker image.
-
 ## Deploying on Bare Metal
 
-Download model server archive and unpack it to `model_server` directory. The package contains OVMS binary and all of its dependencies.
-
-```console
-curl https://github.com/openvinotoolkit/model_server/releases/download/<release>/<dist>
-tar -xf <dist>
-```
-where:
-
-- `<release>` - model server version: `v2024.4`, `v2024.5` etc.
-- `<dist>` - package for desired OS, one of: `ovms_redhat.tar.gz`, `ovms_ubuntu22.tar.gz`, `ovms_win.zip`
-
-For correct Python initialization also set `PYTHONHOME` environment variable in the shell that will be used to launch model server.
-It may also be required to add OVMS-provided Python catalog to `PATH` to make it a primary choice for the serving during startup.
+Assuming you have unpacked the model server package to your current working directory, run the `setupvars` script to set up the environment:
 
-**Linux**
-
-```bash
-export PYTHONHOME=$PWD/ovms/python
-export PATH=$PWD/ovms/python;$PATH
-```
-
-**Windows Command Line**:
+**Windows Command Line**
 ```bat
-set PYTHONHOME="$pwd\ovms\python"
-set PATH="$pwd\ovms\python;%PATH%"
+./ovms/setupvars.bat
 ```
 
-**Windows PowerShell**:
+**Windows PowerShell**
 ```powershell
-$env:PYTHONHOME="$pwd\ovms\python"
-$env:PATH="$pwd\ovms\python;$env:PATH"
+./ovms/setupvars.ps1
 ```
 
-Once it's set, you can launch the model server.
-
 ### CPU
 
 In model preparation section, configuration is set to load models on CPU, so you can simply run the binary pointing to the configuration file and selecting port for the HTTP server to expose inference endpoint.
 
-```console
-./ovms/ovms --rest_port 8000 --config_path ./models/config.json
+```bat
+ovms --rest_port 8000 --config_path ./models/config.json
 ```
 
 
@@ -138,8 +104,8 @@ python demos/common/export_models/export_model.py text_generation --source_model
 ```
 Then rerun above command as configuration file has already been adjusted to deploy model on GPU:
 
-```console
-./ovms/ovms --rest_port 8000 --config_path ./models/config.json
+```bat
+ovms --rest_port 8000 --config_path ./models/config.json
 ```
 
 ### Check readiness
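For reference, one hedged way to exercise the `Check readiness` step once the server is listening on port 8000 as configured above. The `/v1/config` endpoint is part of the standard OVMS REST API, though the full README may use a different check:

```bash
# Expect HTTP 200 with per-model state "AVAILABLE" once loading finishes.
curl -s http://localhost:8000/v1/config
```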

docs/deploying_server_docker.md (+15, -2)
@@ -1,6 +1,6 @@
 ## Deploying Model Server in Docker Container
 
-This is a step-by-step guide on how to deploy OpenVINO&trade; Model Server on Linux, using a pre-build Docker Container.
+This is a step-by-step guide on how to deploy OpenVINO&trade; Model Server on Linux, using Docker.
 
 **Before you start, make sure you have:**
 
@@ -72,4 +72,17 @@ print(imagenet_classes[result_index])' >> predict.py
 python predict.py
 zebra
 ```
-If everything is set up correctly, you will see 'zebra' prediction in the output.
+If everything is set up correctly, you will see 'zebra' prediction in the output.
+
+### Build Image From Source
+
+In case you want to try out features that have not been released yet, you can build the image from source code yourself.
+```bash
+git clone https://github.com/openvinotoolkit/model_server.git
+cd model_server
+make release_image GPU=1
+```
+It will create an image called `openvino/model_server:latest`.
+> **Note:** This operation might take 40min or more depending on your build host.
+> **Note:** `GPU` parameter in image build command is needed to include dependencies for GPU device.
+> **Note:** The public image from the last release might not be compatible with models exported using the latest export script. Check the [demo version from the last release](https://github.com/openvinotoolkit/model_server/tree/releases/2024/4/demos/continuous_batching) to use the public docker image.
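As a usage sketch for the image built above: the server flags (`--model_name`, `--model_path`, `--port`) are standard OVMS options, but the model name and repository layout here are hypothetical placeholders, not from this commit:

```bash
# Hypothetical model repository under ./models; adjust names and paths.
docker run -d --rm -p 9000:9000 -v $(pwd)/models:/models:ro \
    openvino/model_server:latest \
    --model_name resnet --model_path /models/resnet --port 9000
```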

setupvars.bat (+1)
@@ -18,4 +18,5 @@ setlocal EnableExtensions EnableDelayedExpansion
 set "OVMS_DIR=%~dp0"
 set "PYTHONHOME=%OVMS_DIR%\python"
 set "PATH=%OVMS_DIR%;%PYTHONHOME%;%PATH%"
+echo OpenVINO Model Server Environment Initialized
 endlocal
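For context, a sketch of how this script is meant to be used from a Windows Command Line session, following the bare-metal steps in the README diff above (the `ovms` command line is copied from that diff; the unpack location is an assumption). Note that because the script wraps its assignments in `setlocal`/`endlocal`, the `PATH` and `PYTHONHOME` changes are discarded when it returns, so they may not take effect in the calling shell:

```bat
:: Assumes the package was unpacked to .\ovms in the current directory.
.\ovms\setupvars.bat
ovms --rest_port 8000 --config_path ./models/config.json
```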

setupvars.ps1 (+1)
@@ -17,3 +17,4 @@
 $env:OVMS_DIR=$PSScriptRoot
 $env:PYTHONHOME="$env:OVMS_DIR\python"
 $env:PATH="$env:OVMS_DIR;$env:PYTHONHOME;$env:PATH"
+echo "OpenVINO Model Server Environment Initialized"
