Commit 9fde9bb

0.4.0 release (#215)
* Update tools for UIF 1.2
* Update quickstart to wait for the server
* Update readme links
* Add shape to ImageInferenceRequest
* Bump to 0.4.0
* Bump up to ROCm 5.6.1
* Exclude Py3.6 from wheels

Signed-off-by: Varun Sharma <[email protected]>
1 parent 92666f5 commit 9fde9bb

35 files changed: +161 −72 lines changed

CHANGELOG.rst

Lines changed: 38 additions & 3 deletions
@@ -28,6 +28,39 @@ Unreleased
 Added
 ^^^^^

+* N/A
+
+Changed
+^^^^^^^
+
+* N/A
+
+Deprecated
+^^^^^^^^^^
+
+* N/A
+
+Removed
+^^^^^^^
+
+* N/A
+
+Fixed
+^^^^^
+
+* N/A
+
+Security
+^^^^^^^^
+
+* N/A
+
+:github:`0.4.0 <Xilinx/inference-server/releases/tag/v0.4.0>` - 2023-09-07
+--------------------------------------------------------------------------
+
+Added
+^^^^^
+
 * An example MLPerf app using the inference server API (:pr:`129`)
 * Google Benchmark for writing performance-tracking tests (:pr:`147`)
 * Custom memory storage classes in the memory pool (:pr:`166`)
@@ -37,7 +70,8 @@ Added
 * Tests with FP16 (:pr:`189` and :pr:`203`)
 * Versioned models (:pr:`190`)
 * Expand benchmarking with MLPerf app (:pr:`197`) and add to data to docs (:pr:`198`)
-
+* Custom environment configuration per test (:pr:`214`)
+* VCK5000 test (:pr:`214`)

 Changed
 ^^^^^^^
@@ -57,7 +91,7 @@ Changed
 * Close dynamically opened libraries (:pr:`186`)
 * Replace Jaeger exporter with OTLP (:pr:`187`)
 * Change STRING type to BYTES and shape type from uint64 to int64 (:pr:`190`)
-* Rename ONNX file to MXR correctly (:pr:`202`)
+* Include the correct tensor name in ModelMetadata in the XModel backend (:pr:`207`)

 Deprecated
 ^^^^^^^^^^
@@ -67,7 +101,7 @@ Deprecated
 Removed
 ^^^^^^^

-* N/A
+* Python 3.6 support (:pr:`215`)

 Fixed
 ^^^^^
@@ -78,6 +112,7 @@ Fixed
 * Fix building with different CMake options (:pr:`170`)
 * Fix wheel generation with vcpkg (:pr:`191`)
 * Load models at startup correctly (:pr:`195`)
+* Fix handling MIGraphX models with dots in the names (:pr:`202`)

 Security
 ^^^^^^^^

CMakeLists.txt

Lines changed: 6 additions & 0 deletions
@@ -91,6 +91,12 @@ endif()

 list(APPEND VCPKG_MANIFEST_FEATURES "testing")

+# In CMake 3.27+, find_package uses <PACKAGE_NAME>_ROOT variables. We're using
+# AKS_ROOT in the environment currently.
+if(${CMAKE_VERSION} VERSION_GREATER "3.27")
+  cmake_policy(SET CMP0144 OLD)
+endif()
+
 # set the project name
 project(
   amdinfer
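A note on the policy pin above: CMP0144 is the CMake 3.27 policy that lets find_package consult upper-case <PACKAGENAME>_ROOT variables, which could collide with the AKS_ROOT environment variable this project already uses. If the project did not pin the policy in CMakeLists.txt itself, the same pre-3.27 behavior could be selected at configure time through CMake's generic CMAKE_POLICY_DEFAULT_<policy> mechanism (an illustrative sketch, not part of the commit):

    $ cmake -DCMAKE_POLICY_DEFAULT_CMP0144=OLD ..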

README.rst

Lines changed: 3 additions & 3 deletions
@@ -38,14 +38,14 @@ The AMD Inference Server is integrated with the following libraries out of the g
 * TensorFlow and PyTorch models with `ZenDNN <https://developer.amd.com/zendnn/>`__ on CPUs (optimized for AMD CPUs)
 * ONNX models with `MIGraphX <https://github.com/ROCmSoftwarePlatform/AMDMIGraphX>`__ on AMD GPUs
 * XModel models with `Vitis AI <https://www.xilinx.com/products/design-tools/vitis/vitis-ai.html>`__ on AMD FPGAs
-* A graph of computation including as pre- and post-processing can be written using `AKS <https://github.com/Xilinx/Vitis-AI/tree/v3.0/src/AKS>`__ on AMD FPGAs for end-to-end inference
+* A graph of computation including as pre- and post-processing can be written using `AKS <https://github.com/Xilinx/Vitis-AI/tree/bbd45838d4a93f894cfc9f232140dc65af2398d1/src/AKS>`__ on AMD FPGAs for end-to-end inference

 Quick Start Deployment and Inference
 ------------------------------------

 The following example demonstrates how to deploy the server locally and run a sample inference.
 This example runs on the CPU and does not require any special hardware.
-You can see a more detailed version of this example in the `quickstart <https://xilinx.github.io/inference-server/main/quickstart_inference.html>`__.
+You can see a more detailed version of this example in the `quickstart <https://xilinx.github.io/inference-server/main/quickstart.html>`__.

 .. code-block:: bash

@@ -80,7 +80,7 @@ Learn more

 The documentation for the AMD Inference Server is available `online <https://xilinx.github.io/inference-server/>`__.

-Check out the quickstart guides online to help you get started based on your use case(s): `inference <https://xilinx.github.io/inference-server/main/quickstart_inference.html>`__, `deployment <https://xilinx.github.io/inference-server/main/quickstart_deployment.html>`__ and `development <https://xilinx.github.io/inference-server/main/quickstart_development.html>`__.
+Check out the `quickstart <https://xilinx.github.io/inference-server/main/quickstart.html>`__ online to help you get started.

 Support
 -------

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-0.4.0-dev
+0.4.0

docker/generate.py

Lines changed: 4 additions & 4 deletions
@@ -316,13 +316,13 @@ def get_xrm_xrt_packages(package_manager):
     if package_manager == "apt":
         return textwrap.dedent(
             """\
-            && wget --quiet -O xrt.deb https://www.xilinx.com/bin/public/openDownload?filename=xrt_202220.2.14.354_20.04-amd64-xrt.deb \\
+            && wget --quiet -O xrt.deb https://www.xilinx.com/bin/public/openDownload?filename=xrt_202220.2.14.418_20.04-amd64-xrt.deb \\
             && wget --quiet -O xrm.deb https://www.xilinx.com/bin/public/openDownload?filename=xrm_202220.1.5.212_20.04-x86_64.deb \\"""
         )
     elif package_manager == "yum":
         return textwrap.dedent(
             """\
-            && wget --quiet -O xrt.rpm https://www.xilinx.com/bin/public/openDownload?filename=xrt_202220.2.14.354_7.8.2003-x86_64-xrt.rpm \\
+            && wget --quiet -O xrt.rpm https://www.xilinx.com/bin/public/openDownload?filename=xrt_202220.2.14.418_7.8.2003-x86_64-xrt.rpm \\
             && wget --quiet -O xrm.rpm https://www.xilinx.com/bin/public/openDownload?filename=xrm_202220.1.5.212_7.8.2003-x86_64.rpm \\"""
         )
     raise ValueError(f"Unknown base image type: {package_manager}")
@@ -576,8 +576,8 @@ def install_dev_packages(manager: PackageManager, core):


 def install_migraphx(manager: PackageManager, custom_backends):
-    migraphx_apt_repo = 'echo "deb [arch=amd64 trusted=yes] http://repo.radeon.com/rocm/apt/5.4.1/ ubuntu main" > /etc/apt/sources.list.d/rocm.list'
-    migraphx_yum_repo = '"[ROCm]\\nname=ROCm\\nbaseurl=https://repo.radeon.com/rocm/yum/5.4.1/\\nenabled=1\\ngpgcheck=1\\ngpgkey=https://repo.radeon.com/rocm/rocm.gpg.key" > /etc/yum.repos.d/rocm.repo'
+    migraphx_apt_repo = 'echo "deb [arch=amd64 trusted=yes] http://repo.radeon.com/rocm/apt/5.6.1/ ubuntu main" > /etc/apt/sources.list.d/rocm.list'
+    migraphx_yum_repo = '"[ROCm]\\nname=ROCm\\nbaseurl=https://repo.radeon.com/rocm/yum/5.6.1/\\nenabled=1\\ngpgcheck=1\\ngpgkey=https://repo.radeon.com/rocm/rocm.gpg.key" > /etc/yum.repos.d/rocm.repo'

     if manager.name == "apt":
         add_repo = (

docs/backends/vitis_ai.rst

Lines changed: 1 addition & 0 deletions
@@ -44,6 +44,7 @@ While not every model is tested on every FPGA, the Vitis AI backend has run at l

    Alveo,U250,DPUCADF8H
    Versal,VCK5000,DPUCVDX8H
+   Alveo,V70,DPUCV2DX8G

 Other devices and DPUs may also work but are currently untested.


docs/conf.py

Lines changed: 4 additions & 1 deletion
@@ -169,6 +169,9 @@ def hide_private_module(app, what, name, obj, options, signature, return_annotat

 # strip leading $ from bash code blocks
 copybutton_prompt_text = "$ "
+copybutton_here_doc_delimiter = "EOF"
+# selecting the literal block doesn't work to show the copy button correctly
+# copybutton_selector = ":is(div.highlight pre, pre.literal-block)"

 # raise a warning if a cross-reference cannot be found
 nitpicky = True
@@ -256,7 +259,7 @@ def hide_private_module(app, what, name, obj, options, signature, return_annotat

 html_context["languages"] = [("en", "/" + "inference-server/" + version + "/")]

-versions = ["0.1.0", "0.2.0", "0.3.0"]
+versions = ["0.1.0", "0.2.0", "0.3.0", "0.4.0"]
 versions.append("main")
 html_context["versions"] = []
 for version in versions:

docs/dependencies.rst

Lines changed: 3 additions & 3 deletions
@@ -180,7 +180,7 @@ The following packages are installed from Github.
    :github:`protocolbuffers/protobuf`,3.19.4,BSD-3,Dynamically linked by amdinfer-server and Vitis libraries\ :superscript:`a 0`
    :github:`fpagliughi/sockpp`,e5c51b5,BSD-3,Dynamically linked by amdinfer-server :superscript:`a 0`
    :github:`gabime/spdlog`,1.8.2,MIT,Statically linked by amdinfer-server for logging\ :superscript:`a 0`
-   :github:`Xilinx/Vitis-AI`,3.0,Apache 2.0,VART is dynamically linked by amdinfer-server\ :superscript:`a 1`
+   :github:`Xilinx/Vitis-AI`,3.5,Apache 2.0,VART is dynamically linked by amdinfer-server\ :superscript:`a 1`
    :github:`wg/wrk`,4.1.0,modified Apache 2.0,Executable used for benchmarking amdinfer-server\ :superscript:`d 0`

 Others
@@ -203,8 +203,8 @@ The following packages are installed from Xilinx.
    :header: Name,Version,License,Usage
    :widths: auto

-   :xilinxDownload:`XRM <xrm_202120.1.3.29_18.04-x86_64.deb>`,1.3.29,Apache 2.0,Used for FPGA resource management\ :superscript:`a 1`
-   :xilinxDownload:`XRT <xrt_202120.2.12.427_18.04-amd64-xrt.deb>`,2.12.427,Apache 2.0,Used for communicating to the FPGA\ :superscript:`a 1`
+   :xilinxDownload:`XRM <xrm_202220.1.5.212_20.04-x86_64.deb>`,1.15.212,Apache 2.0,Used for FPGA resource management\ :superscript:`a 1`
+   :xilinxDownload:`XRT <xrt_202220.2.14.418_20.04-amd64-xrt.deb>`,2.14.418,Apache 2.0,Used for communicating to the FPGA\ :superscript:`a 1`

 AMD
 ^^^

docs/dry.rst

Lines changed: 3 additions & 3 deletions
@@ -62,15 +62,15 @@ In this case, the endpoint is defined in the model's configuration file in the r
    .. code-tab:: console CPU

       # this image is not available on Dockerhub yet but you can build it yourself from the repository
-      $ docker pull amdih/serve:uif1.1_zendnn_amdinfer_0.4.0
+      $ docker pull amdih/serve:uif1.2_zendnn_amdinfer_0.4.0

    .. code-tab:: text GPU

       # this image is not available on Dockerhub yet but you can build it yourself from the repository
-      $ docker pull amdih/serve:uif1.1_migraphx_amdinfer_0.4.0
+      $ docker pull amdih/serve:uif1.2_migraphx_amdinfer_0.4.0

    .. code-tab:: console FPGA

       # this image is not available on Dockerhub yet but you can build it yourself from the repository
-      $ docker pull amdih/serve:uif1.1_vai_amdinfer_0.4.0
+      $ docker pull amdih/serve:uif1.2_vai_amdinfer_0.4.0
 -docker_pull_deployment_images

docs/quickstart.rst

Lines changed: 19 additions & 11 deletions
@@ -81,17 +81,19 @@ The CPU version has no special hardware requirements to run so you can always ru

    .. code-tab:: console FPGA

+      # this example assumes a U250. If you're using a different board, download the appropriate model for your board instead
       $ wget -O vitis.tar.gz https://www.xilinx.com/bin/public/openDownload?filename=resnet_v1_50_tf-u200-u250-r2.5.0.tar.gz
       $ tar -xzf vitis.tar.gz "resnet_v1_50_tf/resnet_v1_50_tf.xmodel"
       $ mkdir -p ./model_repository/resnet50/1
       $ mv ./resnet_v1_50_tf/resnet_v1_50_tf.xmodel ./model_repository/resnet50/1/

-For the models used here, their corresponding ``config.toml`` should be placed in the chosen model repository (``./model_repository/resnet50/``):
+For the models used here, you can save their corresponding ``config.toml`` to the correct path with:

 .. tabs::

-   .. code-tab:: toml CPU
+   .. code-tab:: shell CPU

+      cat <<EOF > "./model_repository/resnet50/config.toml"
       name = "resnet50"
       platform = "tensorflow_graphdef"

@@ -104,9 +106,11 @@ For the models used here, their corresponding ``config.toml`` should be placed i
       name = "resnet_v1_50/predictions/Reshape_1"
       datatype = "FP32"
       shape = [1000]
+      EOF

-   .. code-tab:: text GPU
+   .. code-tab:: shell GPU

+      cat <<EOF > "./model_repository/resnet50/config.toml"
       name = "resnet50"
       platform = "onnx_onnxv1"

@@ -119,9 +123,11 @@ For the models used here, their corresponding ``config.toml`` should be placed i
       name = "output"
       datatype = "FP32"
       shape = [1000]
+      EOF

-   .. code-tab:: console FPGA
+   .. code-tab:: shell FPGA

+      cat <<EOF > "./model_repository/resnet50/config.toml"
       name = "resnet50"
       platform = "vitis_xmodel"

@@ -134,6 +140,7 @@ For the models used here, their corresponding ``config.toml`` should be placed i
       name = "output"
       datatype = "INT8"
       shape = [1000]
+      EOF

 The name must match the name of the model directory: it defines the endpoint that will be used for inference.
 The platform identifies the type of the model and determines the file extension of the model file.
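To make those conventions concrete: after the FPGA variant of the steps above, the model repository would contain the files below (an illustrative sketch, not part of the commit). The directory name supplies the "resnet50" endpoint, and the .xmodel extension is what platform = "vitis_xmodel" leads the server to expect:

    $ find ./model_repository -type f
    ./model_repository/resnet50/config.toml
    ./model_repository/resnet50/1/resnet_v1_50_tf.xmodel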
@@ -173,15 +180,15 @@ The flags used in this sample command are:

    .. code-tab:: console CPU

-      $ docker run -d --volume $(pwd)/model_repository:/mnt/models:rw --net=host amdih/serve:uif1.1_zendnn_amdinfer_0.4.0
+      $ docker run -d --volume $(pwd)/model_repository:/mnt/models:rw --net=host amdih/serve:uif1.2_zendnn_amdinfer_0.4.0

    .. code-tab:: console GPU

-      $ docker run -d --device /dev/kfd --device /dev/dri --volume $(pwd)/model_repository:/mnt/models:rw --publish 127.0.0.1::8998 --publish 127.0.0.1::50051 amdih/serve:uif1.1_migraphx_amdinfer_0.4.0
+      $ docker run -d --device /dev/kfd --device /dev/dri --volume $(pwd)/model_repository:/mnt/models:rw --net=host amdih/serve:uif1.2_migraphx_amdinfer_0.4.0

    .. code-tab:: console FPGA

-      $ docker run -d --device /dev/dri --device /dev/xclmgmt<id> --volume $(pwd)/model_repository:/mnt/models:rw --publish 127.0.0.1::8998 --publish 127.0.0.1::50051 amdih/serve:uif1.1_vai_amdinfer_0.4.0
+      $ docker run -d --device /dev/dri --device /dev/xclmgmt<id> --volume $(pwd)/model_repository:/mnt/models:rw --net=host amdih/serve:uif1.2_vai_amdinfer_0.4.0

 The endpoints for each model will be the name of the model in the ``config.toml``, which should match the name of the parent directory in the model repository.
 In this example, it would be "resnet50".
@@ -195,7 +202,7 @@ Server deployment summary
 After setting up the server as above, you have the following information:

 * IP address: 127.0.0.1 since the server is running on the same machine where you will run the inference
-* Ports: 8998 and 50051 for HTTP and gRPC, respectively. If you used ``--publish``, your port numbers may be different and you can see what they are using ``docker ps``.
+* Ports: 8998 and 50051 for HTTP and gRPC, respectively. If you used ``--publish`` in the ``docker run`` command to remap the ports, your port numbers may be different and you can see what they are using ``docker ps``.
 * Endpoint: "resnet50" since that is what the model name was used in the model repository and in the configuration file

 The rest of this example will use these values in the sample code so substitute your own values if they are different.
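To sanity-check these values before running the examples, you can probe the readiness routes directly (an illustrative sketch, not part of the commit, and assuming the server's HTTP API exposes KServe-style v2 readiness endpoints):

    $ curl http://127.0.0.1:8998/v2/health/ready            # server is up
    $ curl http://127.0.0.1:8998/v2/models/resnet50/ready   # model is loaded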
@@ -239,21 +246,22 @@ These results are post-processed and the top 5 labels for the image are printed.
       .. parsed-literal::

          $ wget :amdinferRawFull:`examples/resnet50/tfzendnn.py`
-         $ python3 tfzendnn.py --ip 127.0.0.1 --grpc-port 50051 --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt
+         $ python3 tfzendnn.py --ip 127.0.0.1 --grpc-port 50051 --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt --wait

    .. group-tab:: GPU

       .. parsed-literal::

          $ wget :amdinferRawFull:`examples/resnet50/migraphx.py`
-         $ python3 migraphx.py --ip 127.0.0.1 --http-port 8998 --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt
+         # This will take some time initially as MIGraphX will compile the ONNX model to MXR
+         $ python3 migraphx.py --ip 127.0.0.1 --http-port 8998 --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt --wait

    .. group-tab:: FPGA

       .. parsed-literal::

          $ wget :amdinferRawFull:`examples/resnet50/vitis.py`
-         $ python3 vitis.py --ip 127.0.0.1 --http-port 8998 --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt
+         $ python3 vitis.py --ip 127.0.0.1 --http-port 8998 --endpoint resnet50 --image ./dog-3619020_640.jpg --labels ./imagenet_classes.txt --wait

 After running the script, you should get output similar to the following.
 The exact output may be slightly different depending on whether you used CPU, GPU or FPGA versions of the example.
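The new --wait flag added in this commit tells each example script to block until the endpoint is ready rather than failing while the server is still loading the model. A rough shell equivalent (an assumption about the flag's behavior, reusing the hypothetical readiness route from above) would be:

    until curl --silent --fail http://127.0.0.1:8998/v2/models/resnet50/ready > /dev/null; do
        sleep 1
    done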
