
Commit 3137cdd

UIF1.2
1 parent 1e5612d commit 3137cdd

66 files changed: +2230 -902 lines changed


LICENSE

+25-25
@@ -218,10 +218,10 @@ Apache License
 Advanced Micro Devices software license terms, and open source software
 license terms. These separate license terms govern your use of the third
 party programs as set forth in the "THIRD-PARTY-PROGRAMS" file.
-
-===============================================================================
-
-ADVANCED MICRO DEVICES, INC.
+
+=========================================================================
+
+ADVANCED MICRO DEVICES, INC.
 LICENSE AGREEMENT FOR NON-COMMERCIAL MODELS
 
 
@@ -298,14 +298,13 @@ OFA-depthwise-resnet50,
 This License Agreement for Non-Commercial Models (“Agreement”) is a legal
 agreement between you (either an individual or an entity) and Advanced Micro
 Devices, Inc. on behalf of itself and its subsidiaries and affiliates (collectively
-“AMD”). DO NOT USE THE TRAINED MODELS IDENTIFIED ABOVE UNTIL YOU HAVE CAREFULLY
-READ THIS AGREEMENT. BY USING, INSTALLING, MODIFYING, COPYING, TRAINING,
-BENCHMARKING, OR DISTRIBUTING THE TRAINED MODELS, YOU AGREE TO AND ACCEPT ALL
-TERMS AND CONDITIONS OF THIS AGREEMENT. If you do not accept these terms, do not
-use the Trained Models.
-
-1. Subject to your compliance with this Agreement, AMD grants you a license to
-use, modify, and distribute the Trained Models solely for non-commercial and research
+“AMD”). DO NOT USE THE TRAINED MODELS IDENTIFIED ABOVE UNTIL YOU HAVE CAREFULLY READ
+THIS AGREEMENT. BY USING, INSTALLING, MODIFYING, COPYING, TRAINING, BENCHMARKING, OR
+DISTRIBUTING THE TRAINED MODELS, YOU AGREE TO AND ACCEPT ALL TERMS AND CONDITIONS OF
+THIS AGREEMENT. If you do not accept these terms, do not use the Trained Models.
+
+1. Subject to your compliance with this Agreement, AMD grants you a license to use,
+modify, and distribute the Trained Models solely for non-commercial and research
 purposes. This means you may use the Trained Models for benchmarking, testing, and
 evaluating the Trained Models (including non-commercial research undertaken by or
 funded by a commercial entity) but you cannot use the Trained Models in any commercial
@@ -314,17 +313,18 @@ exchange for money or other consideration.
 
 2. Your license to the Trained Models is subject to the following conditions:
 (a) you cannot alter any copyright, trademark, or other notice in the Trained Models;
-(b) you cannot sublicense or distribute the Trained Models under any other terms or conditions;
-(c) you cannot use AMD’s trademarks in your applications or technologies in a way that suggests
-your applications or technologies are endorsed by AMD; (d) if you distribute a Trained Model,
-you must provide corresponding source code for such Trained Model; and (e) if the
-Trained Models include any code or content subject to an open source license or third party
-license (“Third Party Materials”), you agree to comply with such license terms.
-
-3. THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY) ARE PROVIDED “AS IS”
-AND WITHOUT A WARRANTY OF ANY KIND, WHETHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
-TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
-YOU BEAR ALL RISK OF USING THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY) AND
-YOU AGREE TO RELEASE AMD FROM ANY LIABILITY OR DAMAGES FOR ANY CLAIM OR ACTION ARISING OUT
-OF OR IN CONNECTION WITH YOUR USE OF THE TRAINED MODELS AND/OR THIRD PARTY MATERIALS.
+(b) you cannot sublicense or distribute the Trained Models under any other terms or conditions;
+(c) you cannot use AMD’s trademarks in your applications or technologies in a way that suggests
+your applications or technologies are endorsed by AMD; (d) if you distribute a Trained Model,
+you must provide corresponding source code for such Trained Model; and
+(e) if the Trained Models include any code or content subject to an open source license or
+third party license (“Third Party Materials”), you agree to comply with such license terms.
+
+3. THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY) ARE PROVIDED “AS IS” AND
+WITHOUT A WARRANTY OF ANY KIND, WHETHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO
+THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT.
+YOU BEAR ALL RISK OF USING THE TRAINED MODELS (INCLUDING THIRD PARTY MATERIALS, IF ANY)
+AND YOU AGREE TO RELEASE AMD FROM ANY LIABILITY OR DAMAGES FOR ANY CLAIM OR ACTION ARISING
+OUT OF OR IN CONNECTION WITH YOUR USE OF THE TRAINED MODELS AND/OR THIRD PARTY MATERIALS.
+
 
README.md

+40-54
@@ -1,77 +1,71 @@
 <table width="100%">
 <tr width="100%">
-<td align="center"><img src="https://raw.githubusercontent.com/Xilinx/Image-Collateral/main/xilinx-logo.png" width="30%"/><h1>Unified Inference Frontend (UIF) 1.1 User Guide </h1>
+<td align="center"><img src="https://raw.githubusercontent.com/Xilinx/Image-Collateral/main/xilinx-logo.png" width="30%"/><h1>Unified Inference Frontend (UIF) 1.2 User Guide </h1>
 </td>
 </table>
 
 # Unified Inference Frontend
 
-Unified Inference Frontend (UIF) is an effort to consolidate the following compute platforms under one AMD inference solution with unified tools and runtime:
+Unified Inference Frontend (UIF) consolidates the following compute platforms under one AMD inference solution with unified tools and runtime:
 
-- AMD EPYC&trade; processors
-- AMD Instinct™ GPUs
-- AMD Ryzen&trade; processors
-- Versal&trade; ACAP
+- AMD EPYC&trade; and AMD Ryzen&trade; processors
+- AMD Instinct&trade; and AMD Radeon&trade; GPUs
+- AMD Versal&trade; Adaptive SoCs
 - Field Programmable Gate Arrays (FPGAs)
 
-UIF accelerates deep learning inference applications on all AMD compute platforms for popular machine learning frameworks, including TensorFlow, PyTorch, and ONNXRT. It consists of tools, libraries, models, and example designs optimized for AMD platforms that enable deep learning applications and framework developers to improve inference performance across various workloads such as computer vision, natural language processing, and recommender systems.
+UIF accelerates deep learning inference applications on all AMD compute platforms for popular machine learning frameworks, including TensorFlow, PyTorch, and ONNXRT. It consists of tools, libraries, models, and example designs optimized for AMD platforms. These enable deep learning application and framework developers to enhance inference performance across various workloads, including computer vision, natural language processing, and recommender systems.
 
+# Release Highlights
 
-![](/images/slide24.png)
-
-* **Note:** WinML is supported on Windows OS only.
-
-# Unified Inference Frontend 1.1
-
-UIF 1.1 extends the support to AMD Instinct GPUs in addition to EPYC CPUs starting from UIF 1.0. Currently, [MIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX) is the acceleration library for Instinct GPUs for Deep Learning Inference. UIF 1.1 provides 45 optimized models for Instinct GPUs and 84 for EPYC CPUs. The Vitis&trade; AI Optimizer tool is released as part of the Vitis AI 3.0 stack. UIF Quantizer is released in the PyTorch and TensorFlow Docker® images. Leveraging the UIF Optimizer and Quantizer enables performance benefits for customers when running with the MIGraphX and ZenDNN backends for Instinct GPUs and EPYC CPUs, respectively. This release also adds MIGraphX backend for [AMD Inference Server](https://github.com/Xilinx/inference-server). This document provides information about downloading, building, and running the UIF 1.1 release.
-
-## AMD Instinct GPU
-
-UIF 1.1 targets support for AMD GPUs. While UIF 1.0 enabled Vitis AI Model Zoo for TensorFlow+ZenDNN and PyTorch+ZenDNN, UIF v1.1 adds support for AMD Instinct&trade; GPUs.
+UIF 1.2 adds support for AMD Radeon&trade; GPUs in addition to AMD Instinct&trade; GPUs. Currently, [MIGraphX](https://github.com/ROCmSoftwarePlatform/AMDMIGraphX) is the acceleration library for both Radeon and Instinct GPUs for Deep Learning Inference. UIF supports 50 optimized models for Instinct and Radeon GPUs and 84 for EPYC CPUs. The AMD Vitis&trade; AI Optimizer tool is released as part of the Vitis AI 3.5 stack. UIF Quantizer is released in the PyTorch and TensorFlow Docker® images. Leveraging the UIF Optimizer and Quantizer enables performance benefits for customers when running with the MIGraphX and ZenDNN backends for Instinct and Radeon GPUs and EPYC CPUs, respectively. This release also adds MIGraphX backend for [AMD Inference Server](https://github.com/Xilinx/inference-server). This document provides information about downloading, building, and running the UIF v1.2 release.
 
-UIF 1.1 also introduces tools for optimizing inference models. GPU support includes the ability to use AMD GPUs for optimizing inference as well the ability to deploy inference using the AMD ROCm™ platform. Additionally, UIF 1.1 has expanded the set of models available for AMD CPUs and introduces new models for AMD GPUs as well.
+The highlights of this release are as follows:
 
-# Release Highlights
+AMD Radeon&trade; GPU:
+* Support for AMD Radeon&trade; PRO V620 and W6800 GPUs.
+For more information about the product, see https://www.amd.com/en/products/professional-graphics/amd-radeon-pro-w6800.
+* Tools for optimizing inference models and deploying inference using the AMD ROCm™ platform.
+* Inclusion of the [rocAL](https://docs.amd.com/projects/rocAL/en/docs-5.5.0/user_guide/ch1.html) library.
 
-The highlights of this release are as follows:
+Model Zoo:
+* Expanded set of models for AMD CPUs and new models for AMD GPUs.
 
 ZenDNN:
 * TensorFlow, PyTorch, and ONNXRT with ZenDNN packages for download (from the ZenDNN web site)
-* 84 model packages containing FP32/BF16/INT8 models enabled to be run on TensorFlow+ZenDNN, PyTorch+ZenDNN and ONNXRT+ZenDNN
-* Up to 20.5x the throughput (images/second) running Medical EDD RefineDet with the Xilinx Vitis AI Model Zoo 3.0 88% pruned INT8 model on 2P AMD Eng Sample: 100-000000894-04
-of the EPYC 9004 96-core processor powered server with ZenDNN v4.0 compared to the baseline FP32 Medical EDD RefineDet model from the same Model Zoo. ([ZD-036](#zd036))
-* Docker containers for running AMD Inference Server
 
 ROCm:
 * Docker containers containing tools for optimizing models for inference
-* 30 quantized models enabled to run on AMD ROCm platform using MIGraphX inference engine
-* Up to 5.3x the throughput (images/second) running PT-OFA-ResNet50 with the Xilinx Vitis AI Model Zoo 3.0 88% pruned FP16 model on an AMD MI100 accelerator powered production server compared to the baseline FP32 PT- ResNet50v1.5 model from the same Model Zoo. ([ZD-041](#zd041))
+* 50 models enabled to run on AMD ROCm platform using MIGraphX inference engine
+* Up to 5.3x the throughput (images/second) running PT-OFA-ResNet50 with 78% pruned FP16 model on an AMD MI100 accelerator powered production server compared to the baseline FP32 PT- ResNet50v1.5 model. ([ZD-041](#zd041))
 * Docker containers for running AMD Inference Server
 
 AMD Inference Server provides a common interface for all inference modes:
 * Common C++ and server APIs for model deployment
 * Backend interface for using TensorFlow/PyTorch in inference for ZenDNN
-* Additional UIF 1.1 optimized models examples for Inference Server
+* Additional UIF 1.2 optimized models examples for Inference Server
 * Integration with KServe
 
+[Introducing Once-For-All (OFA)](/docs/2_model_setup/uifmodelsetup.md#213-once-for-all-ofa-efficient-model-customization-for-various-platforms), a neural architecture search method that efficiently customizes sub-networks for diverse hardware platforms, avoiding high computation costs. OFA can achieve up to 1.69x speedup on MI100 GPUs compared to ResNet50 baselines.
+
 # Prerequisites
 
 The following prerequisites must be met for this release of UIF:
-
-* Hardware based on target platform:
-* CPU: AMD EPYC [9004](https://www.amd.com/en/processors/epyc-9004-series) or [7003](https://www.amd.com/en/processors/epyc-7003-series) Series Processors
-* GPU: AMD Instinct&trade; [MI200](https://www.amd.com/en/graphics/instinct-server-accelerators) or [MI100](https://www.amd.com/en/products/server-accelerators/instinct-mi100) Series GPU
-* FPGA/AI Engine: Zynq&trade; SoCs or Versal devices supported in [Vitis AI 3.0](https://github.com/Xilinx/Vitis-AI)
-
-* Software based on target platform:
-* OS: Ubuntu® 18.04 LTS and later, Red Hat® Enterprise Linux® (RHEL) 8.0 and later, CentOS 7.9 and later
-* ZenDNN 4.0 for AMD EPYC CPU
-* MIGraphX 2.4 for AMD Instinct GPU
-* Vitis AI 3.0 FPGA/AIE
-* Vitis AI 3.0 Model Zoo
-* Inference Server 0.3
-
-## Implementing UIF 1.1
+| Component | Supported Hardware |
+|--------------------|---------------------------------------------------------|
+| CPU | AMD EPYC 9004 or 7003 Series Processors |
+| GPU | AMD Radeon™ PRO V620 and W6800, AMD Instinct™ MI200 or MI100 Series GPU |
+| FPGA/AI Engine | AMD Zynq™ SoCs or Versal devices supported in Vitis AI 3.5<br>**Note**: The inference server currently supports Vitis AI 3.0 devices|
+
+| Component | Supported Software |
+|-----------------------|-------------------------------------------------------|
+| Operating Systems | Ubuntu® 20.04 LTS and later, Red Hat® Enterprise Linux® 8.0 and later, CentOS 7.9 and later |
+| ZenDNN | Version 4.0 for AMD EPYC CPU |
+| MIGraphX | Version 2.6 for AMD Instinct GPU |
+| Vitis AI | Version 3.5 for FPGA/AIE, Model Zoo |
+| Inference Server | Version 0.4 |
+
+
+## Getting Started with UIF v1.2
 
 ### Step 1: Installation
 
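For context on the MIGraphX backend referenced in the hunk above: the UIF GPU models are ONNX graphs that MIGraphX compiles and runs on a ROCm device. Below is a minimal sketch using the MIGraphX Python bindings; the model filename is a placeholder rather than a file shipped with UIF, and exact call names may vary between MIGraphX releases.

```python
# Sketch only: compile and run an ONNX model with MIGraphX on a ROCm GPU.
# "resnet50_fp16.onnx" is a placeholder, not a file shipped with UIF.
import numpy as np
import migraphx

prog = migraphx.parse_onnx("resnet50_fp16.onnx")
prog.compile(migraphx.get_target("gpu"))

# Fill every graph input with generated data matching the expected shapes.
params = {}
for name, shape in prog.get_parameter_shapes().items():
    params[name] = migraphx.generate_argument(shape)

outputs = prog.run(params)
print(np.array(outputs[0]))  # first output tensor viewed as a NumPy array
```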
@@ -115,16 +109,8 @@ The following pages outline debugging and profiling strategies:
 - <a href="/docs/5_debugging_and_profiling/debugging_and_profiling.md#51-debug-on-gpu">5.1: Debug on GPU</a>
 - <a href="/docs/5_debugging_and_profiling/debugging_and_profiling.md#52-debug-on-cpu">5.2: Debug on CPU</a>
 - <a href="/docs/5_debugging_and_profiling/debugging_and_profiling.md#53-debug-on-fpga">5.3: Debug on FPGA</a>
-
-
-### Step 6: Deploying on PyTorch and Tensorflow
-
-The following pages outline deploying strategies on PyTorch and Tensorflow:
 
-- <a href="https://github.com/amd/UIF/blob/main/docs/6_deployment_guide/PyTorch.md">PyTorch</a>
-- <a href="https://github.com/amd/UIF/blob/main/docs/6_deployment_guide/Tensorflow.md">Tensorflow</a>
-
-<hr/>
+<hr/>
 
 [Next >](/docs/1_installation/installation.md)
 
@@ -166,11 +152,11 @@ AOCC CPU OPTIMIZATIONS BINARY IS SUBJECT TO THE LICENSE AGREEMENT ENCLOSED IN TH
 
 #### ZD036:
 
-Testing conducted by AMD Performance Labs as of Thursday, January 12, 2023, on the ZenDNN v4.0 software library, Xilinx Vitis AI Model Zoo 3.0, on test systems comprising of AMD Eng Sample of the EPYC 9004 96-core processor, dual socket, with hyperthreading on, 2150 MHz CPU frequency (Max 3700 MHz), 786GB RAM (12 x 64GB DIMMs @ 4800 MT/s; DDR5 - 4800MHz 288-pin Low Profile ECC Registered RDIMM 2RX4), NPS1 mode, Ubuntu® 20.04.5 LTS version, kernel version 5.4.0-131-generic, BIOS TQZ1000F, GCC/G++ version 11.1.0, GNU ID 2.31, Python 3.8.15, AOCC version 4.0, AOCL BLIS version 4.0, TensorFlow version 2.10. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.0. Performance may vary based on use of latest drivers and other factors. ZD036
+Testing conducted by AMD Performance Labs as of Thursday, January 12, 2023, on the ZenDNN v4.0 software library, Xilinx Vitis AI Model Zoo 3.5, on test systems comprising of AMD Eng Sample of the EPYC 9004 96-core processor, dual socket, with hyperthreading on, 2150 MHz CPU frequency (Max 3700 MHz), 786GB RAM (12 x 64GB DIMMs @ 4800 MT/s; DDR5 - 4800MHz 288-pin Low Profile ECC Registered RDIMM 2RX4), NPS1 mode, Ubuntu® 20.04.5 LTS version, kernel version 5.4.0-131-generic, BIOS TQZ1000F, GCC/G++ version 11.1.0, GNU ID 2.31, Python 3.8.15, AOCC version 4.0, AOCL BLIS version 4.0, TensorFlow version 2.10. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.5. Performance may vary based on use of latest drivers and other factors. ZD036
 
 #### ZD041:
 
-Testing conducted by AMD Performance Labs as of Wednesday, January 18, 2023, on test systems comprising of: AMD MI100, 1200 MHz CPU frequency, 8x32GB GPU Memory, NPS1 mode, Ubuntu® 20.04 version, kernel version 4.15.0-166-generic, BIOS 2.5.6, GCC/G++ version 9.4.0, GNU ID 2.34, Python 3.7.13, xcompiler version 3.0.0, pytorch-nndct version 3.0.0, xir version 3.0.0, target_factory version 3.0.0, unilog version 3.0.0, ROCm version 5.4.1.50401-84~20.04. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.0. Performance may vary based on use of latest drivers and other factors. ZD-041
+Testing conducted by AMD Performance Labs as of Wednesday, January 18, 2023, on test systems comprising of: AMD MI100, 1200 MHz CPU frequency, 8x32GB GPU Memory, NPS1 mode, Ubuntu® 20.04 version, kernel version 4.15.0-166-generic, BIOS 2.5.6, GCC/G++ version 9.4.0, GNU ID 2.34, Python 3.7.13, xcompiler version 3.5.0, pytorch-nndct version 3.5.0, xir version 3.5.0, target_factory version 3.5.0, unilog version 3.5.0, ROCm version 5.4.1.50401-84~20.04. Pruning was performed by the Xilinx Vitis AI pruning and quantization tool v3.5. Performance may vary based on use of latest drivers and other factors. ZD-041
 
 
 
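A note on the TensorFlow+ZenDNN package called out in the release highlights above: it is distributed as a TensorFlow build with ZenDNN kernels enabled, so in normal use ordinary TensorFlow inference code runs unchanged on an EPYC CPU once that package is installed. A minimal sketch follows; the model choice and thread count are illustrative, not UIF defaults.

```python
# Sketch only: standard TensorFlow inference code. When executed with the
# TensorFlow+ZenDNN wheel on an AMD EPYC CPU, ZenDNN-optimized kernels are
# picked up automatically; no UIF-specific API calls are required.
import os

os.environ.setdefault("OMP_NUM_THREADS", "64")  # illustrative; tune to core count

import numpy as np
import tensorflow as tf

model = tf.keras.applications.ResNet50(weights="imagenet")
batch = np.random.rand(1, 224, 224, 3).astype(np.float32)
preds = model.predict(batch)
print(tf.keras.applications.resnet50.decode_predictions(preds, top=1))
```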
0 commit comments
