Release Manager
Endgame
- Code freeze: July 5th, 2023
- Bug Bash date: July 8th, 2023
- Release date: July 19th, 2023
Main Features
SuperBench Improvement
- Support Ctrl+C and interrupt to stop all SuperBench testing. (Runner - Add signal handler in runner #530)
- Support CPU docker (Benchmarks: Build Pipeline - Add support for cpu-only perftest in makefile #480)
- Support Windows Docker for VDI/Gaming GPU (Dockerfile - Add SuperBench Windows Dockerfile #534)
- Support DirectX for Nvidia and AMD GPU (Benchmarks - Add support for DirectX GPU platform #536)
- Add System Config Info feature in SB runner. (Tools - Add runner for sys info and update docs #532)
- Support DirectX test pipeline (CI/CD - Support DirectX test pipeline #545)
Micro-benchmark Improvement
- Add DirectXGPUCopyBw Benchmark to measure HtoD/DtoH bandwidth (Benchmarks: Add benchmark - Add source code of DirectxGPUCopy microbenchmark #486 and Benchmarks: micro benchmarks - add python code for DirectXGPUCopy #546)
- Add DirectXGPUCoreFLops Benchmark to measure peak FLOPS (Benchmarks: Add benchmark - Add source code of DirectXGPUCoreFLOPs microbenchmark #488 and Benchmarks: micro benchmarks - add python code for DirectXGPUCoreFlops #542)
- Add DirectXGPUMemBw Benchmark to measure GPU memory bandwidth (Benchmarks: Add benchmark - Add source code of DirectxGPUMemBw microbenchmark #487 and Benchmarks: micro benchmarks - add python code for DirectXGPUMemBw #547)
- Add DirectXVCNEncodingLatency Benchmark to measure the VCN hardware encoding latency (Benchmarks: Build Pipeline - add AMF in third party and build AMF encoding latency test #543 and Benchmarks: micro benchmarks - add python code for DirectXGPUEncodingLatency #548)
- Support best algorithm selection in cudnn-function. (Related to [Enhancement] maybe algo argument can be omitted in cudnn-function? #384) (Benchmarks: microbenchmark - add auto selecting algorithm support for cudnn functions #540)
- Revise step time collection in distributed inference benchmark (Benchmarks - Revise step time collection in distributed inference benchmark #524)
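The copy- and memory-bandwidth benchmarks above all reduce to the same arithmetic: bytes moved divided by elapsed time. A host-side sketch of that calculation, with a plain memory copy standing in for the HtoD/DtoH transfers (names are illustrative, not the benchmark's actual code):

```python
import time

def copy_bandwidth_gbps(size_bytes: int, repeats: int = 5) -> float:
    """Copy a buffer `repeats` times and report average bandwidth in GB/s."""
    src = bytearray(size_bytes)
    start = time.perf_counter()
    for _ in range(repeats):
        dst = bytes(src)  # One full copy of the buffer.
    elapsed = time.perf_counter() - start
    total_bytes = size_bytes * repeats
    return total_bytes / elapsed / 1e9  # bytes / seconds -> GB/s
```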
Model Benchmark Improvement
- Fix early stop logic due to num_steps. (ModelBenchmarks - Fix early stop logic due to num_steps. #522)
- Support TensorRT models on Nvidia H100 (Benchmarks - Update result parsing in tensorrt inference #541)
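The num_steps early-stop fix concerns loop logic like the following: the model run must terminate once either the step budget or the time budget is reached, counting steps consistently. A simplified sketch with hypothetical names, not the actual ModelBenchmarks implementation:

```python
import time

def run_steps(step_fn, num_steps=0, duration=0.0):
    """Run step_fn until num_steps (if > 0) or duration seconds (if > 0) is hit."""
    step_times = []
    start = time.perf_counter()
    step = 0
    while True:
        t0 = time.perf_counter()
        step_fn()
        step_times.append(time.perf_counter() - t0)
        step += 1
        if num_steps > 0 and step >= num_steps:
            break  # Step budget reached.
        if duration > 0 and time.perf_counter() - start >= duration:
            break  # Time budget reached.
    return step_times
```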
Documentation
- Improve documentation for System Config Info. (Tools - Add runner for sys info and update docs #532)
- Update outdated references (Benchmarks - Update outdated references #539)
- Update outdated references in micro-benchmarks.md (Doc - Update outdated references in micro-benchmarks.md #544)
Backlog
Micro-benchmark Improvement
- Add HPL random generator to gemm-flops with ROCm (Related to Run benchmark failed (superbenchmark-0.8.0) #518)
- Support Monitoring for AMD GPUs
- Support cuDNN Backend API in cudnn-function.
- Add DirectXGPURenderFPS Benchmark to measure the FPS of rendering simple frames
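For reference, gemm-flops style benchmarks like the backlog item above report achieved throughput from the standard GEMM operation count of 2·M·N·K floating-point operations. A host-side sketch of that arithmetic in pure Python (the real benchmark runs the GEMM on the GPU):

```python
import time

def gemm_gflops(m, n, k):
    """Multiply an MxK by a KxN matrix in pure Python and report GFLOP/s."""
    a = [[1.0] * k for _ in range(m)]
    b = [[1.0] * n for _ in range(k)]
    start = time.perf_counter()
    c = [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
         for i in range(m)]
    elapsed = time.perf_counter() - start
    flops = 2.0 * m * n * k  # One multiply and one add per inner-product term.
    return flops / elapsed / 1e9
```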
Inference Benchmark Improvement
- Support VGG, LSTM, and GPT-2 small in TensorRT Inference Backend
- Support VGG, LSTM, and GPT-2 small in ORT Inference Backend
- Support more TensorRT parameters (Related to TensorRT parameter passing can be enhanced #366)