ROCm support #545

Open · wants to merge 9 commits into main
Conversation

@umireon (Contributor) commented Feb 9, 2024

I've added documentation for the ROCm-supported build.

@payom commented Feb 10, 2024

I can't comment on the build instructions, as I don't work in an Ubuntu environment. However, with the new changes the plugin builds successfully on my machine. After installation, OBS loads the plugin and evidently makes use of the GPU when "GPU - TensorRT" is selected in the settings (a label I would recommend renaming to avoid confusion).

With the SINet, Mediapipe, and PPHumanSeg models, the plugin runs perfectly in my environment; I can't discern any difference when switching between running on the CPU or the GPU. Unfortunately, the remaining segmentation models don't work: the Selfie Segmentation model produces garbage, and my OBS application outright crashes when selecting either Robust Video Matting or TCMonoDepth, due to a "memory access fault" in the HIP backend.

My environment:

  • GPU: Radeon RX 6800 XT
  • ROCm version 6.0.2
  • ONNX Runtime 1.16.3
  • OBS Studio 30.0.2

@umireon (Contributor, Author) commented Feb 10, 2024

@payom Thank you for your feedback! It would be very helpful if you could post the full OBS log from the session in which OBS crashed. You can get OBS logs and crash reports from the OBS Help menu.

@payom commented Feb 10, 2024

Here's the log from running OBS with verbose logging enabled.

Unfortunately, the crash doesn't appear to have been captured in the log.

After setting the MIOPEN_ENABLE_LOGGING_CMD flag and running OBS from my terminal, I also get this printout, which is what I used to identify the crash mentioned in my earlier message:

libDeckLinkAPI.so: cannot open shared object file: No such file or directory
MIOpen(HIP): Command [Pooling_logging_cmd] ./bin/MIOpenDriver pool -M 0 --input 1x3x192x192,110592x36864x192x1 -y 2 -x 2 -p 0 -q 0 -v 2 -u 2 -m avg -F 1 -t 1
MIOpen(HIP): Command [Pooling_logging_cmd] ./bin/MIOpenDriver pool -M 0 --input 1x3x96x96,27648x9216x96x1 -y 2 -x 2 -p 0 -q 0 -v 2 -u 2 -m avg -F 1 -t 1
MIOpen(HIP): Command [Pooling_logging_cmd] ./bin/MIOpenDriver pool -M 0 --input 1x3x48x48,6912x2304x48x1 -y 2 -x 2 -p 0 -q 0 -v 2 -u 2 -m avg -F 1 -t 1
MIOpen(HIP): Command [LogCmdFindConvolution] ./bin/MIOpenDriver conv -n 1 -c 3 -H 192 -W 192 -k 16 -y 3 -x 3 -p 1 -q 1 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -F 1 -t 1
MIOpen(HIP): Command [LogCmdConvolution] ./bin/MIOpenDriver conv -n 1 -c 3 -H 192 -W 192 -k 16 -y 3 -x 3 -p 1 -q 1 -u 2 -v 2 -l 1 -j 1 -m conv -g 1 -F 1 -t 1
MIOpen(HIP): Command [LogCmdFindConvolution] ./bin/MIOpenDriver conv -n 1 -c 16 -H 96 -W 96 -k 16 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -m conv -g 16 -F 1 -t 1
MIOpen(HIP): Command [LogCmdFusion] ./bin/MIOpenDriver CBAInfer -F 4 -n 1 -c 16 -H 96 -W 96 -k 16 -y 3 -x 3 -p 1 -q 1 -u 1 -v 1 -l 1 -j 1 -g 16 -S 0
Memory access fault by GPU node-1 (Agent handle: 0x5bad78b80150) on address 0x7f69b5200000. Reason: Page not present or supervisor privilege.
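
The logging run above can be scripted so the MIOpen command trace is captured rather than scrolling by in a terminal. A minimal sketch, assuming a local obs binary on PATH (the binary name and --verbose flag are assumptions about the setup, not part of the plugin):

```python
# Launch a command with MIOpen command logging enabled and capture stderr,
# which is where MIOpen prints its "Command [...]" lines.
# Sketch only: the "obs" binary name and --verbose flag are assumptions.
import os
import subprocess

def run_with_miopen_log(cmd=("obs", "--verbose")):
    # Copy the current environment and switch on MIOpen command logging.
    env = dict(os.environ, MIOPEN_ENABLE_LOGGING_CMD="1")
    proc = subprocess.run(cmd, env=env, stderr=subprocess.PIPE, text=True)
    return proc.stderr
```

Filtering the captured stderr for `MIOpen(HIP)` isolates the driver commands; the last one printed before the memory-access fault points at the operation that was in flight when the crash happened.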

@umireon (Contributor, Author) commented Feb 10, 2024

@royshil How should we implement ROCm support?

@royshil (Owner) commented Feb 10, 2024

I think we need to research the ROCm execution provider outside of the plugin to see why it behaves this way with these particular models and not with others. Unfortunately I can't do it on my machines, since I don't have AMD GPUs.

The other option would be to switch away from ONNX Runtime to a different neural-net framework that has more seamless support for the various accelerator vendors (Nvidia, AMD, Intel, DirectX, etc.).
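
Researching the execution provider outside the plugin could start with a small standalone script. A sketch, assuming an onnxruntime build with ROCm support and a local copy of one of the plugin's models (the model path below is a placeholder, and all inputs are assumed to be float32):

```python
# Standalone probe of the ONNX Runtime ROCm execution provider, outside OBS.
# Assumes a ROCm-enabled onnxruntime build; the model path is a placeholder.
def pick_providers(available):
    # Prefer ROCm when the build exposes it, otherwise fall back to CPU.
    order = ["ROCMExecutionProvider", "CPUExecutionProvider"]
    return [p for p in order if p in available]

def probe(model_path="rvm_mobilenetv3_fp32.onnx"):
    import numpy as np
    import onnxruntime as ort
    providers = pick_providers(ort.get_available_providers())
    sess = ort.InferenceSession(model_path, providers=providers)
    feeds = {}
    for inp in sess.get_inputs():
        # Replace symbolic dimensions with 1 so dummy tensors can be fed.
        shape = [d if isinstance(d, int) else 1 for d in inp.shape]
        feeds[inp.name] = np.zeros(shape, dtype=np.float32)
    # A "Memory access fault by GPU node-1" here would reproduce the
    # crash outside of OBS, per model, without the plugin involved.
    return sess.run(None, feeds)
```

Running `probe()` against each of the plugin's models in turn would show whether the fault tracks the model (as the reports above suggest) rather than the plugin code.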

@fritz-fritz (Contributor) commented:

On a related note: when using CPU inference, I have noticed considerable CPU consumption, encoder overload, and FPS drops with particular models. It seems specific to the better models, such as PPHumanSeg or Robust Video Matting. Using a lesser-quality model such as MediaPipe or Selfie Segmentation works without noticeable impact on CPU, encoder, or FPS.

I have not tested the ROCm branch here (I have an Nvidia dGPU and an AMD iGPU, so I can theoretically test both ROCm and CUDA). But as there was discussion around models here, I thought I'd chime in with some anecdotal experience.

@danir-de commented

For me, the branch umireon/rocm compiles just fine.
But when enabling the filter and setting it to TensorRT with my RX 7900 XTX, OBS crashes:

info: [obs-backgroundremoval] Background Removal Filter Options:
info: [obs-backgroundremoval]   Source: Background Removal
info: [obs-backgroundremoval]   Model: models/rvm_mobilenetv3_fp32.onnx
info: [obs-backgroundremoval]   Inference Device: tensorrt
info: [obs-backgroundremoval]   Num Threads: 0
info: [obs-backgroundremoval]   Enable Threshold: true
info: [obs-backgroundremoval]   Threshold: 0.300000
info: [obs-backgroundremoval]   Contour Filter: 0.050000
info: [obs-backgroundremoval]   Smooth Contour: 0.500000
info: [obs-backgroundremoval]   Feather: 0.550000
info: [obs-backgroundremoval]   Mask Every X Frames: 1
info: [obs-backgroundremoval]   Enable Image Similarity: false
info: [obs-backgroundremoval]   Image Similarity Threshold: 30.000000
info: [obs-backgroundremoval]   Blur Background: 20
info: [obs-backgroundremoval]   Enable Focal Blur: true
info: [obs-backgroundremoval]   Blur Focus Point: 0.250000
info: [obs-backgroundremoval]   Blur Focus Depth: 0.240000
info: [obs-backgroundremoval]   Disabled: true
info: [obs-backgroundremoval]   Model file path: /usr/share/obs/obs-plugins/obs-backgroundremoval/models/rvm_mobilenetv3_fp32.onnx
Memory access fault by GPU node-1 (Agent handle: 0x5691c0e491e0) on address 0x18800020000. Reason: Page not present or supervisor privilege.
[1]    100486 IOT instruction (core dumped)  obs

@l33tlinuxh4x0r commented Sep 8, 2024

I would love to test this; however, I can't seem to build it on Gentoo. Any help would be appreciated. Or maybe a Flatpak release?

I got past a couple of errors; it looks like I needed to install the non-Flatpak version of OBS, onnx, MIOpen, and onnxruntime. I will update this post with further developments.

If I get this all compiled, I will post a detailed guide.

@danir-de commented Dec 27, 2024

The latest version of the umireon/rocm branch, on OBS 31.0.0 under Linux 6.12.6 with ROCm 6.2.4, now crashes when setting the Inference device to GPU - TensorRT:

info: [obs-backgroundremoval] Background filter updated
obs: symbol lookup error: /usr/lib/obs-plugins/obs-backgroundremoval.so: undefined symbol: OrtSessionOptionsAppendExecutionProvider_ROCM

It was built following these build instructions (with -DENABLE_QT=OFF).

Setting the Inference device to GPU - CUDA doesn't seem to have any effect; it shows the same CPU load as the CPU setting.

@danir-de commented

The issue was on my side: my onnxruntime was misconfigured for ROCm. It works perfectly now!

As I see it, this needs a new option for ROCm, for example GPU - ROCm, before it can be merged, right?
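
For anyone hitting the same misconfiguration: a quick way to check whether the installed onnxruntime was actually built with ROCm support. This is a diagnostic sketch, not part of the plugin:

```python
# Report whether the installed onnxruntime exposes the ROCm execution
# provider; a build "misconfigured for ROCm" shows up here as a missing
# entry in the provider list.
def has_rocm(providers):
    return "ROCMExecutionProvider" in providers

def report():
    import onnxruntime as ort
    providers = ort.get_available_providers()
    print("onnxruntime", ort.__version__, "providers:", providers)
    print("ROCm available:", has_rocm(providers))
```

If `report()` shows no ROCMExecutionProvider, the plugin's undefined-symbol or silent-CPU-fallback behavior is explained by the onnxruntime build rather than the plugin itself.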
