[Detector Support]: OpenVINO Intel CPU faster than iGPU #20729
Replies: 2 comments
Your observations are valid, and there are several factors at play:

### 1) CPU faster than iGPU - this can be expected

For the lightweight SSDLite MobileNet v2 model at 300×300 resolution, Intel CPUs can indeed outperform iGPUs in single-stream scenarios (1)(2). The documentation notes that OpenVINO is supported on 6th Gen Intel platforms (Skylake) and newer, and will run on both CPUs and GPUs (1)(2). Your observation that the iGPU might scale better with multiple streams is correct - GPUs typically show their advantage when handling parallel workloads across multiple cameras (3).

### 2) Detector inference speed vs. benchmark_app discrepancy

The "Detector CPU Usage" metric tracks the CPU usage required to preprocess model inputs before they run on the iGPU - these preprocessing operations cannot run on the GPU itself (4). This preprocessing overhead explains part of the latency difference you're seeing between Frigate's reported inference speed (8 ms) and benchmark_app (4.3 ms). The most important metric is the inference time itself, not the preprocessing overhead. Your 8 ms inference time is still respectable and indicates the iGPU is being used (4).

### Configuration recommendations

Your configuration looks correct. The OpenVINO detector is properly configured with:

```yaml
detectors:
  ov:
    type: openvino
    device: GPU

model:
  width: 300
  height: 300
  input_tensor: nhwc
  input_pixel_format: bgr
  path: /openvino-model/ssdlite_mobilenet_v2.xml
  labelmap_path: /openvino-model/coco_91cl_bkgr.txt
```

When using many cameras, you can define multiple detectors, assuming GPU resources are available (1)(2):

```yaml
detectors:
  ov_0:
    type: openvino
    device: GPU
  ov_1:
    type: openvino
    device: GPU
```

### Verification

If you want to confirm the GPU is actually being used (not just the CPU), note that there is a known bug with Intel's GPU monitoring tools where GPU usage may not display correctly, even when the GPU is actively being used (4).

Your current setup is working as designed. The CPU detector shows better single-stream performance for this lightweight model, but the GPU configuration will scale better as you add more cameras.
Have another question? Just tag @inkeep.
What hardware are you running on?
Describe the problem you are having
I have some interesting observations. I'd appreciate it if you could help check whether my configuration can be optimized. Thanks.
1) Intel CPU is 2-3x faster than iGPU with OpenVINO and mobilenet_v2

With OpenVINO and mobilenet_v2, Intel CPU latency is 3 ms and Intel iGPU latency is 8 ms (System Metrics -> Detector Inference Speed). Is this expected? Is there anything wrong with my detector configuration?

2) Latency reported on System Metrics -> Detector Inference Speed for the iGPU is 2x higher than benchmark_app

"Detector Inference Speed" is 8 ms, but running benchmark_app on a sample image from the same camera gives an average latency of 4.3 ms. benchmark_app for the Intel CPU is 2.9 ms, similar to the corresponding latency shown as Detector Inference Speed. Is there any item in my configuration that can be optimized so that the actual detector also gets the 4.3 ms response time with the Intel iGPU?
Note: the only configuration change between CPU and iGPU is the following:
Version
0.16.2-4d58206
Frigate config file
docker-compose file or Docker CLI command
Relevant Frigate log output
No errors in log
Install method
Docker CLI
Object Detector
OpenVino
Screenshots of the Frigate UI's System metrics pages
Any other information that may be helpful
I used the following to get the benchmark_app latency.
```shell
wget -O test_frame.jpg "http://<frigate_ip>:5000/api/yarddoor/latest.jpg"
```
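The benchmark_app invocation itself isn't shown above. For reference, a typical OpenVINO single-stream latency run against this model would look roughly like the following sketch; the model path and device name are taken from the Frigate config in this thread, and the exact flags used by the original poster are not known:

```shell
# Hypothetical benchmark_app invocation (not from the original post):
# -m: model IR, path taken from the Frigate config in this thread
# -d: target device (GPU here; use CPU to compare against the 2.9 ms figure)
# -i: the frame fetched with wget above
# -api sync / -hint latency: measure single-stream latency rather than throughput
benchmark_app -m /openvino-model/ssdlite_mobilenet_v2.xml \
  -d GPU -i test_frame.jpg -api sync -hint latency
```

Note that benchmark_app reports pure inference latency on a pre-decoded input, while Frigate's Detector Inference Speed includes its own input preprocessing, so some gap between the two numbers is expected.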