Request Description
When running benchmark_app, certain activation layers such as HardSwish or ReLU are frequently reported with Status.NOT_RUN in the detailed counters report. We believe this is due to layer fusion, where these operations are merged into adjacent compute-heavy layers (e.g., Convolution), so they are no longer profiled individually.
We would like to request a feature or flag that allows us to disable such fusion behavior during inference, especially for profiling and analysis use cases. This would allow more fine-grained visibility into the latency and execution of lightweight layers that are currently hidden due to fusion.
Here is an excerpt from benchmark_detailed_counters_report.csv, where some layers like HardSwish and Relu are marked as Status.NOT_RUN, likely due to fusion:
layerName;execStatus;layerType;execType;realTime (ms);cpuTime (ms)
images;Status.NOT_RUN;Parameter;unknown_I8;0.000;0.000
Convert_904;Status.EXECUTED;Subgraph;jit_avx512_I8;0.133;0.133
Convert_904_abcd_acdb_/backbone/conv_first/block/conv/Conv/WithoutBiases;Status.EXECUTED;Reorder;jit_uni_f32;0.127;0.127
/backbone/conv_first/block/conv/Conv/WithoutBiases;Status.EXECUTED;Convolution;brgconv_avx512_f32;0.128;0.128
/backbone/conv_first/block/act/HardSwish;Status.NOT_RUN;HSwish;undef;0.000;0.000
/backbone/stages.0/stages.0.0/block/block.0/block/conv/Conv/WithoutBiases;Status.EXECUTED;GroupConvolution;jit_avx512_dw_f32;0.024;0.024
/backbone/stages.0/stages.0.0/block/block.0/block/act/Relu;Status.NOT_RUN;Relu;undef;0.000;0.000
/backbone/stages.0/stages.0.0/block/block.1/avgpool/GlobalAveragePool;Status.EXECUTED;ReduceMean;jit_avx512_f32;0.011;0.011
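To show how we currently spot these layers, here is a minimal sketch that parses the report with Python's standard csv module and lists the entries marked Status.NOT_RUN. It assumes the file is semicolon-delimited with exactly the columns shown in the excerpt; a full report may contain extra rows that need filtering. The helper names (load_counters, not_run_layers, executed_time_ms) are our own, not part of OpenVINO, and skipping Parameter rows is our assumption that model inputs are NOT_RUN for reasons unrelated to fusion.

```python
import csv
import io

# Rows copied from the excerpt above. A real run would read
# benchmark_detailed_counters_report.csv from disk instead.
REPORT = """\
layerName;execStatus;layerType;execType;realTime (ms);cpuTime (ms)
images;Status.NOT_RUN;Parameter;unknown_I8;0.000;0.000
Convert_904;Status.EXECUTED;Subgraph;jit_avx512_I8;0.133;0.133
Convert_904_abcd_acdb_/backbone/conv_first/block/conv/Conv/WithoutBiases;Status.EXECUTED;Reorder;jit_uni_f32;0.127;0.127
/backbone/conv_first/block/conv/Conv/WithoutBiases;Status.EXECUTED;Convolution;brgconv_avx512_f32;0.128;0.128
/backbone/conv_first/block/act/HardSwish;Status.NOT_RUN;HSwish;undef;0.000;0.000
/backbone/stages.0/stages.0.0/block/block.0/block/conv/Conv/WithoutBiases;Status.EXECUTED;GroupConvolution;jit_avx512_dw_f32;0.024;0.024
/backbone/stages.0/stages.0.0/block/block.0/block/act/Relu;Status.NOT_RUN;Relu;undef;0.000;0.000
/backbone/stages.0/stages.0.0/block/block.1/avgpool/GlobalAveragePool;Status.EXECUTED;ReduceMean;jit_avx512_f32;0.011;0.011
"""


def load_counters(text):
    """Parse the semicolon-delimited counters report into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


def not_run_layers(rows, skip_types=("Parameter",)):
    """Return (layerName, layerType) for rows reported as Status.NOT_RUN.

    Parameter rows are skipped on the assumption that model inputs are
    not candidates for fusion.
    """
    return [
        (row["layerName"], row["layerType"])
        for row in rows
        if row["execStatus"] == "Status.NOT_RUN"
        and row["layerType"] not in skip_types
    ]


def executed_time_ms(rows):
    """Sum realTime (ms) over rows reported as Status.EXECUTED."""
    return sum(
        float(row["realTime (ms)"])
        for row in rows
        if row["execStatus"] == "Status.EXECUTED"
    )


rows = load_counters(REPORT)
for name, layer_type in not_run_layers(rows):
    print(f"{layer_type:8s} {name}")
print(f"executed total: {executed_time_ms(rows):.3f} ms")
```

On the excerpt this lists the HSwish and Relu rows, but it cannot say how much of the neighbouring Convolution time belongs to them, which is the gap this request is about.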
Feature Use Case
This feature would be especially useful for model developers and performance engineers who need detailed per-layer latency reporting. For example, when analyzing quantized or optimized models, it's important to understand the cost of each layer, including non-linearities like HardSwish, which are currently absorbed into fused ops.
Issue submission checklist
- The feature request or improvement must be related to OpenVINO