@niwinanto niwinanto commented Mar 26, 2025

Coalescing the widening copy in the inner loop has a positive impact when there is no inline spilling. When spilling does occur, however, it adversely affects the stack size.

| Core_StackSize | mllib_Innerloop_with_coalescing_peano | mllib_Innerloop_without_coalescing_peano | Total diff |
|---|---|---|---|
| Conv2D_bfp16_9 | | 896 | |
| ElemDiv_aie2_1 | | | |
| GEMM_Bfp16_opt_5 | | | |
| TanhTemplated_aie2_bfloat16_0 | | | |
| TanhTemplated_aie2_bfloat16_1 | | | |
| AddAttributeBroadcasting_aie2_bf16 | 704 | 704 | SAME(+0.00%) |
| AddBf16_aie2_0 | 640 | 640 | SAME(+0.00%) |
| AddBf16_aie2_1 | 640 | 640 | SAME(+0.00%) |
| AddBf16_aie2_2 | 640 | 640 | SAME(+0.00%) |
| AvgPool2D_bfloat16_0 | 448 | 448 | SAME(+0.00%) |
| AvgPool2D_bfloat16_1 | 448 | 448 | SAME(+0.00%) |
| AvgPool2D_bfloat16_2 | 448 | 448 | SAME(+0.00%) |
| AvgPool2D_bfloat16_3 | 448 | 448 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_0 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_1 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_2 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_3 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_4 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_5 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_6 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_7 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_8 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_9 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_10 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_11 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_12 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_13 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_14 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_15 | 832 | 832 | SAME(+0.00%) |
| AvgPool2dVariant_bf16_16 | 832 | 832 | SAME(+0.00%) |
| Clip_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| CompareOpsAttributeBroadcasting_bf16 | 576 | 576 | SAME(+0.00%) |
| CompareOpsAttributeBroadcasting_bf16_1 | 576 | 576 | SAME(+0.00%) |
| Conv2D_DW_bf16_0 | 320 | 320 | SAME(+0.00%) |
| Conv2D_DW_bf16_1 | 320 | 320 | SAME(+0.00%) |
| Conv2D_DW_bf16_2 | 320 | 320 | SAME(+0.00%) |
| Conv2D_DW_bf16_3 | 320 | 320 | SAME(+0.00%) |
| Conv2D_DW_bf16_4 | 320 | 320 | SAME(+0.00%) |
| Conv2D_bfp16_0 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_1 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_2 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_3 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_4 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_5 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_6 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_7 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_8 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_10 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_11 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_12 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_13 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_14 | 896 | 896 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_0 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_1 | 1280 | 1280 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_2 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_3 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_4 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_5 | 1344 | 1344 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_6 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_7 | 1280 | 1280 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_8 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_9 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_OC8_10 | 512 | 512 | SAME(+0.00%) |
| Conv2D_bfp16_PSUM_FLOAT_0 | 1152 | 1152 | SAME(+0.00%) |
| Conv2D_bfp16_PSUM_FLOAT_1 | 1152 | 1152 | SAME(+0.00%) |
| Conv2D_bfp16_PSUM_FLOAT_2 | 1152 | 1152 | SAME(+0.00%) |
| GEMM_Bfp16_opt_0 | 768 | 768 | SAME(+0.00%) |
| GEMM_Bfp16_opt_1 | 768 | 768 | SAME(+0.00%) |
| GEMM_Bfp16_opt_2 | 704 | 704 | SAME(+0.00%) |
| GEMM_Bfp16_opt_3 | 768 | 768 | SAME(+0.00%) |
| GEMM_Bfp16_opt_4 | 768 | 768 | SAME(+0.00%) |
| GeluTemplated_aie2_bf16 | 448 | 448 | SAME(+0.00%) |
| Hardswish_aie2_1 | 448 | 448 | SAME(+0.00%) |
| MaxPool2D_bf16_0 | 320 | 320 | SAME(+0.00%) |
| MaxPool2D_bf16_1 | 320 | 320 | SAME(+0.00%) |
| MaxPool2D_bf16_2 | 320 | 320 | SAME(+0.00%) |
| MaxPool2D_bf16_3 | 320 | 320 | SAME(+0.00%) |
| MaxPool2D_bf16_4 | 320 | 320 | SAME(+0.00%) |
| MulAttributeBroadcasting_aie2_bf16_0 | 704 | 704 | SAME(+0.00%) |
| MulBf16_aie2_0 | 512 | 512 | SAME(+0.00%) |
| Neg_aie2_1 | 512 | 512 | SAME(+0.00%) |
| Pad2D_bf16_0 | 192 | 192 | SAME(+0.00%) |
| ReduceMaxAxis_1_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| ReduceMaxAxis_2_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| ReduceMaxAxis_3_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| ReduceMaxAxis_4_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| ReduceMaxAxis_5_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| ReduceMaxAxis_6_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| ReduceMaxAxis_7_aie2_bf16 | 384 | 384 | SAME(+0.00%) |
| ReduceSumAxis_1_aie2_bf16 | 640 | 640 | SAME(+0.00%) |
| ReduceSumAxis_2_aie2_bf16 | 640 | 640 | SAME(+0.00%) |
| ReduceSumAxis_3_aie2_bf16 | 640 | 640 | SAME(+0.00%) |
| ReduceSumAxis_4_aie2_bf16 | 640 | 640 | SAME(+0.00%) |
| ReduceSumAxis_5_aie2_bf16 | 640 | 640 | SAME(+0.00%) |
| ReduceSumAxis_6_aie2_bf16 | 640 | 640 | SAME(+0.00%) |
| ReduceSumAxis_7_aie2_bf16 | 640 | 640 | SAME(+0.00%) |
| SigmoidTemplated_bf16_0 | 448 | 448 | SAME(+0.00%) |
| SigmoidTemplated_bf16_1 | 448 | 448 | SAME(+0.00%) |
| SigmoidTemplated_bf16_1_AIE2p | 448 | 448 | SAME(+0.00%) |
| Sigmoidmode1Templated_bf16_0 | 576 | 576 | SAME(+0.00%) |
| Sqrt_bf16_0 | 640 | 640 | SAME(+0.00%) |
| Sqrt_bf16_1 | 640 | 640 | SAME(+0.00%) |
| Sub_aie2_bf16_0 | 640 | 640 | SAME(+0.00%) |
| TanhTemplatedmode1_bfloat16 | 640 | 640 | SAME(+0.00%) |
| SiLU_aie2_bf16 | 832 | 704 | IMPR(-15.38%) |
| ReduceMeanAxis_1_aie2_bf16 | 640 | 512 | IMPR(-20.00%) |
| ReduceMeanAxis_2_aie2_bf16 | 640 | 512 | IMPR(-20.00%) |
| ReduceMeanAxis_3_aie2_bf16 | 640 | 512 | IMPR(-20.00%) |
| ReduceMeanAxis_4_aie2_bf16 | 640 | 512 | IMPR(-20.00%) |
| ReduceMeanAxis_5_aie2_bf16 | 640 | 512 | IMPR(-20.00%) |
| ReduceMeanAxis_6_aie2_bf16 | 640 | 512 | IMPR(-20.00%) |
| ReduceMeanAxis_7_aie2_bf16 | 640 | 512 | IMPR(-20.00%) |
| Sin_aie2_bf16 | 2240 | 1472 | IMPR(-34.29%) |
| Average diff | +0.00% | -1.76% | -1.76% |
| Diff stdev | 0.00 | 6.02 | 6.02 |
| Quantile #1 | +0.00% | +0.00% | +0.00% |
| Quantile #2 | +0.00% | +0.00% | +0.00% |
| Quantile #3 | +0.00% | +0.00% | +0.00% |
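
For readers who want to check the numbers, the Total diff, Average diff, and Diff stdev entries of the Core_StackSize table can be reproduced with a short Python sketch. The formulas are assumptions inferred from the reported values, not taken from the reporting tool: the relative change is taken as (without - with) / with, and the spread as the sample standard deviation over the 108 benchmarks that have a value in both rows. The helper name `rel_diff` is hypothetical.

```python
from statistics import mean, stdev


def rel_diff(baseline: int, candidate: int) -> float:
    """Relative change of candidate vs. baseline, in percent (assumed formula)."""
    return (candidate - baseline) / baseline * 100.0


# (with coalescing, without coalescing) stack sizes copied from the table.
changed = {
    "SiLU_aie2_bf16": (832, 704),
    "Sin_aie2_bf16": (2240, 1472),
}
changed.update(
    {f"ReduceMeanAxis_{i}_aie2_bf16": (640, 512) for i in range(1, 8)}
)

# Benchmarks reported as SAME(+0.00%) contribute a diff of zero.
UNCHANGED = 99

diffs = [rel_diff(w, wo) for w, wo in changed.values()] + [0.0] * UNCHANGED

for name, (w, wo) in changed.items():
    print(f"{name}: {rel_diff(w, wo):+.2f}%")
print(f"Average diff: {mean(diffs):+.2f}%")
print(f"Diff stdev:   {stdev(diffs):.2f}")
```

Under these assumptions the sketch gives -15.38%, -20.00%, and -34.29% for the changed kernels, and -1.76% and 6.02 for the summary entries, which matches the table.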

|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Core_Compute_Insn_Count                  | Conv2D_bfp16_9 | ElemDiv_aie2_1 | GEMM_Bfp16_opt_5 | TanhTemplated_aie2_bfloat16_0 | TanhTemplated_aie2_bfloat16_1 | Sqrt_bf16_1  | GeluTemplated_aie2_bf16 | Sqrt_bf16_0  | AddAttributeBroadcasting_aie2_bf16 | AddBf16_aie2_0 | AddBf16_aie2_1 | AddBf16_aie2_2 | AvgPool2D_bfloat16_0 | AvgPool2D_bfloat16_1 | AvgPool2D_bfloat16_2 | AvgPool2D_bfloat16_3 | AvgPool2dVariant_bf16_0 | AvgPool2dVariant_bf16_1 | AvgPool2dVariant_bf16_2 | AvgPool2dVariant_bf16_3 | AvgPool2dVariant_bf16_4 | AvgPool2dVariant_bf16_5 | AvgPool2dVariant_bf16_6 | AvgPool2dVariant_bf16_7 | AvgPool2dVariant_bf16_8 | AvgPool2dVariant_bf16_9 | AvgPool2dVariant_bf16_10 | AvgPool2dVariant_bf16_11 | AvgPool2dVariant_bf16_12 | AvgPool2dVariant_bf16_13 | AvgPool2dVariant_bf16_14 | AvgPool2dVariant_bf16_15 | AvgPool2dVariant_bf16_16 | Clip_aie2_bf16 | CompareOpsAttributeBroadcasting_bf16 | CompareOpsAttributeBroadcasting_bf16_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_DW_bf16_2 | Conv2D_DW_bf16_3 | Conv2D_DW_bf16_4 | Conv2D_bfp16_0 | Conv2D_bfp16_1 | Conv2D_bfp16_2 | Conv2D_bfp16_3 | Conv2D_bfp16_4 | Conv2D_bfp16_5 | Conv2D_bfp16_6 | Conv2D_bfp16_7 | Conv2D_bfp16_8 | Conv2D_bfp16_10 | Conv2D_bfp16_11 | Conv2D_bfp16_12 | Conv2D_bfp16_13 | Conv2D_bfp16_14 | Conv2D_bfp16_OC8_0 | Conv2D_bfp16_OC8_1 | Conv2D_bfp16_OC8_2 | Conv2D_bfp16_OC8_3 | Conv2D_bfp16_OC8_4 | Conv2D_bfp16_OC8_5 | Conv2D_bfp16_OC8_6 | Conv2D_bfp16_OC8_7 | Conv2D_bfp16_OC8_8 | Conv2D_bfp16_OC8_9 | Conv2D_bfp16_OC8_10 | Conv2D_bfp16_PSUM_FLOAT_0 | Conv2D_bfp16_PSUM_FLOAT_1 | Conv2D_bfp16_PSUM_FLOAT_2 | GEMM_Bfp16_opt_0 | GEMM_Bfp16_opt_1 | GEMM_Bfp16_opt_2 | GEMM_Bfp16_opt_3 | GEMM_Bfp16_opt_4 | Hardswish_aie2_1 | MaxPool2D_bf16_0 | MaxPool2D_bf16_1 | MaxPool2D_bf16_2 | MaxPool2D_bf16_3 | MaxPool2D_bf16_4 | MulAttributeBroadcasting_aie2_bf16_0 | MulBf16_aie2_0 | Neg_aie2_1   | Pad2D_bf16_0 | ReduceMaxAxis_1_aie2_bf16 | ReduceMaxAxis_2_aie2_bf16 | ReduceMaxAxis_3_aie2_bf16 | 
ReduceMaxAxis_4_aie2_bf16 | ReduceMaxAxis_5_aie2_bf16 | ReduceMaxAxis_6_aie2_bf16 | ReduceMaxAxis_7_aie2_bf16 | ReduceMeanAxis_7_aie2_bf16 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_7_aie2_bf16 | SiLU_aie2_bf16 | Sigmoidmode1Templated_bf16_0 | Sub_aie2_bf16_0 | TanhTemplatedmode1_bfloat16 | SigmoidTemplated_bf16_0 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_6_aie2_bf16 | SigmoidTemplated_bf16_1 | SigmoidTemplated_bf16_1_AIE2p | ReduceMeanAxis_5_aie2_bf16 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_1_aie2_bf16 | Sin_aie2_bf16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_with_coalescing_peano    |                |                |                  |                               |                               |         1791 |                    3363 |        13775 |                                456 |            668 |            795 |            795 |                 1118 |                  822 |                  532 |                  532 |                    3108 |                    1820 |                    3888 |                     759 |                    1219 |                    2440 |                    2746 |                    2251 |                    3418 |                    2744 |                     2972 |                     3223 |                     2735 |                     2914 |                     3275 |                     4664 |                     1042 |            146 |                                 1236 |                                   1242 |             1036 |             4424 |             1884 |             1036 |             1012 |          10459 |          21611 |          10459 |          11700 |          10703 |          40909 |           5581 |           2049 |           1466 |           35809 |           35809 |            7586 |           20616 |           14400 |               7222 |              16200 |               4835 |               9898 |               4718 |              29452 |              12910 |              14668 |              23652 |               5232 |               30046 |                      6573 |                      4901 |                      4973 |             1282 |             4084 |             3479 |             4084 |             4084 |              862 |             1695 |             1191 |              694 |              694 |              694 |                                  688 |            231 |          132 |         2357 |                      8572 |                      8588 |                      3381 |                      
8608 |                      3381 |                      3362 |                      2601 |                       9037 |                     16528 |                     16544 |                      6523 |                     16564 |                      6523 |                      6504 |                      5100 |           1170 |                         6568 |             658 |                        7338 |                     969 |                      10165 |                      10146 |                     521 |                           521 |                      11322 |                      18544 |                      27924 |                      27916 |          1717 | +0.00%       |       0.00 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_without_coalescing_peano |          35989 |                |                  |                               |                               |         1793 |                    3364 |        13777 |                                456 |            668 |            795 |            795 |                 1118 |                  822 |                  532 |                  532 |                    3108 |                    1820 |                    3888 |                     759 |                    1219 |                    2440 |                    2746 |                    2251 |                    3418 |                    2744 |                     2972 |                     3223 |                     2735 |                     2914 |                     3275 |                     4664 |                     1042 |            146 |                                 1236 |                                   1242 |             1036 |             4424 |             1884 |             1036 |             1012 |          10459 |          21611 |          10459 |          11700 |          10703 |          40909 |           5581 |           2049 |           1466 |           35809 |           35809 |            7586 |           20616 |           14400 |               7222 |              16200 |               4835 |               9898 |               4718 |              29452 |              12910 |              14668 |              23652 |               5232 |               30046 |                      6573 |                      4901 |                      4973 |             1282 |             4084 |             3479 |             4084 |             4084 |              862 |             1695 |             1191 |              694 |              694 |              694 |                                  688 |            231 |          132 |         2357 |                      8572 |                      8588 |                      3381 |                      
8608 |                      3381 |                      3362 |                      2601 |                       9037 |                     16528 |                     16544 |                      6523 |                     16564 |                      6523 |                      6504 |                      5100 |           1170 |                         6568 |             658 |                        7338 |                     968 |                      10149 |                      10130 |                     520 |                           520 |                      11290 |                      18416 |                      27668 |                      27660 |          1493 | -0.15%       |       1.26 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Total diff                               |                |                |                  |                               |                               | REGR(+0.11%) | SAME(+0.03%)            | SAME(+0.01%) | SAME(+0.00%)                       | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)   | SAME(+0.00%)                         | SAME(+0.00%)                           | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)        | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)                         | SAME(+0.00%)   | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)           
   | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)               | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)   | SAME(+0.00%)                 | SAME(+0.00%)    | SAME(+0.00%)                | IMPR(-0.10%)            | IMPR(-0.16%)               | IMPR(-0.16%)               | IMPR(-0.19%)            | IMPR(-0.19%)                  | IMPR(-0.28%)               | IMPR(-0.69%)               | IMPR(-0.92%)               | IMPR(-0.92%)               | IMPR(-13.05%) | -0.15%       |       1.26 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
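
For reference, the `Total diff` row appears to be the per-column relative change of the `without_coalescing` row against the `with_coalescing` row. A minimal sketch of that arithmetic follows. The function names and the 0.05% classification threshold are assumptions for illustration, not taken from the benchmarking tool. The table only shows that +0.03% is reported as SAME while +0.11% and -0.10% are reported as REGR and IMPR.

```python
def relative_diff_percent(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def classify(diff_percent: float, threshold: float = 0.05) -> str:
    """Label a diff; `threshold` is a hypothetical cutoff, not the tool's."""
    if diff_percent > threshold:
        return "REGR"
    if diff_percent < -threshold:
        return "IMPR"
    return "SAME"


# Values copied from the first and last populated benchmark columns above:
# (with_coalescing, without_coalescing)
for with_coalescing, without_coalescing in [(1791, 1793), (1717, 1493)]:
    diff = relative_diff_percent(with_coalescing, without_coalescing)
    print(f"{classify(diff)}({diff:+.2f}%)")
```

Under these assumptions the two printed entries match the `REGR(+0.11%)` and `IMPR(-13.05%)` cells in the `Total diff` row.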

|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| Core_PMSize                              | Conv2D_bfp16_9 | ElemDiv_aie2_1 | GEMM_Bfp16_opt_5 | TanhTemplated_aie2_bfloat16_0 | TanhTemplated_aie2_bfloat16_1 | SigmoidTemplated_bf16_0 | SigmoidTemplated_bf16_1 | SigmoidTemplated_bf16_1_AIE2p | Sqrt_bf16_1  | Sqrt_bf16_0  | AddAttributeBroadcasting_aie2_bf16 | AddBf16_aie2_0 | AddBf16_aie2_1 | AddBf16_aie2_2 | AvgPool2D_bfloat16_0 | AvgPool2D_bfloat16_1 | AvgPool2D_bfloat16_2 | AvgPool2D_bfloat16_3 | AvgPool2dVariant_bf16_0 | AvgPool2dVariant_bf16_1 | AvgPool2dVariant_bf16_2 | AvgPool2dVariant_bf16_3 | AvgPool2dVariant_bf16_4 | AvgPool2dVariant_bf16_5 | AvgPool2dVariant_bf16_6 | AvgPool2dVariant_bf16_7 | AvgPool2dVariant_bf16_8 | AvgPool2dVariant_bf16_9 | AvgPool2dVariant_bf16_10 | AvgPool2dVariant_bf16_11 | AvgPool2dVariant_bf16_12 | AvgPool2dVariant_bf16_13 | AvgPool2dVariant_bf16_14 | AvgPool2dVariant_bf16_15 | AvgPool2dVariant_bf16_16 | Clip_aie2_bf16 | CompareOpsAttributeBroadcasting_bf16 | CompareOpsAttributeBroadcasting_bf16_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_DW_bf16_2 | Conv2D_DW_bf16_3 | Conv2D_DW_bf16_4 | Conv2D_bfp16_0 | Conv2D_bfp16_1 | Conv2D_bfp16_2 | Conv2D_bfp16_3 | Conv2D_bfp16_4 | Conv2D_bfp16_5 | Conv2D_bfp16_6 | Conv2D_bfp16_7 | Conv2D_bfp16_8 | Conv2D_bfp16_10 | Conv2D_bfp16_11 | Conv2D_bfp16_12 | Conv2D_bfp16_13 | Conv2D_bfp16_14 | Conv2D_bfp16_OC8_0 | Conv2D_bfp16_OC8_1 | Conv2D_bfp16_OC8_2 | Conv2D_bfp16_OC8_3 | Conv2D_bfp16_OC8_4 | Conv2D_bfp16_OC8_5 | Conv2D_bfp16_OC8_6 | Conv2D_bfp16_OC8_7 | Conv2D_bfp16_OC8_8 | Conv2D_bfp16_OC8_9 | Conv2D_bfp16_OC8_10 | Conv2D_bfp16_PSUM_FLOAT_0 | Conv2D_bfp16_PSUM_FLOAT_1 | Conv2D_bfp16_PSUM_FLOAT_2 | GEMM_Bfp16_opt_0 | GEMM_Bfp16_opt_1 | GEMM_Bfp16_opt_2 | GEMM_Bfp16_opt_3 | GEMM_Bfp16_opt_4 | GeluTemplated_aie2_bf16 | Hardswish_aie2_1 | MaxPool2D_bf16_0 | MaxPool2D_bf16_1 | MaxPool2D_bf16_2 | MaxPool2D_bf16_3 | MaxPool2D_bf16_4 | MulAttributeBroadcasting_aie2_bf16_0 | MulBf16_aie2_0 | Neg_aie2_1   | Pad2D_bf16_0 | 
ReduceMaxAxis_1_aie2_bf16 | ReduceMaxAxis_2_aie2_bf16 | ReduceMaxAxis_3_aie2_bf16 | ReduceMaxAxis_4_aie2_bf16 | ReduceMaxAxis_5_aie2_bf16 | ReduceMaxAxis_6_aie2_bf16 | ReduceMaxAxis_7_aie2_bf16 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_7_aie2_bf16 | Sigmoidmode1Templated_bf16_0 | Sub_aie2_bf16_0 | TanhTemplatedmode1_bfloat16 | ReduceMeanAxis_1_aie2_bf16 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_5_aie2_bf16 | ReduceMeanAxis_6_aie2_bf16 | ReduceMeanAxis_7_aie2_bf16 | SiLU_aie2_bf16 | Sin_aie2_bf16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_with_coalescing_peano    |                |                |                  |                               |                               |                    2260 |                    2260 |                          2260 |         2436 |         2452 |                               2724 |           2820 |           2868 |           2868 |                 3556 |                 3556 |                 3556 |                 3556 |                    4196 |                    4212 |                    4196 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |           2116 |                                 3060 |                                   3060 |             3252 |             3252 |             3252 |             3252 |             3252 |           5764 |           5748 |           5764 |           5748 |           5748 |           5748 |           5828 |           5828 |           5716 |            5748 |            5748 |            5652 |            5748 |            5748 |               6756 |               7396 |               6852 |               6756 |               6756 |               7492 |               6756 |               7396 |               6836 |               6756 |                6740 |                      6116 |                      6116 |                      6116 |             5860 |             5860 |             5604 |             5860 |             5860 |                    2932 |             2340 |             2788 |             2788 |             2788 |             2788 |             2788 |                                 2612 |           2532 |         2100 |         3092 |                      
6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6164 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6820 |                         3444 |            2756 |                        3124 |                       6900 |                       6900 |                       6900 |                       6900 |                       6900 |                       6900 |                       6884 |           2324 |          2596 | +0.00%       |       0.00 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_without_coalescing_peano |           5812 |                |                  |                               |                               |                    2292 |                    2292 |                          2292 |         2452 |         2468 |                               2724 |           2820 |           2868 |           2868 |                 3556 |                 3556 |                 3556 |                 3556 |                    4196 |                    4212 |                    4196 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |           2116 |                                 3060 |                                   3060 |             3252 |             3252 |             3252 |             3252 |             3252 |           5764 |           5748 |           5764 |           5748 |           5748 |           5748 |           5828 |           5828 |           5716 |            5748 |            5748 |            5652 |            5748 |            5748 |               6756 |               7396 |               6852 |               6756 |               6756 |               7492 |               6756 |               7396 |               6836 |               6756 |                6740 |                      6116 |                      6116 |                      6116 |             5860 |             5860 |             5604 |             5860 |             5860 |                    2932 |             2340 |             2788 |             2788 |             2788 |             2788 |             2788 |                                 2612 |           2532 |         2100 |         3092 |                      
6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6164 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6820 |                         3444 |            2756 |                        3124 |                       6884 |                       6884 |                       6884 |                       6884 |                       6884 |                       6884 |                       6868 |           2308 |          2548 | +0.01%       |       0.32 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| Total diff                               |                |                |                  |                               |                               | REGR(+1.42%)            | REGR(+1.42%)            | REGR(+1.42%)                  | REGR(+0.66%) | REGR(+0.65%) | SAME(+0.00%)                       | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)   | SAME(+0.00%)                         | SAME(+0.00%)                           | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)        | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)            | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)                         | SAME(+0.00%)   | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%)           
   | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)                 | SAME(+0.00%)    | SAME(+0.00%)                | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.69%)   | IMPR(-1.85%)  | +0.01%       |       0.32 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|

|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------------
-----|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|-----------------|-------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|------------------------------|----------------------------|-----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Core_Compute_Cycle_Count                 | Conv2D_bfp16_9 | ElemDiv_aie2_1 | GEMM_Bfp16_opt_5 | TanhTemplated_aie2_bfloat16_0 | TanhTemplated_aie2_bfloat16_1 | Sqrt_bf16_1  | Sqrt_bf16_0  | AddAttributeBroadcasting_aie2_bf16 | AddBf16_aie2_0 | AddBf16_aie2_1 | AddBf16_aie2_2 | AvgPool2D_bfloat16_0 | AvgPool2D_bfloat16_1 | AvgPool2D_bfloat16_2 | AvgPool2D_bfloat16_3 | AvgPool2dVariant_bf16_0 | AvgPool2dVariant_bf16_1 | AvgPool2dVariant_bf16_2 | AvgPool2dVariant_bf16_3 | AvgPool2dVariant_bf16_4 | AvgPool2dVariant_bf16_5 | AvgPool2dVariant_bf16_6 | AvgPool2dVariant_bf16_7 | AvgPool2dVariant_bf16_8 | AvgPool2dVariant_bf16_9 | AvgPool2dVariant_bf16_10 | AvgPool2dVariant_bf16_11 | AvgPool2dVariant_bf16_12 | AvgPool2dVariant_bf16_13 | AvgPool2dVariant_bf16_14 | AvgPool2dVariant_bf16_15 | AvgPool2dVariant_bf16_16 | Clip_aie2_bf16 | CompareOpsAttributeBroadcasting_bf16 | CompareOpsAttributeBroadcasting_bf16_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_DW_bf16_2 | Conv2D_DW_bf16_3 | Conv2D_DW_bf16_4 | Conv2D_bfp16_0 | Conv2D_bfp16_1 | Conv2D_bfp16_2 | Conv2D_bfp16_3 | Conv2D_bfp16_4 | Conv2D_bfp16_5 | Conv2D_bfp16_6 | Conv2D_bfp16_7 | Conv2D_bfp16_8 | Conv2D_bfp16_10 | Conv2D_bfp16_11 | Conv2D_bfp16_12 | Conv2D_bfp16_13 | Conv2D_bfp16_14 | Conv2D_bfp16_OC8_0 | Conv2D_bfp16_OC8_1 | Conv2D_bfp16_OC8_2 | Conv2D_bfp16_OC8_3 | Conv2D_bfp16_OC8_4 | Conv2D_bfp16_OC8_5 | Conv2D_bfp16_OC8_6 | Conv2D_bfp16_OC8_7 | Conv2D_bfp16_OC8_8 | Conv2D_bfp16_OC8_9 | Conv2D_bfp16_OC8_10 | Conv2D_bfp16_PSUM_FLOAT_0 | Conv2D_bfp16_PSUM_FLOAT_1 | Conv2D_bfp16_PSUM_FLOAT_2 | GEMM_Bfp16_opt_0 | GEMM_Bfp16_opt_1 | GEMM_Bfp16_opt_2 | GEMM_Bfp16_opt_3 | GEMM_Bfp16_opt_4 | Hardswish_aie2_1 | MaxPool2D_bf16_0 | MaxPool2D_bf16_1 | MaxPool2D_bf16_2 | MaxPool2D_bf16_3 | MaxPool2D_bf16_4 | MulAttributeBroadcasting_aie2_bf16_0 | MulBf16_aie2_0 | Neg_aie2_1   | Pad2D_bf16_0 | ReduceMaxAxis_1_aie2_bf16 | ReduceMaxAxis_2_aie2_bf16 | ReduceMaxAxis_3_aie2_bf16 | ReduceMaxAxis_4_aie2_bf16 | 
ReduceMaxAxis_5_aie2_bf16 | ReduceMaxAxis_6_aie2_bf16 | ReduceMaxAxis_7_aie2_bf16 | ReduceMeanAxis_7_aie2_bf16 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_7_aie2_bf16 | SiLU_aie2_bf16 | Sub_aie2_bf16_0 | SigmoidTemplated_bf16_0 | GeluTemplated_aie2_bf16 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_6_aie2_bf16 | SigmoidTemplated_bf16_1 | SigmoidTemplated_bf16_1_AIE2p | Sigmoidmode1Templated_bf16_0 | ReduceMeanAxis_5_aie2_bf16 | TanhTemplatedmode1_bfloat16 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_1_aie2_bf16 | Sin_aie2_bf16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------------
-----|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|-----------------|-------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|------------------------------|----------------------------|-----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_with_coalescing_peano    |                |                |                  |                               |                               |         1791 |        13775 |                                517 |            668 |            795 |            795 |                 1118 |                  822 |                  532 |                  532 |                    3108 |                    1820 |                    3888 |                     759 |                    1219 |                    2440 |                    2746 |                    2251 |                    3418 |                    2744 |                     2972 |                     3223 |                     2735 |                     2914 |                     3275 |                     4664 |                     1042 |            146 |                                 1236 |                                   1242 |             1036 |             4424 |             1884 |             1036 |             1012 |          10459 |          21611 |          10459 |          11700 |          10703 |          40909 |           5581 |           2049 |           1466 |           35809 |           35809 |            7638 |           21035 |           14774 |               7222 |              16200 |               4835 |               9898 |               4718 |              29452 |              12910 |              14668 |              23652 |               5232 |               30046 |                      6573 |                      4901 |                      4973 |             1282 |             4084 |             3479 |             4084 |             4084 |              862 |             1695 |             1191 |              694 |              694 |              694 |                                  688 |            231 |          132 |         2357 |                      8629 |                      8646 |                      3443 |                      8665 |                      
3443 |                      3424 |                      2601 |                       9037 |                     16574 |                     16590 |                      6569 |                     16610 |                      6569 |                      6550 |                      5100 |           1170 |             658 |                     969 |                    3745 |                      10291 |                      10271 |                     521 |                           521 |                         7312 |                      11448 |                        7940 |                      18670 |                      28050 |                      28042 |          1717 | +0.00%       |       0.00 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------------
-----|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|-----------------|-------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|------------------------------|----------------------------|-----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_without_coalescing_peano |          35989 |                |                  |                               |                               |         1793 |        13777 |                                517 |            668 |            795 |            795 |                 1118 |                  822 |                  532 |                  532 |                    3108 |                    1820 |                    3888 |                     759 |                    1219 |                    2440 |                    2746 |                    2251 |                    3418 |                    2744 |                     2972 |                     3223 |                     2735 |                     2914 |                     3275 |                     4664 |                     1042 |            146 |                                 1236 |                                   1242 |             1036 |             4424 |             1884 |             1036 |             1012 |          10459 |          21611 |          10459 |          11700 |          10703 |          40909 |           5581 |           2049 |           1466 |           35809 |           35809 |            7638 |           21035 |           14774 |               7222 |              16200 |               4835 |               9898 |               4718 |              29452 |              12910 |              14668 |              23652 |               5232 |               30046 |                      6573 |                      4901 |                      4973 |             1282 |             4084 |             3479 |             4084 |             4084 |              862 |             1695 |             1191 |              694 |              694 |              694 |                                  688 |            231 |          132 |         2357 |                      8629 |                      8646 |                      3443 |                      8665 |                      
3443 |                      3424 |                      2601 |                       9037 |                     16574 |                     16590 |                      6569 |                     16610 |                      6569 |                      6550 |                      5100 |           1170 |             658 |                     968 |                    3741 |                      10275 |                      10255 |                     520 |                           520 |                         7296 |                      11416 |                        7892 |                      18542 |                      27794 |                      27786 |          1493 | -0.16%       |       1.26 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------------
-----|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|-----------------|-------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|------------------------------|----------------------------|-----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Total diff                               |                |                |                  |                               |                               | REGR(+0.11%) | SAME(+0.01%) | SAME(+0.00%)                       | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)   | SAME(+0.00%)                         | SAME(+0.00%)                           | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)        | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)                         | SAME(+0.00%)   | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)         
     | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)               | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)   | SAME(+0.00%)    | IMPR(-0.10%)            | IMPR(-0.11%)            | IMPR(-0.16%)               | IMPR(-0.16%)               | IMPR(-0.19%)            | IMPR(-0.19%)                  | IMPR(-0.22%)                 | IMPR(-0.28%)               | IMPR(-0.60%)                | IMPR(-0.69%)               | IMPR(-0.91%)               | IMPR(-0.91%)               | IMPR(-13.05%) | -0.16%       |       1.26 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------------
-----|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|-----------------|-------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|------------------------------|----------------------------|-----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|

|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|----------------------------|------------------|------------------|----------------|--------------------------|----------------------------|------------------|----------------|------------------|----------------|------------------|-------------------------|---------------------------|------------------------------------|----------------------------------------|----------------|------------------------------|-----------------|--------------------|------------------|----------------------------|------------------|-----------------|----------------------------|--------------------|---------------------------|----------------------------|--------------------|---------------------|------------------|-----------------------------|---------------|--------------------|---------------------------|-------------------------|---------------------------|----------------|---------------------------|------------------|----------------------------|----------------------|--------------------|-------------------------|-------------------------|-----------------|-------------------------|-------------------------|----------------|--------------------|------------------|-----------------|-------------------------|--------------------------------------|-------------------------|---------------------------|-------------------------|---------------------------|----------------|--------------------|-------------------------------|-------------------------|----------------|-----------------|--------------|-------------------------|--------------------|---------------------------|--------------------|---------------------------|---------------------------|-------------------------|-------------------------|--------------------------|--------------------------|----------------|--------------------|---------------|-----------------|---
------------------------|----------------|----------------|----------------|---------------------------|--------------------------------------|----------------------------|--------------|------------------|----------------|----------------|---------------------------|--------------------------|------------------|---------------|--------------------------|------------------|--------------------------|----------------------|------------------|---------------------------|---------------------------|--------------------------|------------------|------------------|----------------------|---------------------------|---------------------------|----------------------|----------------|--------------|------------|-------------|-------------|-------------|
| Core_Cycle_Count                         | Conv2D_bfp16_9 | ElemDiv_aie2_1 | GEMM_Bfp16_opt_5 | TanhTemplated_aie2_bfloat16_0 | TanhTemplated_aie2_bfloat16_1 | ReduceMeanAxis_3_aie2_bf16 | Conv2D_DW_bf16_0 | MaxPool2D_bf16_0 | MulBf16_aie2_0 | AvgPool2dVariant_bf16_12 | ReduceMeanAxis_5_aie2_bf16 | Conv2D_DW_bf16_1 | Clip_aie2_bf16 | GEMM_Bfp16_opt_2 | AddBf16_aie2_0 | Conv2D_DW_bf16_3 | AvgPool2dVariant_bf16_4 | ReduceMaxAxis_3_aie2_bf16 | AddAttributeBroadcasting_aie2_bf16 | CompareOpsAttributeBroadcasting_bf16_1 | SiLU_aie2_bf16 | Sigmoidmode1Templated_bf16_0 | Sub_aie2_bf16_0 | Conv2D_bfp16_OC8_3 | Conv2D_DW_bf16_2 | ReduceMeanAxis_6_aie2_bf16 | GEMM_Bfp16_opt_0 | Conv2D_bfp16_12 | ReduceMeanAxis_1_aie2_bf16 | Conv2D_bfp16_OC8_0 | ReduceSumAxis_7_aie2_bf16 | ReduceMeanAxis_4_aie2_bf16 | Conv2D_bfp16_OC8_7 | Conv2D_bfp16_OC8_10 | MaxPool2D_bf16_1 | TanhTemplatedmode1_bfloat16 | Pad2D_bf16_0  | Conv2D_bfp16_OC8_2 | ReduceSumAxis_2_aie2_bf16 | AvgPool2dVariant_bf16_9 | ReduceMaxAxis_4_aie2_bf16 | Conv2D_bfp16_6 | Conv2D_bfp16_PSUM_FLOAT_0 | MaxPool2D_bf16_4 | ReduceMeanAxis_2_aie2_bf16 | AvgPool2D_bfloat16_1 | Conv2D_bfp16_OC8_9 | AvgPool2dVariant_bf16_2 | AvgPool2dVariant_bf16_5 | Conv2D_bfp16_10 | GeluTemplated_aie2_bf16 | AvgPool2dVariant_bf16_8 | Conv2D_bfp16_4 | Conv2D_bfp16_OC8_4 | Hardswish_aie2_1 | Conv2D_bfp16_11 | AvgPool2dVariant_bf16_0 | CompareOpsAttributeBroadcasting_bf16 | SigmoidTemplated_bf16_0 | ReduceSumAxis_3_aie2_bf16 | AvgPool2dVariant_bf16_3 | ReduceMaxAxis_1_aie2_bf16 | AddBf16_aie2_1 | Conv2D_bfp16_OC8_6 | SigmoidTemplated_bf16_1_AIE2p | AvgPool2dVariant_bf16_7 | Conv2D_bfp16_3 | Conv2D_bfp16_13 | Neg_aie2_1   | SigmoidTemplated_bf16_1 | Conv2D_bfp16_OC8_5 | ReduceSumAxis_1_aie2_bf16 | Conv2D_bfp16_OC8_8 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_4_aie2_bf16 | AvgPool2dVariant_bf16_1 | AvgPool2dVariant_bf16_6 | AvgPool2dVariant_bf16_14 | AvgPool2dVariant_bf16_16 | Conv2D_bfp16_7 | Conv2D_bfp16_OC8_1 | Sin_aie2_bf16 | Conv2D_bfp16_14 | 
Conv2D_bfp16_PSUM_FLOAT_2 | Conv2D_bfp16_2 | Conv2D_bfp16_1 | Conv2D_bfp16_5 | Conv2D_bfp16_PSUM_FLOAT_1 | MulAttributeBroadcasting_aie2_bf16_0 | ReduceMeanAxis_7_aie2_bf16 | Sqrt_bf16_0  | GEMM_Bfp16_opt_4 | AddBf16_aie2_2 | Conv2D_bfp16_0 | ReduceMaxAxis_7_aie2_bf16 | AvgPool2dVariant_bf16_15 | MaxPool2D_bf16_3 | Sqrt_bf16_1   | AvgPool2dVariant_bf16_11 | GEMM_Bfp16_opt_1 | AvgPool2dVariant_bf16_10 | AvgPool2D_bfloat16_2 | Conv2D_DW_bf16_4 | ReduceSumAxis_5_aie2_bf16 | ReduceMaxAxis_2_aie2_bf16 | AvgPool2dVariant_bf16_13 | MaxPool2D_bf16_2 | GEMM_Bfp16_opt_3 | AvgPool2D_bfloat16_3 | ReduceMaxAxis_6_aie2_bf16 | ReduceMaxAxis_5_aie2_bf16 | AvgPool2D_bfloat16_0 | Conv2D_bfp16_8 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|----------------------------|------------------|------------------|----------------|--------------------------|----------------------------|------------------|----------------|------------------|----------------|------------------|-------------------------|---------------------------|------------------------------------|----------------------------------------|----------------|------------------------------|-----------------|--------------------|------------------|----------------------------|------------------|-----------------|----------------------------|--------------------|---------------------------|----------------------------|--------------------|---------------------|------------------|-----------------------------|---------------|--------------------|---------------------------|-------------------------|---------------------------|----------------|---------------------------|------------------|----------------------------|----------------------|--------------------|-------------------------|-------------------------|-----------------|-------------------------|-------------------------|----------------|--------------------|------------------|-----------------|-------------------------|--------------------------------------|-------------------------|---------------------------|-------------------------|---------------------------|----------------|--------------------|-------------------------------|-------------------------|----------------|-----------------|--------------|-------------------------|--------------------|---------------------------|--------------------|---------------------------|---------------------------|-------------------------|-------------------------|--------------------------|--------------------------|----------------|--------------------|---------------|-----------------|---
------------------------|----------------|----------------|----------------|---------------------------|--------------------------------------|----------------------------|--------------|------------------|----------------|----------------|---------------------------|--------------------------|------------------|---------------|--------------------------|------------------|--------------------------|----------------------|------------------|---------------------------|---------------------------|--------------------------|------------------|------------------|----------------------|---------------------------|---------------------------|----------------------|----------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_with_coalescing_peano    |                |                |                  |                               |                               |                      24961 |            10343 |            18001 |          22570 |                    19543 |                      32435 |            14594 |          20190 |            21972 |          24216 |             9426 |                   21194 |                     27347 |                              19982 |                                  24389 |          26000 |                        26055 |           20652 |              32611 |            13068 |                      33755 |            17029 |           28754 |                      50497 |              27098 |                     26186 |                      52439 |              46174 |               60820 |            23403 |                       26807 |         26596 |              26545 |                     38857 |                   25978 |                     30936 |          27112 |                     27867 |            20410 |                      40010 |                21327 |              33539 |                   64779 |                   30300 |           68174 |                   24085 |                   29975 |          33658 |              29740 |            27840 |           68496 |                   81568 |                                25342 |                   28600 |                     30590 |                   23696 |                     37006 |          23425 |              44007 |                         25337 |                   29110 |          29555 |           45452 |        22210 |                   20876 |              81776 |                     40625 |              78397 |                     33694 |                     41867 |                   44833 |                   31859 |                    24398 |                    26124 |          25284 |              47770 |         20061 |           46013 |   
                  24782 |          29894 |          54359 |          71401 |                     26296 |                                26206 |                      37205 |        40495 |            27561 |          25361 |          37313 |                     30760 |                    29090 |            21820 |         28790 |                    33347 |            27782 |                    30775 |                22187 |            12643 |                     34631 |                     38979 |                    29757 |            21405 |            31236 |                27532 |                     31470 |                     27727 |                29031 |          24262 | +0.00%       |       0.00 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|----------------------------|------------------|------------------|----------------|--------------------------|----------------------------|------------------|----------------|------------------|----------------|------------------|-------------------------|---------------------------|------------------------------------|----------------------------------------|----------------|------------------------------|-----------------|--------------------|------------------|----------------------------|------------------|-----------------|----------------------------|--------------------|---------------------------|----------------------------|--------------------|---------------------|------------------|-----------------------------|---------------|--------------------|---------------------------|-------------------------|---------------------------|----------------|---------------------------|------------------|----------------------------|----------------------|--------------------|-------------------------|-------------------------|-----------------|-------------------------|-------------------------|----------------|--------------------|------------------|-----------------|-------------------------|--------------------------------------|-------------------------|---------------------------|-------------------------|---------------------------|----------------|--------------------|-------------------------------|-------------------------|----------------|-----------------|--------------|-------------------------|--------------------|---------------------------|--------------------|---------------------------|---------------------------|-------------------------|-------------------------|--------------------------|--------------------------|----------------|--------------------|---------------|-----------------|---
------------------------|----------------|----------------|----------------|---------------------------|--------------------------------------|----------------------------|--------------|------------------|----------------|----------------|---------------------------|--------------------------|------------------|---------------|--------------------------|------------------|--------------------------|----------------------|------------------|---------------------------|---------------------------|--------------------------|------------------|------------------|----------------------|---------------------------|---------------------------|----------------------|----------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_without_coalescing_peano |          66200 |                |                  |                               |                               |                      40027 |            15592 |            25869 |          31516 |                    25594 |                      42406 |            18995 |          25867 |            28023 |          30575 |            11850 |                   26125 |                     33690 |                              24519 |                                  29122 |          30972 |                        30860 |           24416 |              38460 |            15349 |                      39458 |            19589 |           32948 |                      57683 |              30939 |                     29785 |                      58403 |              51076 |               67164 |            25826 |                       29551 |         29296 |              29126 |                     42597 |                   28180 |                     33362 |          29206 |                     29983 |            21695 |                      42468 |                22255 |              34945 |                   67077 |                   31361 |           70340 |                   24813 |                   30690 |          34420 |              30397 |            28286 |           69502 |                   82628 |                                25647 |                   28943 |                     30936 |                   23951 |                     37260 |          23549 |              44151 |                         25392 |                   29141 |          29546 |           45250 |        22099 |                   20770 |              81342 |                     40168 |              77398 |                     33237 |                     41290 |                   44174 |                   31297 |                    23720 |                    25328 |          24476 |              46102 |         19308 |           44233 |   
                  23486 |          28176 |          51100 |          67076 |                     24546 |                                24221 |                      33924 |        36682 |            24628 |          22489 |          32828 |                     27056 |                    25504 |            18916 |         24902 |                    28783 |            23718 |                    26177 |                18776 |            10407 |                     27978 |                     31420 |                    23954 |            17201 |            25017 |                21412 |                     24263 |                     20713 |                19377 |          15488 | +3.18%       |      16.53 | -6.04%      | +1.17%      | +11.18%     |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|----------------------------|------------------|------------------|----------------|--------------------------|----------------------------|------------------|----------------|------------------|----------------|------------------|-------------------------|---------------------------|------------------------------------|----------------------------------------|----------------|------------------------------|-----------------|--------------------|------------------|----------------------------|------------------|-----------------|----------------------------|--------------------|---------------------------|----------------------------|--------------------|---------------------|------------------|-----------------------------|---------------|--------------------|---------------------------|-------------------------|---------------------------|----------------|---------------------------|------------------|----------------------------|----------------------|--------------------|-------------------------|-------------------------|-----------------|-------------------------|-------------------------|----------------|--------------------|------------------|-----------------|-------------------------|--------------------------------------|-------------------------|---------------------------|-------------------------|---------------------------|----------------|--------------------|-------------------------------|-------------------------|----------------|-----------------|--------------|-------------------------|--------------------|---------------------------|--------------------|---------------------------|---------------------------|-------------------------|-------------------------|--------------------------|--------------------------|----------------|--------------------|---------------|-----------------|---
------------------------|----------------|----------------|----------------|---------------------------|--------------------------------------|----------------------------|--------------|------------------|----------------|----------------|---------------------------|--------------------------|------------------|---------------|--------------------------|------------------|--------------------------|----------------------|------------------|---------------------------|---------------------------|--------------------------|------------------|------------------|----------------------|---------------------------|---------------------------|----------------------|----------------|--------------|------------|-------------|-------------|-------------|
| Total diff                               |                |                |                  |                               |                               | REGR(+60.36%)              | REGR(+50.75%)    | REGR(+43.71%)    | REGR(+39.64%)  | REGR(+30.96%)            | REGR(+30.74%)              | REGR(+30.16%)    | REGR(+28.12%)  | REGR(+27.54%)    | REGR(+26.26%)  | REGR(+25.72%)    | REGR(+23.27%)           | REGR(+23.19%)             | REGR(+22.71%)                      | REGR(+19.41%)                          | REGR(+19.12%)  | REGR(+18.44%)                | REGR(+18.23%)   | REGR(+17.94%)      | REGR(+17.45%)    | REGR(+16.90%)              | REGR(+15.03%)    | REGR(+14.59%)   | REGR(+14.23%)              | REGR(+14.17%)      | REGR(+13.74%)             | REGR(+11.37%)              | REGR(+10.62%)      | REGR(+10.43%)       | REGR(+10.35%)    | REGR(+10.24%)               | REGR(+10.15%) | REGR(+9.72%)       | REGR(+9.63%)              | REGR(+8.48%)            | REGR(+7.84%)              | REGR(+7.72%)   | REGR(+7.59%)              | REGR(+6.30%)     | REGR(+6.14%)               | REGR(+4.35%)         | REGR(+4.19%)       | REGR(+3.55%)            | REGR(+3.50%)            | REGR(+3.18%)    | REGR(+3.02%)            | REGR(+2.39%)            | REGR(+2.26%)   | REGR(+2.21%)       | REGR(+1.60%)     | REGR(+1.47%)    | REGR(+1.30%)            | REGR(+1.20%)                         | REGR(+1.20%)            | REGR(+1.13%)              | REGR(+1.08%)            | REGR(+0.69%)              | REGR(+0.53%)   | REGR(+0.33%)       | REGR(+0.22%)                  | REGR(+0.11%)            | SAME(-0.03%)   | IMPR(-0.44%)    | IMPR(-0.50%) | IMPR(-0.51%)            | IMPR(-0.53%)       | IMPR(-1.12%)              | IMPR(-1.27%)       | IMPR(-1.36%)              | IMPR(-1.38%)              | IMPR(-1.47%)            | IMPR(-1.76%)            | IMPR(-2.78%)             | IMPR(-3.05%)             | IMPR(-3.20%)   | IMPR(-3.49%)       | IMPR(-3.75%)  | IMPR(-3.87%)    | 
IMPR(-5.23%)              | IMPR(-5.75%)   | IMPR(-6.00%)   | IMPR(-6.06%)   | IMPR(-6.66%)              | IMPR(-7.57%)                         | IMPR(-8.82%)               | IMPR(-9.42%) | IMPR(-10.64%)    | IMPR(-11.32%)  | IMPR(-12.02%)  | IMPR(-12.04%)             | IMPR(-12.33%)            | IMPR(-13.31%)    | IMPR(-13.50%) | IMPR(-13.69%)            | IMPR(-14.63%)    | IMPR(-14.94%)            | IMPR(-15.37%)        | IMPR(-17.69%)    | IMPR(-19.21%)             | IMPR(-19.39%)             | IMPR(-19.50%)            | IMPR(-19.64%)    | IMPR(-19.91%)    | IMPR(-22.23%)        | IMPR(-22.90%)             | IMPR(-25.30%)             | IMPR(-33.25%)        | IMPR(-36.16%)  | +3.18%       |      16.53 | -6.04%      | +1.17%      | +11.18%     |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|----------------------------|------------------|------------------|----------------|--------------------------|----------------------------|------------------|----------------|------------------|----------------|------------------|-------------------------|---------------------------|------------------------------------|----------------------------------------|----------------|------------------------------|-----------------|--------------------|------------------|----------------------------|------------------|-----------------|----------------------------|--------------------|---------------------------|----------------------------|--------------------|---------------------|------------------|-----------------------------|---------------|--------------------|---------------------------|-------------------------|---------------------------|----------------|---------------------------|------------------|----------------------------|----------------------|--------------------|-------------------------|-------------------------|-----------------|-------------------------|-------------------------|----------------|--------------------|------------------|-----------------|-------------------------|--------------------------------------|-------------------------|---------------------------|-------------------------|---------------------------|----------------|--------------------|-------------------------------|-------------------------|----------------|-----------------|--------------|-------------------------|--------------------|---------------------------|--------------------|---------------------------|---------------------------|-------------------------|-------------------------|--------------------------|--------------------------|----------------|--------------------|---------------|-----------------|---
------------------------|----------------|----------------|----------------|---------------------------|--------------------------------------|----------------------------|--------------|------------------|----------------|----------------|---------------------------|--------------------------|------------------|---------------|--------------------------|------------------|--------------------------|----------------------|------------------|---------------------------|---------------------------|--------------------------|------------------|------------------|----------------------|---------------------------|---------------------------|----------------------|----------------|--------------|------------|-------------|-------------|-------------|
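For readers checking the `Total diff` row, the entries are consistent with a plain relative difference of the `without_coalescing` cycle count against the `with_coalescing` baseline (the row that reports `+0.00%`). The sketch below reproduces a few of the cells from the table. The function names and the 0.1% tolerance that separates `SAME` from `REGR`/`IMPR` are assumptions made for illustration, not something the benchmark tooling documents here.

```python
def percent_diff(baseline: int, candidate: int) -> float:
    """Relative change of candidate versus baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def classify(diff_pct: float, tolerance: float = 0.1) -> str:
    """Label a diff the way the 'Total diff' row does.

    The tolerance is a hypothetical value: the table only shows that
    -0.03% is reported as SAME and +0.11% as REGR.
    """
    if diff_pct > tolerance:
        tag = "REGR"  # more cycles than the baseline
    elif diff_pct < -tolerance:
        tag = "IMPR"  # fewer cycles than the baseline
    else:
        tag = "SAME"
    return f"{tag}({diff_pct:+.2f}%)"


# Cycle counts taken from the table above (with coalescing, without coalescing).
samples = {
    "ReduceMeanAxis_3_aie2_bf16": (24961, 40027),
    "Conv2D_bfp16_3": (29555, 29546),
    "Conv2D_bfp16_8": (24262, 15488),
}

for name, (with_coalescing, without_coalescing) in samples.items():
    print(name, classify(percent_diff(with_coalescing, without_coalescing)))
```

Under these assumptions the three samples come out as `REGR(+60.36%)`, `SAME(-0.03%)`, and `IMPR(-36.16%)`, matching the corresponding cells of the table.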

@andcarminati
Collaborator

Coalescing the widening copy from the inner loop has a positive impact when there is no inline spilling. When spilling does occur, however, it adversely affects the stack size.

|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------------
-|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|-------------------------|-------------------------|-------------------------------|------------------------------|--------------|--------------|-----------------|-----------------------------|----------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Core_StackSize                           | Conv2D_bfp16_9 | ElemDiv_aie2_1 | GEMM_Bfp16_opt_5 | TanhTemplated_aie2_bfloat16_0 | TanhTemplated_aie2_bfloat16_1 | AddAttributeBroadcasting_aie2_bf16 | AddBf16_aie2_0 | AddBf16_aie2_1 | AddBf16_aie2_2 | AvgPool2D_bfloat16_0 | AvgPool2D_bfloat16_1 | AvgPool2D_bfloat16_2 | AvgPool2D_bfloat16_3 | AvgPool2dVariant_bf16_0 | AvgPool2dVariant_bf16_1 | AvgPool2dVariant_bf16_2 | AvgPool2dVariant_bf16_3 | AvgPool2dVariant_bf16_4 | AvgPool2dVariant_bf16_5 | AvgPool2dVariant_bf16_6 | AvgPool2dVariant_bf16_7 | AvgPool2dVariant_bf16_8 | AvgPool2dVariant_bf16_9 | AvgPool2dVariant_bf16_10 | AvgPool2dVariant_bf16_11 | AvgPool2dVariant_bf16_12 | AvgPool2dVariant_bf16_13 | AvgPool2dVariant_bf16_14 | AvgPool2dVariant_bf16_15 | AvgPool2dVariant_bf16_16 | Clip_aie2_bf16 | CompareOpsAttributeBroadcasting_bf16 | CompareOpsAttributeBroadcasting_bf16_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_DW_bf16_2 | Conv2D_DW_bf16_3 | Conv2D_DW_bf16_4 | Conv2D_bfp16_0 | Conv2D_bfp16_1 | Conv2D_bfp16_2 | Conv2D_bfp16_3 | Conv2D_bfp16_4 | Conv2D_bfp16_5 | Conv2D_bfp16_6 | Conv2D_bfp16_7 | Conv2D_bfp16_8 | Conv2D_bfp16_10 | Conv2D_bfp16_11 | Conv2D_bfp16_12 | Conv2D_bfp16_13 | Conv2D_bfp16_14 | Conv2D_bfp16_OC8_0 | Conv2D_bfp16_OC8_1 | Conv2D_bfp16_OC8_2 | Conv2D_bfp16_OC8_3 | Conv2D_bfp16_OC8_4 | Conv2D_bfp16_OC8_5 | Conv2D_bfp16_OC8_6 | Conv2D_bfp16_OC8_7 | Conv2D_bfp16_OC8_8 | Conv2D_bfp16_OC8_9 | Conv2D_bfp16_OC8_10 | Conv2D_bfp16_PSUM_FLOAT_0 | Conv2D_bfp16_PSUM_FLOAT_1 | Conv2D_bfp16_PSUM_FLOAT_2 | GEMM_Bfp16_opt_0 | GEMM_Bfp16_opt_1 | GEMM_Bfp16_opt_2 | GEMM_Bfp16_opt_3 | GEMM_Bfp16_opt_4 | GeluTemplated_aie2_bf16 | Hardswish_aie2_1 | MaxPool2D_bf16_0 | MaxPool2D_bf16_1 | MaxPool2D_bf16_2 | MaxPool2D_bf16_3 | MaxPool2D_bf16_4 | MulAttributeBroadcasting_aie2_bf16_0 | MulBf16_aie2_0 | Neg_aie2_1   | Pad2D_bf16_0 | ReduceMaxAxis_1_aie2_bf16 | ReduceMaxAxis_2_aie2_bf16 | ReduceMaxAxis_3_aie2_bf16 | ReduceMaxAxis_4_aie2_bf16 | 
ReduceMaxAxis_5_aie2_bf16 | ReduceMaxAxis_6_aie2_bf16 | ReduceMaxAxis_7_aie2_bf16 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_7_aie2_bf16 | SigmoidTemplated_bf16_0 | SigmoidTemplated_bf16_1 | SigmoidTemplated_bf16_1_AIE2p | Sigmoidmode1Templated_bf16_0 | Sqrt_bf16_0  | Sqrt_bf16_1  | Sub_aie2_bf16_0 | TanhTemplatedmode1_bfloat16 | SiLU_aie2_bf16 | ReduceMeanAxis_1_aie2_bf16 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_5_aie2_bf16 | ReduceMeanAxis_6_aie2_bf16 | ReduceMeanAxis_7_aie2_bf16 | Sin_aie2_bf16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------------
-|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|-------------------------|-------------------------|-------------------------------|------------------------------|--------------|--------------|-----------------|-----------------------------|----------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_with_coalescing_peano    |                |                |                  |                               |                               |                                704 |            640 |            640 |            640 |                  448 |                  448 |                  448 |                  448 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                      832 |                      832 |                      832 |                      832 |                      832 |                      832 |                      832 |            384 |                                  576 |                                    576 |              320 |              320 |              320 |              320 |              320 |            896 |            896 |            896 |            896 |            896 |            896 |            896 |            896 |            896 |             896 |             896 |             896 |             896 |             896 |                512 |               1280 |                512 |                512 |                512 |               1344 |                512 |               1280 |                512 |                512 |                 512 |                      1152 |                      1152 |                      1152 |              768 |              768 |              704 |              768 |              768 |                     448 |              448 |              320 |              320 |              320 |              320 |              320 |                                  704 |            512 |          512 |          192 |                       384 |                       384 |                       384 |                       384 |                       
384 |                       384 |                       384 |                       640 |                       640 |                       640 |                       640 |                       640 |                       640 |                       640 |                     448 |                     448 |                           448 |                          576 |          640 |          640 |             640 |                         640 |            832 |                        640 |                        640 |                        640 |                        640 |                        640 |                        640 |                        640 |          2240 | +0.00%       |       0.00 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------------
-|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|-------------------------|-------------------------|-------------------------------|------------------------------|--------------|--------------|-----------------|-----------------------------|----------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_without_coalescing_peano |            896 |                |                  |                               |                               |                                704 |            640 |            640 |            640 |                  448 |                  448 |                  448 |                  448 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                     832 |                      832 |                      832 |                      832 |                      832 |                      832 |                      832 |                      832 |            384 |                                  576 |                                    576 |              320 |              320 |              320 |              320 |              320 |            896 |            896 |            896 |            896 |            896 |            896 |            896 |            896 |            896 |             896 |             896 |             896 |             896 |             896 |                512 |               1280 |                512 |                512 |                512 |               1344 |                512 |               1280 |                512 |                512 |                 512 |                      1152 |                      1152 |                      1152 |              768 |              768 |              704 |              768 |              768 |                     448 |              448 |              320 |              320 |              320 |              320 |              320 |                                  704 |            512 |          512 |          192 |                       384 |                       384 |                       384 |                       384 |                       
384 |                       384 |                       384 |                       640 |                       640 |                       640 |                       640 |                       640 |                       640 |                       640 |                     448 |                     448 |                           448 |                          576 |          640 |          640 |             640 |                         640 |            704 |                        512 |                        512 |                        512 |                        512 |                        512 |                        512 |                        512 |          1472 | -1.76%       |       6.02 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------------
-|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|-------------------------|-------------------------|-------------------------------|------------------------------|--------------|--------------|-----------------|-----------------------------|----------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Total diff                               |                |                |                  |                               |                               | SAME(+0.00%)                       | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)   | SAME(+0.00%)                         | SAME(+0.00%)                           | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)        | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)            | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)                         | SAME(+0.00%)   | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)             
 | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)                  | SAME(+0.00%)                 | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%)    | SAME(+0.00%)                | IMPR(-15.38%)  | IMPR(-20.00%)              | IMPR(-20.00%)              | IMPR(-20.00%)              | IMPR(-20.00%)              | IMPR(-20.00%)              | IMPR(-20.00%)              | IMPR(-20.00%)              | IMPR(-34.29%) | -1.76%       |       6.02 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|---------------------------|--------------------------
-|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|-------------------------|-------------------------|-------------------------------|------------------------------|--------------|--------------|-----------------|-----------------------------|----------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
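
For reference, the percentages in the `Total diff` row can be reproduced from the two data rows. The sketch below assumes each diff is the relative change of the `without_coalescing` value against the `with_coalescing` value; the helper name `pct_diff` is hypothetical and is not part of the benchmark tooling.

```python
def pct_diff(with_coalescing: int, without_coalescing: int) -> str:
    """Relative change of the 'without' value against the 'with' value.

    Assumption: the table uses (new - baseline) / baseline * 100,
    formatted with an explicit sign and two decimals.
    """
    change = (without_coalescing - with_coalescing) / with_coalescing * 100
    return f"{change:+.2f}%"


if __name__ == "__main__":
    # Value pairs copied from the stack size rows above (with, without).
    for baseline, new in [(640, 640), (832, 704), (640, 512), (2240, 1472)]:
        print(f"{baseline:>5} -> {new:>5}: {pct_diff(baseline, new)}")
```

The pairs `832 -> 704`, `640 -> 512` and `2240 -> 1472` are the only stack sizes that differ between the two rows; all other filled columns are identical.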
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Core_Compute_Insn_Count                  | Conv2D_bfp16_9 | ElemDiv_aie2_1 | GEMM_Bfp16_opt_5 | TanhTemplated_aie2_bfloat16_0 | TanhTemplated_aie2_bfloat16_1 | Sqrt_bf16_1  | GeluTemplated_aie2_bf16 | Sqrt_bf16_0  | AddAttributeBroadcasting_aie2_bf16 | AddBf16_aie2_0 | AddBf16_aie2_1 | AddBf16_aie2_2 | AvgPool2D_bfloat16_0 | AvgPool2D_bfloat16_1 | AvgPool2D_bfloat16_2 | AvgPool2D_bfloat16_3 | AvgPool2dVariant_bf16_0 | AvgPool2dVariant_bf16_1 | AvgPool2dVariant_bf16_2 | AvgPool2dVariant_bf16_3 | AvgPool2dVariant_bf16_4 | AvgPool2dVariant_bf16_5 | AvgPool2dVariant_bf16_6 | AvgPool2dVariant_bf16_7 | AvgPool2dVariant_bf16_8 | AvgPool2dVariant_bf16_9 | AvgPool2dVariant_bf16_10 | AvgPool2dVariant_bf16_11 | AvgPool2dVariant_bf16_12 | AvgPool2dVariant_bf16_13 | AvgPool2dVariant_bf16_14 | AvgPool2dVariant_bf16_15 | AvgPool2dVariant_bf16_16 | Clip_aie2_bf16 | CompareOpsAttributeBroadcasting_bf16 | CompareOpsAttributeBroadcasting_bf16_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_DW_bf16_2 | Conv2D_DW_bf16_3 | Conv2D_DW_bf16_4 | Conv2D_bfp16_0 | Conv2D_bfp16_1 | Conv2D_bfp16_2 | Conv2D_bfp16_3 | Conv2D_bfp16_4 | Conv2D_bfp16_5 | Conv2D_bfp16_6 | Conv2D_bfp16_7 | Conv2D_bfp16_8 | Conv2D_bfp16_10 | Conv2D_bfp16_11 | Conv2D_bfp16_12 | Conv2D_bfp16_13 | Conv2D_bfp16_14 | Conv2D_bfp16_OC8_0 | Conv2D_bfp16_OC8_1 | Conv2D_bfp16_OC8_2 | Conv2D_bfp16_OC8_3 | Conv2D_bfp16_OC8_4 | Conv2D_bfp16_OC8_5 | Conv2D_bfp16_OC8_6 | Conv2D_bfp16_OC8_7 | Conv2D_bfp16_OC8_8 | Conv2D_bfp16_OC8_9 | Conv2D_bfp16_OC8_10 | Conv2D_bfp16_PSUM_FLOAT_0 | Conv2D_bfp16_PSUM_FLOAT_1 | Conv2D_bfp16_PSUM_FLOAT_2 | GEMM_Bfp16_opt_0 | GEMM_Bfp16_opt_1 | GEMM_Bfp16_opt_2 | GEMM_Bfp16_opt_3 | GEMM_Bfp16_opt_4 | Hardswish_aie2_1 | MaxPool2D_bf16_0 | MaxPool2D_bf16_1 | MaxPool2D_bf16_2 | MaxPool2D_bf16_3 | MaxPool2D_bf16_4 | MulAttributeBroadcasting_aie2_bf16_0 | MulBf16_aie2_0 | Neg_aie2_1   | Pad2D_bf16_0 | ReduceMaxAxis_1_aie2_bf16 | ReduceMaxAxis_2_aie2_bf16 | ReduceMaxAxis_3_aie2_bf16 | 
ReduceMaxAxis_4_aie2_bf16 | ReduceMaxAxis_5_aie2_bf16 | ReduceMaxAxis_6_aie2_bf16 | ReduceMaxAxis_7_aie2_bf16 | ReduceMeanAxis_7_aie2_bf16 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_7_aie2_bf16 | SiLU_aie2_bf16 | Sigmoidmode1Templated_bf16_0 | Sub_aie2_bf16_0 | TanhTemplatedmode1_bfloat16 | SigmoidTemplated_bf16_0 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_6_aie2_bf16 | SigmoidTemplated_bf16_1 | SigmoidTemplated_bf16_1_AIE2p | ReduceMeanAxis_5_aie2_bf16 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_1_aie2_bf16 | Sin_aie2_bf16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_with_coalescing_peano    |                |                |                  |                               |                               |         1791 |                    3363 |        13775 |                                456 |            668 |            795 |            795 |                 1118 |                  822 |                  532 |                  532 |                    3108 |                    1820 |                    3888 |                     759 |                    1219 |                    2440 |                    2746 |                    2251 |                    3418 |                    2744 |                     2972 |                     3223 |                     2735 |                     2914 |                     3275 |                     4664 |                     1042 |            146 |                                 1236 |                                   1242 |             1036 |             4424 |             1884 |             1036 |             1012 |          10459 |          21611 |          10459 |          11700 |          10703 |          40909 |           5581 |           2049 |           1466 |           35809 |           35809 |            7586 |           20616 |           14400 |               7222 |              16200 |               4835 |               9898 |               4718 |              29452 |              12910 |              14668 |              23652 |               5232 |               30046 |                      6573 |                      4901 |                      4973 |             1282 |             4084 |             3479 |             4084 |             4084 |              862 |             1695 |             1191 |              694 |              694 |              694 |                                  688 |            231 |          132 |         2357 |                      8572 |                      8588 |                      3381 |                      
8608 |                      3381 |                      3362 |                      2601 |                       9037 |                     16528 |                     16544 |                      6523 |                     16564 |                      6523 |                      6504 |                      5100 |           1170 |                         6568 |             658 |                        7338 |                     969 |                      10165 |                      10146 |                     521 |                           521 |                      11322 |                      18544 |                      27924 |                      27916 |          1717 | +0.00%       |       0.00 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_without_coalescing_peano |          35989 |                |                  |                               |                               |         1793 |                    3364 |        13777 |                                456 |            668 |            795 |            795 |                 1118 |                  822 |                  532 |                  532 |                    3108 |                    1820 |                    3888 |                     759 |                    1219 |                    2440 |                    2746 |                    2251 |                    3418 |                    2744 |                     2972 |                     3223 |                     2735 |                     2914 |                     3275 |                     4664 |                     1042 |            146 |                                 1236 |                                   1242 |             1036 |             4424 |             1884 |             1036 |             1012 |          10459 |          21611 |          10459 |          11700 |          10703 |          40909 |           5581 |           2049 |           1466 |           35809 |           35809 |            7586 |           20616 |           14400 |               7222 |              16200 |               4835 |               9898 |               4718 |              29452 |              12910 |              14668 |              23652 |               5232 |               30046 |                      6573 |                      4901 |                      4973 |             1282 |             4084 |             3479 |             4084 |             4084 |              862 |             1695 |             1191 |              694 |              694 |              694 |                                  688 |            231 |          132 |         2357 |                      8572 |                      8588 |                      3381 |                      
8608 |                      3381 |                      3362 |                      2601 |                       9037 |                     16528 |                     16544 |                      6523 |                     16564 |                      6523 |                      6504 |                      5100 |           1170 |                         6568 |             658 |                        7338 |                     968 |                      10149 |                      10130 |                     520 |                           520 |                      11290 |                      18416 |                      27668 |                      27660 |          1493 | -0.15%       |       1.26 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
| Total diff                               |                |                |                  |                               |                               | REGR(+0.11%) | SAME(+0.03%)            | SAME(+0.01%) | SAME(+0.00%)                       | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)   | SAME(+0.00%)                         | SAME(+0.00%)                           | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)        | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)                         | SAME(+0.00%)   | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)           
   | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)               | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)   | SAME(+0.00%)                 | SAME(+0.00%)    | SAME(+0.00%)                | IMPR(-0.10%)            | IMPR(-0.16%)               | IMPR(-0.16%)               | IMPR(-0.19%)            | IMPR(-0.19%)                  | IMPR(-0.28%)               | IMPR(-0.69%)               | IMPR(-0.92%)               | IMPR(-0.92%)               | IMPR(-13.05%) | -0.15%       |       1.26 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|--------------|-------------------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|---------------------------|---------------------------|---------------------------|------------------------
---|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|----------------|------------------------------|-----------------|-----------------------------|-------------------------|----------------------------|----------------------------|-------------------------|-------------------------------|----------------------------|----------------------------|----------------------------|----------------------------|---------------|--------------|------------|-------------|-------------|-------------|
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| Core_PMSize                              | Conv2D_bfp16_9 | ElemDiv_aie2_1 | GEMM_Bfp16_opt_5 | TanhTemplated_aie2_bfloat16_0 | TanhTemplated_aie2_bfloat16_1 | SigmoidTemplated_bf16_0 | SigmoidTemplated_bf16_1 | SigmoidTemplated_bf16_1_AIE2p | Sqrt_bf16_1  | Sqrt_bf16_0  | AddAttributeBroadcasting_aie2_bf16 | AddBf16_aie2_0 | AddBf16_aie2_1 | AddBf16_aie2_2 | AvgPool2D_bfloat16_0 | AvgPool2D_bfloat16_1 | AvgPool2D_bfloat16_2 | AvgPool2D_bfloat16_3 | AvgPool2dVariant_bf16_0 | AvgPool2dVariant_bf16_1 | AvgPool2dVariant_bf16_2 | AvgPool2dVariant_bf16_3 | AvgPool2dVariant_bf16_4 | AvgPool2dVariant_bf16_5 | AvgPool2dVariant_bf16_6 | AvgPool2dVariant_bf16_7 | AvgPool2dVariant_bf16_8 | AvgPool2dVariant_bf16_9 | AvgPool2dVariant_bf16_10 | AvgPool2dVariant_bf16_11 | AvgPool2dVariant_bf16_12 | AvgPool2dVariant_bf16_13 | AvgPool2dVariant_bf16_14 | AvgPool2dVariant_bf16_15 | AvgPool2dVariant_bf16_16 | Clip_aie2_bf16 | CompareOpsAttributeBroadcasting_bf16 | CompareOpsAttributeBroadcasting_bf16_1 | Conv2D_DW_bf16_0 | Conv2D_DW_bf16_1 | Conv2D_DW_bf16_2 | Conv2D_DW_bf16_3 | Conv2D_DW_bf16_4 | Conv2D_bfp16_0 | Conv2D_bfp16_1 | Conv2D_bfp16_2 | Conv2D_bfp16_3 | Conv2D_bfp16_4 | Conv2D_bfp16_5 | Conv2D_bfp16_6 | Conv2D_bfp16_7 | Conv2D_bfp16_8 | Conv2D_bfp16_10 | Conv2D_bfp16_11 | Conv2D_bfp16_12 | Conv2D_bfp16_13 | Conv2D_bfp16_14 | Conv2D_bfp16_OC8_0 | Conv2D_bfp16_OC8_1 | Conv2D_bfp16_OC8_2 | Conv2D_bfp16_OC8_3 | Conv2D_bfp16_OC8_4 | Conv2D_bfp16_OC8_5 | Conv2D_bfp16_OC8_6 | Conv2D_bfp16_OC8_7 | Conv2D_bfp16_OC8_8 | Conv2D_bfp16_OC8_9 | Conv2D_bfp16_OC8_10 | Conv2D_bfp16_PSUM_FLOAT_0 | Conv2D_bfp16_PSUM_FLOAT_1 | Conv2D_bfp16_PSUM_FLOAT_2 | GEMM_Bfp16_opt_0 | GEMM_Bfp16_opt_1 | GEMM_Bfp16_opt_2 | GEMM_Bfp16_opt_3 | GEMM_Bfp16_opt_4 | GeluTemplated_aie2_bf16 | Hardswish_aie2_1 | MaxPool2D_bf16_0 | MaxPool2D_bf16_1 | MaxPool2D_bf16_2 | MaxPool2D_bf16_3 | MaxPool2D_bf16_4 | MulAttributeBroadcasting_aie2_bf16_0 | MulBf16_aie2_0 | Neg_aie2_1   | Pad2D_bf16_0 | 
ReduceMaxAxis_1_aie2_bf16 | ReduceMaxAxis_2_aie2_bf16 | ReduceMaxAxis_3_aie2_bf16 | ReduceMaxAxis_4_aie2_bf16 | ReduceMaxAxis_5_aie2_bf16 | ReduceMaxAxis_6_aie2_bf16 | ReduceMaxAxis_7_aie2_bf16 | ReduceSumAxis_1_aie2_bf16 | ReduceSumAxis_2_aie2_bf16 | ReduceSumAxis_3_aie2_bf16 | ReduceSumAxis_4_aie2_bf16 | ReduceSumAxis_5_aie2_bf16 | ReduceSumAxis_6_aie2_bf16 | ReduceSumAxis_7_aie2_bf16 | Sigmoidmode1Templated_bf16_0 | Sub_aie2_bf16_0 | TanhTemplatedmode1_bfloat16 | ReduceMeanAxis_1_aie2_bf16 | ReduceMeanAxis_2_aie2_bf16 | ReduceMeanAxis_3_aie2_bf16 | ReduceMeanAxis_4_aie2_bf16 | ReduceMeanAxis_5_aie2_bf16 | ReduceMeanAxis_6_aie2_bf16 | ReduceMeanAxis_7_aie2_bf16 | SiLU_aie2_bf16 | Sin_aie2_bf16 | Average diff | Diff stdev | Quantile #1 | Quantile #2 | Quantile #3 |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_with_coalescing_peano    |                |                |                  |                               |                               |                    2260 |                    2260 |                          2260 |         2436 |         2452 |                               2724 |           2820 |           2868 |           2868 |                 3556 |                 3556 |                 3556 |                 3556 |                    4196 |                    4212 |                    4196 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |           2116 |                                 3060 |                                   3060 |             3252 |             3252 |             3252 |             3252 |             3252 |           5764 |           5748 |           5764 |           5748 |           5748 |           5748 |           5828 |           5828 |           5716 |            5748 |            5748 |            5652 |            5748 |            5748 |               6756 |               7396 |               6852 |               6756 |               6756 |               7492 |               6756 |               7396 |               6836 |               6756 |                6740 |                      6116 |                      6116 |                      6116 |             5860 |             5860 |             5604 |             5860 |             5860 |                    2932 |             2340 |             2788 |             2788 |             2788 |             2788 |             2788 |                                 2612 |           2532 |         2100 |         3092 |                      
6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6164 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6820 |                         3444 |            2756 |                        3124 |                       6900 |                       6900 |                       6900 |                       6900 |                       6900 |                       6900 |                       6884 |           2324 |          2596 | +0.00%       |       0.00 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| mllib_Innerloop_without_coalescing_peano |           5812 |                |                  |                               |                               |                    2292 |                    2292 |                          2292 |         2452 |         2468 |                               2724 |           2820 |           2868 |           2868 |                 3556 |                 3556 |                 3556 |                 3556 |                    4196 |                    4212 |                    4196 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                    4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |                     4212 |           2116 |                                 3060 |                                   3060 |             3252 |             3252 |             3252 |             3252 |             3252 |           5764 |           5748 |           5764 |           5748 |           5748 |           5748 |           5828 |           5828 |           5716 |            5748 |            5748 |            5652 |            5748 |            5748 |               6756 |               7396 |               6852 |               6756 |               6756 |               7492 |               6756 |               7396 |               6836 |               6756 |                6740 |                      6116 |                      6116 |                      6116 |             5860 |             5860 |             5604 |             5860 |             5860 |                    2932 |             2340 |             2788 |             2788 |             2788 |             2788 |             2788 |                                 2612 |           2532 |         2100 |         3092 |                      
6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6180 |                      6164 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6836 |                      6820 |                         3444 |            2756 |                        3124 |                       6884 |                       6884 |                       6884 |                       6884 |                       6884 |                       6884 |                       6868 |           2308 |          2548 | +0.01%       |       0.32 | +0.00%      | +0.00%      | +0.00%      |
|------------------------------------------|----------------|----------------|------------------|-------------------------------|-------------------------------|-------------------------|-------------------------|-------------------------------|--------------|--------------|------------------------------------|----------------|----------------|----------------|----------------------|----------------------|----------------------|----------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|-------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|--------------------------|----------------|--------------------------------------|----------------------------------------|------------------|------------------|------------------|------------------|------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|-----------------|-----------------|-----------------|-----------------|-----------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|--------------------|---------------------|---------------------------|---------------------------|---------------------------|------------------|------------------|------------------|------------------|------------------|-------------------------|------------------|------------------|------------------|------------------|------------------|------------------|--------------------------------------|----------------|--------------|--------------|------------------------
---|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|---------------------------|------------------------------|-----------------|-----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------------------|----------------|---------------|--------------|------------|-------------|-------------|-------------|
| Total diff                               |                |                |                  |                               |                               | REGR(+1.42%)            | REGR(+1.42%)            | REGR(+1.42%)                  | REGR(+0.66%) | REGR(+0.65%) | SAME(+0.00%)                       | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)         | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)            | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)             | SAME(+0.00%)   | SAME(+0.00%)                         | SAME(+0.00%)                           | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)   | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)    | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)       | SAME(+0.00%)        | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)            | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)     | SAME(+0.00%)                         | SAME(+0.00%)   | SAME(+0.00%) | SAME(+0.00%) | SAME(+0.00%)           
   | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)              | SAME(+0.00%)                 | SAME(+0.00%)    | SAME(+0.00%)                | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.23%)               | IMPR(-0.69%)   | IMPR(-1.85%)  | +0.01%       |       0.32 | +0.00%      | +0.00%      | +0.00%      |

Can you please update to Cycles?

Collaborator

This worries me a bit. We are worsening some AIE2 test results.

Collaborator Author

@andcarminati as discussed, I will remove this for AIE2. Also, here is the tracking ticket for AIE2P: AIECC-899.

Collaborator

@martien-de-jong martien-de-jong Mar 28, 2025

We may be able to take in a bit more information. If the source is within the same loop block, I think it's always better to coalesce. If the source has a lot of live intervals (indicating that it has already been coalesced with a lot of other registers), it may be better not to coalesce. In that case, it may also help to split the live range by inserting a widening copy in the preheader of the loop. That allows us to bring in the double register with a very short live range (few interferences) and to coalesce the now-regular copy in the loop. But I'm afraid this cannot be done from this target hook.
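To make the heuristic above concrete, here is a minimal, self-contained sketch. It is not the code from this PR and does not use the LLVM API: `CopyInfo`, `kMaxSrcLiveSegments` and `shouldCoalesceSketch` are hypothetical names, and the threshold value is an arbitrary placeholder that would need tuning against QoR results.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical summary of what a coalescing hook might know about one copy.
// None of these fields correspond to real LLVM data structures.
struct CopyInfo {
  bool IsWideningCopy;             // copy from a narrow register into a wider one
  bool InInnerLoop;                // the copy sits in an innermost loop block
  bool SrcInSameLoopBlock;         // the source is within the same loop block
  std::size_t SrcNumLiveSegments;  // rough proxy for "a lot of live intervals"
};

// Arbitrary illustrative threshold (assumption, not a measured value).
constexpr std::size_t kMaxSrcLiveSegments = 8;

// Returns true when the copy looks worth coalescing under the heuristic.
bool shouldCoalesceSketch(const CopyInfo &C) {
  // Only widening copies inside inner loops are treated specially.
  if (!C.IsWideningCopy || !C.InInnerLoop)
    return true;
  // Source within the same loop block: coalescing is assumed to always help.
  if (C.SrcInSameLoopBlock)
    return true;
  // A source with many live segments has likely been coalesced a lot already;
  // joining it with a wide register risks more interference and spilling.
  return C.SrcNumLiveSegments <= kMaxSrcLiveSegments;
}
```

The sketch only covers the decision itself. The preheader split mentioned above changes the code, so it would need a separate transformation rather than a yes/no hook.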

Collaborator

Also note that this would be a natural mechanism in region-based register allocation, where the loop body would be isolated from its surroundings by these copy-in (and copy-out) moves. The inner loops then get priority, and the rest is either coalesced or yields a low-frequency copy.
I think such a live-range splitter pass would sit naturally before register coalescing. It would guarantee that all interbank copies are outside the inner loops. Widening copies would never be coalesced.
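A toy model of such a splitter, under loud assumptions: `Copy`, `Loop` and `splitWideningCopies` are hypothetical, virtual registers are plain integers, and only the copy-in direction is modelled. It is a sketch of the idea, not an LLVM pass.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical copy instruction: Dst = COPY Src, possibly widening.
struct Copy {
  int Dst;
  int Src;
  bool Widening;
};

// Hypothetical loop: a preheader, a body, and the vregs defined in the body.
struct Loop {
  std::vector<Copy> Preheader;
  std::vector<Copy> Body;
  std::vector<int> DefinedInBody;
};

// For every widening copy in the body whose source comes from outside the
// loop, emit the widening copy in the preheader into a fresh wide vreg and
// turn the in-loop copy into a regular copy from that vreg.
// Returns the number of copies that were split.
int splitWideningCopies(Loop &L, int &NextVReg) {
  int NumSplit = 0;
  for (Copy &C : L.Body) {
    if (!C.Widening)
      continue;
    bool DefinedInside = std::find(L.DefinedInBody.begin(),
                                   L.DefinedInBody.end(),
                                   C.Src) != L.DefinedInBody.end();
    if (DefinedInside)
      continue; // the value is produced in the loop, nothing to hoist
    int Wide = NextVReg++;
    L.Preheader.push_back({Wide, C.Src, /*Widening=*/true});
    C.Src = Wide;       // the in-loop copy now reads the wide vreg
    C.Widening = false; // and is a regular, coalescable copy
    ++NumSplit;
  }
  return NumSplit;
}
```

After the split, the widening copy lives in the low-frequency preheader and stays uncoalesced, while the copy left in the loop is a same-width copy that the coalescer can remove.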

@niwinanto niwinanto force-pushed the niwin.innerloop.coalesce branch 3 times, most recently from 9a546fc to c18dd20 Compare April 1, 2025 12:41
@niwinanto niwinanto force-pushed the niwin.innerloop.coalesce branch from c18dd20 to 8d6b4f7 Compare April 1, 2025 12:42
AIE2P::FIFO1024RegClass.contains(Reg));
}

bool AIE2PRegisterInfo::shouldCoalesce(
Collaborator

Maybe it is better to have this code only for AIE2P, as we are not planning to evaluate the effects on AIE2 at the moment. Also, we should keep the QoR results for AIE2 stable.

@niwinanto niwinanto force-pushed the niwin.innerloop.coalesce branch from d4c66d4 to 1f27c64 Compare April 1, 2025 13:04
Collaborator

@andcarminati andcarminati left a comment

LGTM.

@niwinanto niwinanto merged commit 9f0b58c into aie-public Apr 2, 2025
6 checks passed
mgehre-amd added a commit that referenced this pull request Aug 21, 2025
arith-to-emitc: Fix lowering of fptoui