Commit 9d1c4ef

fix: Use black_box() on input arguments in benchmarks
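The diff itself is not reproduced on this page, so the following is only a minimal sketch of the idea in the title, not the commit's actual code. It assumes nothing beyond the standard library: `std::hint::black_box` and a hand-rolled timing loop stand in for the real benchmark harness, and `dot3`, `sum_plain`, and `sum_black_boxed` are hypothetical names chosen for illustration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a small fixed-size operation
/// (a 3-component dot product).
fn dot3(a: [f64; 3], b: [f64; 3]) -> f64 {
    a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
}

/// Inputs are plain local constants, so the optimizer is free to
/// evaluate `dot3(a, b)` once (or at compile time) instead of per iteration.
fn sum_plain(iters: u32) -> (f64, Duration) {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    let start = Instant::now();
    let mut acc = 0.0;
    for _ in 0..iters {
        acc += dot3(a, b);
    }
    (acc, start.elapsed())
}

/// Each input passes through `black_box`, which asks the compiler to treat
/// the value as opaque; the result is also passed through it so the call
/// is not discarded. `black_box` is a best-effort hint, not a guarantee.
fn sum_black_boxed(iters: u32) -> (f64, Duration) {
    let a = [1.0, 2.0, 3.0];
    let b = [4.0, 5.0, 6.0];
    let start = Instant::now();
    let mut acc = 0.0;
    for _ in 0..iters {
        acc += black_box(dot3(black_box(a), black_box(b)));
    }
    (acc, start.elapsed())
}

fn main() {
    let iters = 1_000_000;
    let (plain, t_plain) = sum_plain(iters);
    let (boxed, t_boxed) = sum_black_boxed(iters);
    // Both variants compute the same value; only the timings may differ.
    println!("plain:       sum = {plain}, elapsed = {t_plain:?}");
    println!("black_boxed: sum = {boxed}, elapsed = {t_boxed:?}");
}
```

If the optimizer was previously folding constant inputs, wrapping them this way would make the measured times rise without the library itself getting slower. That reading would be consistent with the large "regressions" reported for the small fixed-size operations, though the commit message does not state it explicitly.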
This fixes #1547.

Benchmark results (before vs. after the change):

mat2_mul_m time: [1.8211 ns 1.8224 ns 1.8239 ns] change: [+146.50% +146.66% +146.86%] (p = 0.00 < 0.05) Performance has regressed. Found 16 outliers among 100 measurements (16.00%) 3 (3.00%) low mild 4 (4.00%) high mild 9 (9.00%) high severe mat3_mul_m time: [10.085 ns 10.090 ns 10.096 ns] change: [+531.16% +531.50% +531.97%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) high mild 4 (4.00%) high severe mat4_mul_m time: [11.219 ns 11.234 ns 11.250 ns] change: [+277.89% +278.45% +278.98%] (p = 0.00 < 0.05) Performance has regressed. Found 18 outliers among 100 measurements (18.00%) 9 (9.00%) low mild 7 (7.00%) high mild 2 (2.00%) high severe mat2_tr_mul_m time: [1.7146 ns 1.7154 ns 1.7161 ns] change: [+131.55% +131.93% +132.26%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe mat3_tr_mul_m time: [9.7604 ns 9.7655 ns 9.7713 ns] change: [+513.89% +514.65% +515.31%] (p = 0.00 < 0.05) Performance has regressed. Found 21 outliers among 100 measurements (21.00%) 4 (4.00%) high mild 17 (17.00%) high severe mat4_tr_mul_m time: [9.0668 ns 9.0724 ns 9.0786 ns] change: [+206.70% +206.94% +207.18%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low severe 5 (5.00%) high mild 3 (3.00%) high severe mat2_add_m time: [1.7747 ns 1.7768 ns 1.7794 ns] change: [+139.74% +140.01% +140.28%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 2 (2.00%) high mild 10 (10.00%) high severe mat3_add_m time: [4.1347 ns 4.1397 ns 4.1450 ns] change: [+158.49% +158.79% +159.11%] (p = 0.00 < 0.05) Performance has regressed. mat4_add_m time: [6.3138 ns 6.3202 ns 6.3277 ns] change: [+113.14% +113.45% +113.79%] (p = 0.00 < 0.05) Performance has regressed.
Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild mat2_sub_m time: [1.7622 ns 1.7636 ns 1.7653 ns] change: [+138.09% +138.34% +138.58%] (p = 0.00 < 0.05) Performance has regressed. Found 11 outliers among 100 measurements (11.00%) 8 (8.00%) high mild 3 (3.00%) high severe mat3_sub_m time: [4.1355 ns 4.1407 ns 4.1472 ns] change: [+159.26% +159.58% +159.93%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 6 (6.00%) high mild mat4_sub_m time: [6.3649 ns 6.3712 ns 6.3777 ns] change: [+113.88% +114.05% +114.23%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild mat2_mul_v time: [1.8745 ns 1.8809 ns 1.8880 ns] change: [+490.83% +492.86% +494.55%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild mat3_mul_v time: [8.7801 ns 8.7907 ns 8.8027 ns] change: [+1890.3% +1894.0% +1898.5%] (p = 0.00 < 0.05) Performance has regressed. Found 17 outliers among 100 measurements (17.00%) 3 (3.00%) high mild 14 (14.00%) high severe mat4_mul_v time: [2.7012 ns 2.7086 ns 2.7170 ns] change: [+264.61% +265.53% +266.48%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) high mild 1 (1.00%) high severe mat2_tr_mul_v time: [1.3577 ns 1.3579 ns 1.3582 ns] change: [+325.94% +326.43% +326.83%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe mat3_tr_mul_v time: [2.3408 ns 2.3449 ns 2.3491 ns] change: [+419.84% +420.66% +421.31%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild mat4_tr_mul_v time: [3.1961 ns 3.2026 ns 3.2100 ns] change: [+329.36% +330.71% +332.28%] (p = 0.00 < 0.05) Performance has regressed. 
Found 7 outliers among 100 measurements (7.00%) 2 (2.00%) high mild 5 (5.00%) high severe mat2_mul_s time: [1.5770 ns 1.5804 ns 1.5846 ns] change: [+112.05% +112.90% +113.84%] (p = 0.00 < 0.05) Performance has regressed. Found 17 outliers among 100 measurements (17.00%) 2 (2.00%) high mild 15 (15.00%) high severe mat3_mul_s time: [3.2606 ns 3.2749 ns 3.2909 ns] change: [+105.41% +106.09% +106.85%] (p = 0.00 < 0.05) Performance has regressed. Found 17 outliers among 100 measurements (17.00%) 6 (6.00%) high mild 11 (11.00%) high severe mat4_mul_s time: [5.3422 ns 5.3465 ns 5.3512 ns] change: [+80.678% +80.813% +80.952%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild mat2_div_s time: [1.6070 ns 1.6156 ns 1.6256 ns] change: [+117.73% +118.67% +119.89%] (p = 0.00 < 0.05) Performance has regressed. Found 21 outliers among 100 measurements (21.00%) 1 (1.00%) high mild 20 (20.00%) high severe mat3_div_s time: [3.3834 ns 3.3934 ns 3.4053 ns] change: [+112.87% +113.44% +114.19%] (p = 0.00 < 0.05) Performance has regressed. Found 17 outliers among 100 measurements (17.00%) 3 (3.00%) high mild 14 (14.00%) high severe mat4_div_s time: [5.6942 ns 5.6986 ns 5.7034 ns] change: [+91.417% +91.588% +91.762%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild mat2_inv time: [1.8721 ns 1.8725 ns 1.8731 ns] change: [+8.9164% +9.1056% +9.2598%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) high mild 6 (6.00%) high severe mat3_inv time: [5.3171 ns 5.3242 ns 5.3315 ns] change: [+2.0336% +2.1086% +2.1810%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 8 (8.00%) low severe 2 (2.00%) low mild 2 (2.00%) high severe mat4_inv time: [27.564 ns 27.591 ns 27.632 ns] change: [-5.5949% -5.4858% -5.3888%] (p = 0.00 < 0.05) Performance has improved. 
Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high severe mat2_transpose time: [1.2243 ns 1.2248 ns 1.2258 ns] change: [+9.6634% +9.7213% +9.7847%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 3 (3.00%) high mild 5 (5.00%) high severe mat3_transpose time: [2.6247 ns 2.6261 ns 2.6276 ns] change: [+3.2032% +3.2351% +3.2676%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 2 (2.00%) low severe 1 (1.00%) low mild 2 (2.00%) high mild 8 (8.00%) high severe mat4_transpose time: [5.0910 ns 5.0925 ns 5.0938 ns] change: [+3.4672% +3.5265% +3.5802%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) low mild 1 (1.00%) high mild mat_div_scalar time: [621.16 µs 621.39 µs 621.68 µs] change: [-0.9877% -0.8953% -0.8031%] (p = 0.00 < 0.05) Change within noise threshold. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) high mild 7 (7.00%) high severe mat100_add_mat100 time: [1.7506 µs 1.7515 µs 1.7526 µs] change: [+0.7028% +0.7809% +0.8593%] (p = 0.00 < 0.05) Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild mat4_mul_mat4 time: [33.744 ns 33.794 ns 33.898 ns] change: [+17.549% +17.748% +18.055%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low mild 3 (3.00%) high mild 4 (4.00%) high severe mat5_mul_mat5 time: [48.309 ns 48.319 ns 48.331 ns] change: [-44.665% -44.512% -44.362%] (p = 0.00 < 0.05) Performance has improved. Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe mat6_mul_mat6 time: [77.366 ns 77.384 ns 77.405 ns] change: [+0.4793% +0.5202% +0.5684%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 7 outliers among 100 measurements (7.00%) 3 (3.00%) high mild 4 (4.00%) high severe mat7_mul_mat7 time: [83.373 ns 83.398 ns 83.430 ns] change: [-0.3133% -0.1903% -0.0321%] (p = 0.00 < 0.05) Change within noise threshold. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe mat8_mul_mat8 time: [69.337 ns 69.426 ns 69.532 ns] change: [-0.6629% -0.2159% +0.0430%] (p = 0.34 > 0.05) No change in performance detected. Found 11 outliers among 100 measurements (11.00%) 3 (3.00%) low mild 2 (2.00%) high mild 6 (6.00%) high severe mat9_mul_mat9 time: [187.68 ns 187.79 ns 187.89 ns] change: [+0.3227% +0.3859% +0.4503%] (p = 0.00 < 0.05) Change within noise threshold. mat10_mul_mat10 time: [199.87 ns 200.02 ns 200.18 ns] change: [+2.5979% +2.6928% +2.7811%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild mat10_mul_mat10_static time: [123.43 ns 123.48 ns 123.56 ns] change: [+17.238% +17.404% +17.662%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low mild 4 (4.00%) high mild 3 (3.00%) high severe mat100_mul_mat100 time: [39.498 µs 39.600 µs 39.717 µs] change: [+0.5117% +0.6856% +0.8528%] (p = 0.00 < 0.05) Change within noise threshold. mat500_mul_mat500 time: [4.3615 ms 4.3638 ms 4.3667 ms] change: [-1.5042% -1.4307% -1.3514%] (p = 0.00 < 0.05) Performance has improved. Found 13 outliers among 100 measurements (13.00%) 13 (13.00%) high severe iter time: [851.64 µs 851.73 µs 851.84 µs] change: [+10.980% +11.254% +11.506%] (p = 0.00 < 0.05) Performance has regressed. Found 12 outliers among 100 measurements (12.00%) 4 (4.00%) high mild 8 (8.00%) high severe iter_rev time: [212.95 µs 212.98 µs 213.02 µs] change: [+0.2664% +0.5462% +0.7106%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 7 outliers among 100 measurements (7.00%) 7 (7.00%) high severe copy_from time: [141.14 µs 141.31 µs 141.53 µs] change: [-0.8809% -0.5821% -0.3187%] (p = 0.00 < 0.05) Change within noise threshold. Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) high mild 5 (5.00%) high severe axpy time: [17.648 µs 17.693 µs 17.741 µs] change: [-1.2897% -1.0169% -0.7049%] (p = 0.00 < 0.05) Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe tr_mul_to time: [136.20 µs 136.28 µs 136.36 µs] change: [+0.5699% +1.3218% +1.7817%] (p = 0.00 < 0.05) Change within noise threshold. Found 18 outliers among 100 measurements (18.00%) 10 (10.00%) high mild 8 (8.00%) high severe mat_mul_mat time: [39.931 µs 39.970 µs 40.007 µs] change: [+2.4660% +2.5525% +2.6388%] (p = 0.00 < 0.05) Performance has regressed. mat100_from_fn time: [6.8839 µs 6.8872 µs 6.8901 µs] change: [+514.04% +514.93% +515.78%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild mat500_from_fn time: [173.41 µs 173.45 µs 173.49 µs] change: [+500.13% +501.19% +502.54%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 5 (5.00%) low severe 5 (5.00%) low mild 3 (3.00%) high severe vec2_add_v_f32 time: [1.1836 ns 1.1840 ns 1.1843 ns] change: [+273.42% +273.97% +274.44%] (p = 0.00 < 0.05) Performance has regressed. Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low severe 2 (2.00%) low mild 4 (4.00%) high mild 4 (4.00%) high severe vec3_add_v_f32 time: [1.7895 ns 1.7906 ns 1.7918 ns] change: [+304.64% +305.11% +305.60%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild vec4_add_v_f32 time: [1.7916 ns 1.7944 ns 1.7979 ns] change: [+143.09% +143.48% +143.88%] (p = 0.00 < 0.05) Performance has regressed. 
Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe vec2_add_v_f64 time: [1.1722 ns 1.1734 ns 1.1748 ns] change: [+269.25% +270.01% +270.79%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) low mild 1 (1.00%) high mild 2 (2.00%) high severe vec3_add_v_f64 time: [1.9394 ns 1.9423 ns 1.9449 ns] change: [+331.48% +332.20% +332.90%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild vec4_add_v_f64 time: [2.2729 ns 2.2761 ns 2.2795 ns] change: [+252.84% +253.55% +254.30%] (p = 0.00 < 0.05) Performance has regressed. Found 11 outliers among 100 measurements (11.00%) 9 (9.00%) high mild 2 (2.00%) high severe vec2_sub_v time: [1.2017 ns 1.2029 ns 1.2044 ns] change: [+274.58% +275.29% +276.09%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high severe vec3_sub_v time: [1.7818 ns 1.7838 ns 1.7861 ns] change: [+303.94% +304.55% +305.21%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low mild 5 (5.00%) high mild vec4_sub_v time: [1.7921 ns 1.7936 ns 1.7951 ns] change: [+140.58% +141.03% +141.57%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe vec2_mul_s time: [985.48 ps 985.59 ps 985.78 ps] change: [+210.12% +210.25% +210.38%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) high mild 3 (3.00%) high severe vec3_mul_s time: [1.4056 ns 1.4059 ns 1.4062 ns] change: [+217.88% +218.05% +218.20%] (p = 0.00 < 0.05) Performance has regressed. 
Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low mild 1 (1.00%) high mild 6 (6.00%) high severe vec4_mul_s time: [1.5828 ns 1.5839 ns 1.5850 ns] change: [+114.04% +114.28% +114.53%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe vec2_div_s time: [1.4854 ns 1.4860 ns 1.4867 ns] change: [+366.97% +367.15% +367.33%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low severe 2 (2.00%) low mild 1 (1.00%) high mild 3 (3.00%) high severe vec3_div_s time: [1.4821 ns 1.4832 ns 1.4848 ns] change: [+229.77% +230.14% +230.45%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe vec4_div_s time: [1.6161 ns 1.6175 ns 1.6189 ns] change: [+116.74% +117.02% +117.30%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 3 (3.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe vec2_dot_f32 time: [715.01 ps 717.55 ps 721.65 ps] change: [+231.29% +232.12% +233.08%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 5 (5.00%) high mild 1 (1.00%) high severe vec3_dot_f32 time: [7.5379 ns 7.5393 ns 7.5412 ns] change: [+3441.1% +3445.4% +3449.7%] (p = 0.00 < 0.05) Performance has regressed. Found 16 outliers among 100 measurements (16.00%) 4 (4.00%) high mild 12 (12.00%) high severe vec4_dot_f32 time: [1.1914 ns 1.1941 ns 1.1968 ns] change: [+455.67% +456.70% +457.85%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 4 (4.00%) high mild vec2_dot_f64 time: [833.73 ps 834.75 ps 835.77 ps] change: [+286.15% +286.81% +287.48%] (p = 0.00 < 0.05) Performance has regressed. 
Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild vec3_dot_f64 time: [7.5031 ns 7.5143 ns 7.5302 ns] change: [+3387.4% +3390.9% +3395.5%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe vec4_dot_f64 time: [1.2572 ns 1.2582 ns 1.2593 ns] change: [+481.66% +482.79% +483.69%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 4 (4.00%) high mild 4 (4.00%) high severe vec3_cross time: [7.6360 ns 7.6371 ns 7.6383 ns] change: [+1603.2% +1604.4% +1605.3%] (p = 0.00 < 0.05) Performance has regressed. Found 13 outliers among 100 measurements (13.00%) 1 (1.00%) low severe 4 (4.00%) high mild 8 (8.00%) high severe vec2_norm time: [1.0934 ns 1.0936 ns 1.0939 ns] change: [+0.4240% +0.4477% +0.4713%] (p = 0.00 < 0.05) Change within noise threshold. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low severe 1 (1.00%) low mild 4 (4.00%) high mild 1 (1.00%) high severe vec3_norm time: [1.1115 ns 1.1117 ns 1.1120 ns] change: [-1.2543% -1.1951% -1.1429%] (p = 0.00 < 0.05) Performance has improved. Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) low severe 3 (3.00%) low mild 1 (1.00%) high mild 5 (5.00%) high severe vec4_norm time: [1.1119 ns 1.1121 ns 1.1124 ns] change: [-2.7306% -2.5540% -2.4351%] (p = 0.00 < 0.05) Performance has improved. Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) low severe 2 (2.00%) low mild 2 (2.00%) high severe vec2_normalize time: [2.4860 ns 2.4869 ns 2.4880 ns] change: [+0.8836% +0.9584% +1.0264%] (p = 0.00 < 0.05) Change within noise threshold. Found 14 outliers among 100 measurements (14.00%) 5 (5.00%) low severe 2 (2.00%) low mild 1 (1.00%) high mild 6 (6.00%) high severe vec3_normalize time: [2.5838 ns 2.5843 ns 2.5850 ns] change: [+2.4280% +2.5396% +2.6267%] (p = 0.00 < 0.05) Performance has regressed. 
Found 11 outliers among 100 measurements (11.00%) 8 (8.00%) high mild 3 (3.00%) high severe vec4_normalize time: [1.9319 ns 1.9321 ns 1.9323 ns] change: [+3.0547% +3.1098% +3.1644%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 3 (3.00%) low mild 1 (1.00%) high mild 5 (5.00%) high severe vec10000_dot_f64 time: [2.4662 µs 2.4669 µs 2.4677 µs] change: [+103.74% +103.80% +103.89%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) low mild 1 (1.00%) high mild 2 (2.00%) high severe vec10000_dot_f32 time: [1.7355 µs 1.7368 µs 1.7386 µs] change: [+56.145% +56.539% +56.985%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe vec10000_axpy_f64 time: [1.5285 µs 1.5289 µs 1.5293 µs] change: [+1.5407% +1.5954% +1.6519%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low severe 3 (3.00%) low mild 4 (4.00%) high mild 1 (1.00%) high severe vec10000_axpy_beta_f64 time: [1.6062 µs 1.6083 µs 1.6123 µs] change: [-1.2658% -1.1519% -1.0143%] (p = 0.00 < 0.05) Performance has improved. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe vec10000_axpy_f64_slice time: [1.4899 µs 1.4900 µs 1.4902 µs] change: [+1.0994% +1.1302% +1.1582%] (p = 0.00 < 0.05) Performance has regressed. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low severe 2 (2.00%) low mild 3 (3.00%) high mild 2 (2.00%) high severe vec10000_axpy_f64_static time: [1.4381 µs 1.4384 µs 1.4386 µs] change: [-1.4760% -1.3183% -1.2239%] (p = 0.00 < 0.05) Performance has improved. Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low mild 4 (4.00%) high mild 1 (1.00%) high severe vec10000_axpy_f32 time: [758.37 ns 758.47 ns 758.59 ns] change: [+0.7210% +1.1900% +1.4537%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 5 outliers among 100 measurements (5.00%) 4 (4.00%) high mild 1 (1.00%) high severe vec10000_axpy_beta_f32 time: [859.70 ns 859.90 ns 860.17 ns] change: [+7.1477% +7.2858% +7.4170%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe quaternion_add_q time: [1.7877 ns 1.7890 ns 1.7903 ns] change: [+140.68% +140.91% +141.14%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe quaternion_sub_q time: [1.7894 ns 1.7907 ns 1.7920 ns] change: [+140.86% +141.23% +141.61%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 1 (1.00%) high mild 3 (3.00%) high severe quaternion_mul_q time: [3.2688 ns 3.2697 ns 3.2705 ns] change: [+342.95% +343.27% +343.63%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 6 (6.00%) low mild 1 (1.00%) high mild 3 (3.00%) high severe unit_quaternion_mul_v time: [11.541 ns 11.549 ns 11.563 ns] change: [+2500.6% +2504.0% +2506.8%] (p = 0.00 < 0.05) Performance has regressed. Found 24 outliers among 100 measurements (24.00%) 5 (5.00%) low severe 6 (6.00%) low mild 6 (6.00%) high mild 7 (7.00%) high severe quaternion_mul_s time: [1.5707 ns 1.5711 ns 1.5715 ns] change: [+112.20% +112.40% +112.57%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low mild 5 (5.00%) high mild 3 (3.00%) high severe quaternion_div_s time: [1.5778 ns 1.5785 ns 1.5794 ns] change: [+112.39% +112.52% +112.66%] (p = 0.00 < 0.05) Performance has regressed. Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low severe 3 (3.00%) high mild 4 (4.00%) high severe quaternion_inv time: [1.9206 ns 1.9213 ns 1.9220 ns] change: [+5.0362% +5.0895% +5.1421%] (p = 0.00 < 0.05) Performance has regressed. 
Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe unit_quaternion_inv time: [1.3271 ns 1.3285 ns 1.3295 ns] change: [+9.1358% +9.2383% +9.3384%] (p = 0.00 < 0.05) Performance has regressed. bidiagonalize_100x100 time: [265.82 µs 266.42 µs 267.32 µs] change: [-0.4676% -0.2799% -0.0688%] (p = 0.00 < 0.05) Change within noise threshold. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high severe Benchmarking bidiagonalize_100x500: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.8s, enable flat sampling, or reduce sample count to 50. bidiagonalize_100x500 time: [1.9467 ms 1.9530 ms 1.9592 ms] change: [-4.9336% -4.6776% -4.4570%] (p = 0.00 < 0.05) Performance has improved. Found 32 outliers among 100 measurements (32.00%) 8 (8.00%) low mild 2 (2.00%) high mild 22 (22.00%) high severe bidiagonalize_4x4 time: [248.64 ns 248.71 ns 248.81 ns] change: [-5.2386% -5.1754% -5.1180%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 7 (7.00%) high severe Benchmarking bidiagonalize_500x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.3s, enable flat sampling, or reduce sample count to 50. bidiagonalize_500x100 time: [1.6480 ms 1.6490 ms 1.6504 ms] change: [-1.6630% -1.4692% -1.2645%] (p = 0.00 < 0.05) Performance has improved. Found 13 outliers among 100 measurements (13.00%) 4 (4.00%) high mild 9 (9.00%) high severe bidiagonalize_unpack_100x100 time: [523.30 µs 523.38 µs 523.46 µs] change: [-0.2366% -0.1909% -0.1462%] (p = 0.00 < 0.05) Change within noise threshold. Found 10 outliers among 100 measurements (10.00%) 1 (1.00%) low mild 5 (5.00%) high mild 4 (4.00%) high severe bidiagonalize_unpack_100x500 time: [3.0536 ms 3.0596 ms 3.0656 ms] change: [+0.4142% +0.6144% +0.8242%] (p = 0.00 < 0.05) Change within noise threshold. 
bidiagonalize_unpack_500x100 time: [2.6027 ms 2.6039 ms 2.6052 ms] change: [-0.3104% -0.1818% -0.0919%] (p = 0.00 < 0.05) Change within noise threshold. cholesky_100x100 time: [37.281 µs 37.289 µs 37.296 µs] change: [+15.445% +15.524% +15.590%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild cholesky_500x500 time: [4.8081 ms 4.8162 ms 4.8260 ms] change: [+5.9663% +6.2001% +6.4592%] (p = 0.00 < 0.05) Performance has regressed. Found 30 outliers among 100 measurements (30.00%) 19 (19.00%) low severe 11 (11.00%) high severe cholesky_decompose_unpack_100x100 time: [37.755 µs 37.763 µs 37.773 µs] change: [+14.066% +14.305% +14.477%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe cholesky_decompose_unpack_500x500 time: [4.6743 ms 4.6891 ms 4.7052 ms] change: [+2.1368% +2.4522% +2.8032%] (p = 0.00 < 0.05) Performance has regressed. Found 19 outliers among 100 measurements (19.00%) 19 (19.00%) high severe cholesky_solve_10x10 time: [160.86 ns 160.97 ns 161.17 ns] change: [+0.2713% +0.3412% +0.4236%] (p = 0.00 < 0.05) Change within noise threshold. Found 20 outliers among 100 measurements (20.00%) 3 (3.00%) high mild 17 (17.00%) high severe cholesky_solve_100x100 time: [2.7392 µs 2.7399 µs 2.7407 µs] change: [-0.2820% -0.2443% -0.2066%] (p = 0.00 < 0.05) Change within noise threshold. Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) high mild 4 (4.00%) high severe cholesky_solve_500x500 time: [52.883 µs 52.896 µs 52.917 µs] change: [+1.9926% +2.2238% +2.5966%] (p = 0.00 < 0.05) Performance has regressed. Found 7 outliers among 100 measurements (7.00%) 4 (4.00%) high mild 3 (3.00%) high severe cholesky_inverse_10x10 time: [1.3102 µs 1.3110 µs 1.3119 µs] change: [+1.5928% +1.6789% +1.7629%] (p = 0.00 < 0.05) Performance has regressed. 
cholesky_inverse_100x100 time: [276.96 µs 276.98 µs 277.01 µs] change: [+0.3069% +0.3685% +0.4182%] (p = 0.00 < 0.05) Change within noise threshold. Found 10 outliers among 100 measurements (10.00%) 2 (2.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 4 (4.00%) high severe cholesky_inverse_500x500 time: [27.078 ms 27.084 ms 27.090 ms] change: [+2.3068% +2.3369% +2.3710%] (p = 0.00 < 0.05) Performance has regressed. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild full_piv_lu_decompose_10x10 time: [560.68 ns 561.00 ns 561.33 ns] change: [+0.6362% +0.7103% +0.7954%] (p = 0.00 < 0.05) Change within noise threshold. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) low mild 1 (1.00%) high severe full_piv_lu_decompose_100x100 time: [207.09 µs 207.12 µs 207.15 µs] change: [-0.3356% -0.2879% -0.2475%] (p = 0.00 < 0.05) Change within noise threshold. Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low severe 1 (1.00%) low mild 3 (3.00%) high mild 1 (1.00%) high severe full_piv_lu_solve_10x10 time: [117.33 ns 117.39 ns 117.46 ns] change: [-1.0837% -1.0232% -0.9624%] (p = 0.00 < 0.05) Change within noise threshold. Found 15 outliers among 100 measurements (15.00%) 4 (4.00%) high mild 11 (11.00%) high severe full_piv_lu_solve_100x100 time: [2.1694 µs 2.1707 µs 2.1729 µs] change: [-1.7197% -1.6315% -1.5183%] (p = 0.00 < 0.05) Performance has improved. Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe full_piv_lu_inverse_10x10 time: [857.13 ns 857.32 ns 857.52 ns] change: [-0.4396% -0.3489% -0.2379%] (p = 0.00 < 0.05) Change within noise threshold. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe full_piv_lu_inverse_100x100 time: [211.92 µs 212.00 µs 212.10 µs] change: [-2.0475% -1.9749% -1.9135%] (p = 0.00 < 0.05) Performance has improved. 
Found 10 outliers among 100 measurements (10.00%) 5 (5.00%) low mild 2 (2.00%) high mild 3 (3.00%) high severe full_piv_lu_determinant_10x10 time: [3.4777 ns 3.4794 ns 3.4814 ns] change: [+17.827% +17.938% +18.036%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe full_piv_lu_determinant_100x100 time: [38.435 ns 38.454 ns 38.475 ns] change: [+3.5887% +3.6755% +3.7612%] (p = 0.00 < 0.05) Performance has regressed. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe hessenberg_decompose_4x4 time: [114.52 ns 114.54 ns 114.57 ns] change: [-0.6226% -0.5236% -0.4055%] (p = 0.00 < 0.05) Change within noise threshold. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high severe hessenberg_decompose_100x100 time: [289.44 µs 289.48 µs 289.54 µs] change: [-0.1901% -0.1355% -0.0850%] (p = 0.00 < 0.05) Change within noise threshold. Found 5 outliers among 100 measurements (5.00%) 3 (3.00%) high mild 2 (2.00%) high severe hessenberg_decompose_200x200 time: [2.2102 ms 2.2147 ms 2.2212 ms] change: [+0.8072% +1.0155% +1.2723%] (p = 0.00 < 0.05) Change within noise threshold. Found 16 outliers among 100 measurements (16.00%) 12 (12.00%) low severe 4 (4.00%) high severe hessenberg_decompose_unpack_100x100 time: [428.66 µs 428.77 µs 428.91 µs] change: [-0.2769% -0.2390% -0.1885%] (p = 0.00 < 0.05) Change within noise threshold. Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) high mild 9 (9.00%) high severe hessenberg_decompose_unpack_200x200 time: [3.2263 ms 3.2288 ms 3.2314 ms] change: [+0.8029% +1.0059% +1.1576%] (p = 0.00 < 0.05) Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild lu_decompose_10x10 time: [361.75 ns 362.08 ns 362.41 ns] change: [+13.439% +13.613% +13.784%] (p = 0.00 < 0.05) Performance has regressed. 
Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) low severe 1 (1.00%) high mild lu_decompose_100x100 time: [73.649 µs 73.662 µs 73.687 µs] change: [+0.9113% +0.9610% +1.0216%] (p = 0.00 < 0.05) Change within noise threshold. Found 10 outliers among 100 measurements (10.00%) 3 (3.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe lu_solve_10x10 time: [110.70 ns 110.74 ns 110.78 ns] change: [-0.6342% -0.5758% -0.5265%] (p = 0.00 < 0.05) Change within noise threshold. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) low mild 3 (3.00%) high mild 1 (1.00%) high severe lu_solve_100x100 time: [2.1038 µs 2.1047 µs 2.1059 µs] change: [-2.7288% -2.6424% -2.5009%] (p = 0.00 < 0.05) Performance has improved. Found 11 outliers among 100 measurements (11.00%) 2 (2.00%) low mild 6 (6.00%) high mild 3 (3.00%) high severe lu_inverse_10x10 time: [887.44 ns 887.64 ns 887.88 ns] change: [-2.8481% -2.7682% -2.6966%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) high mild 7 (7.00%) high severe lu_inverse_100x100 time: [215.47 µs 215.67 µs 215.90 µs] change: [-1.0637% -0.9343% -0.7643%] (p = 0.00 < 0.05) Change within noise threshold. Found 17 outliers among 100 measurements (17.00%) 8 (8.00%) high mild 9 (9.00%) high severe lu_determinant_10x10 time: [2.5924 ns 2.5970 ns 2.6015 ns] change: [+20.878% +21.192% +21.451%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 5 (5.00%) high mild lu_determinant_100x100 time: [35.934 ns 35.983 ns 36.032 ns] change: [-1.7500% -1.6698% -1.5889%] (p = 0.00 < 0.05) Performance has improved. Found 20 outliers among 100 measurements (20.00%) 1 (1.00%) low severe 2 (2.00%) low mild 2 (2.00%) high mild 15 (15.00%) high severe qr_decompose_100x100 time: [143.15 µs 143.24 µs 143.40 µs] change: [+0.7070% +0.8463% +1.0082%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 10 outliers among 100 measurements (10.00%) 1 (1.00%) low severe 2 (2.00%) low mild 3 (3.00%) high mild 4 (4.00%) high severe Benchmarking qr_decompose_100x500: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.1s, enable flat sampling, or reduce sample count to 60. qr_decompose_100x500 time: [1.0041 ms 1.0064 ms 1.0112 ms] change: [-1.0163% -0.8784% -0.6928%] (p = 0.00 < 0.05) Change within noise threshold. Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) high mild 2 (2.00%) high severe qr_decompose_4x4 time: [125.09 ns 125.10 ns 125.12 ns] change: [-0.2898% -0.0935% +0.1416%] (p = 0.49 > 0.05) No change in performance detected. Found 16 outliers among 100 measurements (16.00%) 1 (1.00%) high mild 15 (15.00%) high severe qr_decompose_500x100 time: [834.17 µs 835.03 µs 836.03 µs] change: [-0.3712% -0.1635% +0.0648%] (p = 0.14 > 0.05) No change in performance detected. Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe qr_decompose_unpack_100x100 time: [283.60 µs 283.90 µs 284.16 µs] change: [-0.3949% -0.2993% -0.1910%] (p = 0.00 < 0.05) Change within noise threshold. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) low mild Benchmarking qr_decompose_unpack_100x500: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.8s, enable flat sampling, or reduce sample count to 60. qr_decompose_unpack_100x500 time: [1.1475 ms 1.1491 ms 1.1521 ms] change: [-0.8690% -0.7318% -0.5277%] (p = 0.00 < 0.05) Change within noise threshold. Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) high mild 4 (4.00%) high severe Benchmarking qr_decompose_unpack_500x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.5s, enable flat sampling, or reduce sample count to 50. 
qr_decompose_unpack_500x100 time: [1.6793 ms 1.6797 ms 1.6801 ms] change: [+2.4782% +2.5491% +2.6214%] (p = 0.00 < 0.05) Performance has regressed. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild qr_solve_10x10 time: [152.57 ns 152.63 ns 152.73 ns] change: [-0.4288% -0.3814% -0.3276%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) low mild 4 (4.00%) high mild 4 (4.00%) high severe qr_solve_100x100 time: [3.3232 µs 3.3254 µs 3.3285 µs] change: [-0.0158% +0.2076% +0.5519%] (p = 0.13 > 0.05) No change in performance detected. Found 13 outliers among 100 measurements (13.00%) 1 (1.00%) low mild 7 (7.00%) high mild 5 (5.00%) high severe qr_inverse_10x10 time: [805.76 ns 806.05 ns 806.44 ns] change: [-0.9942% -0.7502% -0.6015%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low mild 5 (5.00%) high mild 2 (2.00%) high severe qr_inverse_100x100 time: [330.09 µs 330.39 µs 330.67 µs] change: [+0.4902% +0.5806% +0.6779%] (p = 0.00 < 0.05) Change within noise threshold. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild schur_decompose_4x4 time: [926.95 ns 928.24 ns 929.26 ns] change: [-13.166% -13.030% -12.884%] (p = 0.00 < 0.05) Performance has improved. schur_decompose_10x10 time: [7.4409 µs 7.4453 µs 7.4492 µs] change: [+1.5363% +1.6395% +1.7360%] (p = 0.00 < 0.05) Performance has regressed. Found 10 outliers among 100 measurements (10.00%) 8 (8.00%) low severe 1 (1.00%) high mild 1 (1.00%) high severe schur_decompose_100x100 time: [2.6115 ms 2.6172 ms 2.6243 ms] change: [+1.6088% +1.8440% +2.1414%] (p = 0.00 < 0.05) Performance has regressed. Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) high mild 2 (2.00%) high severe schur_decompose_200x200 time: [18.406 ms 18.418 ms 18.432 ms] change: [+0.9610% +1.1237% +1.2824%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 11 outliers among 100 measurements (11.00%) 6 (6.00%) high mild 5 (5.00%) high severe eigenvalues_4x4 time: [852.29 ns 855.47 ns 858.45 ns] change: [-33.645% -33.514% -33.373%] (p = 0.00 < 0.05) Performance has improved. Found 21 outliers among 100 measurements (21.00%) 1 (1.00%) low mild 2 (2.00%) high mild 18 (18.00%) high severe eigenvalues_10x10 time: [5.9802 µs 5.9817 µs 5.9835 µs] change: [+0.4433% +0.5280% +0.5939%] (p = 0.00 < 0.05) Change within noise threshold. Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) high mild 7 (7.00%) high severe Benchmarking eigenvalues_100x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.1s, enable flat sampling, or reduce sample count to 50. eigenvalues_100x100 time: [1.5907 ms 1.5927 ms 1.5955 ms] change: [+0.4506% +0.5711% +0.6951%] (p = 0.00 < 0.05) Change within noise threshold. Found 3 outliers among 100 measurements (3.00%) 2 (2.00%) high mild 1 (1.00%) high severe eigenvalues_200x200 time: [11.141 ms 11.142 ms 11.144 ms] change: [-0.1305% -0.0959% -0.0696%] (p = 0.00 < 0.05) Change within noise threshold. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe solve_l_triangular_100x100 time: [1.0009 µs 1.0023 µs 1.0044 µs] change: [-3.0545% -2.9536% -2.8531%] (p = 0.00 < 0.05) Performance has improved. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high severe solve_l_triangular_1000x1000 time: [101.96 µs 102.10 µs 102.26 µs] change: [+0.2874% +0.4251% +0.5544%] (p = 0.00 < 0.05) Change within noise threshold. Found 20 outliers among 100 measurements (20.00%) 1 (1.00%) low severe 3 (3.00%) high mild 16 (16.00%) high severe tr_solve_l_triangular_100x100 time: [1.7602 µs 1.7606 µs 1.7611 µs] change: [+0.1457% +0.2378% +0.3238%] (p = 0.00 < 0.05) Change within noise threshold. 
Found 8 outliers among 100 measurements (8.00%) 3 (3.00%) high mild 5 (5.00%) high severe tr_solve_l_triangular_1000x1000 time: [95.588 µs 95.613 µs 95.638 µs] change: [+0.6009% +0.8491% +1.0883%] (p = 0.00 < 0.05) Change within noise threshold. Found 6 outliers among 100 measurements (6.00%) 5 (5.00%) high mild 1 (1.00%) high severe solve_u_triangular_100x100 time: [1.1486 µs 1.1486 µs 1.1487 µs] change: [-1.2041% -1.1651% -1.1380%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 2 (2.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe solve_u_triangular_1000x1000 time: [96.805 µs 96.827 µs 96.850 µs] change: [-2.4840% -2.4423% -2.4081%] (p = 0.00 < 0.05) Performance has improved. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low mild 3 (3.00%) high mild 3 (3.00%) high severe tr_solve_u_triangular_100x100 time: [1.1943 µs 1.1947 µs 1.1951 µs] change: [-0.9912% -0.6228% -0.3160%] (p = 0.00 < 0.05) Change within noise threshold. Found 10 outliers among 100 measurements (10.00%) 1 (1.00%) low severe 3 (3.00%) low mild 4 (4.00%) high mild 2 (2.00%) high severe tr_solve_u_triangular_1000x1000 time: [86.848 µs 86.858 µs 86.868 µs] change: [-1.4656% -1.4306% -1.4030%] (p = 0.00 < 0.05) Performance has improved. Found 6 outliers among 100 measurements (6.00%) 1 (1.00%) low mild 4 (4.00%) high mild 1 (1.00%) high severe svd_decompose_2x2 time: [24.714 ns 24.731 ns 24.757 ns] change: [+16.359% +16.416% +16.476%] (p = 0.00 < 0.05) Performance has regressed. Found 5 outliers among 100 measurements (5.00%) 2 (2.00%) low mild 2 (2.00%) high mild 1 (1.00%) high severe svd_decompose_3x3 time: [356.14 ns 356.26 ns 356.41 ns] change: [+7.1080% +7.1682% +7.2242%] (p = 0.00 < 0.05) Performance has regressed. 
Found 5 outliers among 100 measurements (5.00%) 1 (1.00%) high mild 4 (4.00%) high severe svd_decompose_4x4 time: [973.20 ns 973.36 ns 973.52 ns] change: [-0.1503% -0.1038% -0.0509%] (p = 0.00 < 0.05) Change within noise threshold. Found 7 outliers among 100 measurements (7.00%) 1 (1.00%) low mild 2 (2.00%) high mild 4 (4.00%) high severe svd_decompose_10x10 time: [5.7955 µs 5.7969 µs 5.7982 µs] change: [-1.2212% -1.1496% -1.0648%] (p = 0.00 < 0.05) Performance has improved. Found 11 outliers among 100 measurements (11.00%) 3 (3.00%) high mild 8 (8.00%) high severe Benchmarking svd_decompose_100x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.8s, enable flat sampling, or reduce sample count to 50. svd_decompose_100x100 time: [1.5457 ms 1.5461 ms 1.5466 ms] change: [-1.1800% -1.1312% -1.0869%] (p = 0.00 < 0.05) Performance has improved. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild svd_decompose_200x200 time: [11.688 ms 11.698 ms 11.708 ms] change: [-1.5525% -1.4404% -1.3260%] (p = 0.00 < 0.05) Performance has improved. rank_4x4 time: [687.08 ns 687.31 ns 687.60 ns] change: [-4.3382% -4.2619% -4.1730%] (p = 0.00 < 0.05) Performance has improved. Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low mild 4 (4.00%) high mild 7 (7.00%) high severe rank_10x10 time: [4.2263 µs 4.2314 µs 4.2359 µs] change: [-0.0796% +0.0376% +0.1498%] (p = 0.53 > 0.05) No change in performance detected. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild rank_100x100 time: [524.07 µs 525.08 µs 526.09 µs] change: [+0.2055% +0.4232% +0.6185%] (p = 0.00 < 0.05) Change within noise threshold. rank_200x200 time: [3.0034 ms 3.0049 ms 3.0066 ms] change: [-0.0776% -0.0258% +0.0284%] (p = 0.40 > 0.05) No change in performance detected. 
Found 15 outliers among 100 measurements (15.00%) 7 (7.00%) low severe 6 (6.00%) low mild 2 (2.00%) high severe singular_values_4x4 time: [711.28 ns 711.46 ns 711.69 ns] change: [-10.996% -10.943% -10.895%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) high mild 7 (7.00%) high severe singular_values_10x10 time: [4.3082 µs 4.3088 µs 4.3098 µs] change: [+0.2477% +0.2828% +0.3239%] (p = 0.00 < 0.05) Change within noise threshold. Found 13 outliers among 100 measurements (13.00%) 2 (2.00%) low mild 5 (5.00%) high mild 6 (6.00%) high severe singular_values_100x100 time: [520.96 µs 521.13 µs 521.29 µs] change: [-1.3767% -1.2379% -1.0940%] (p = 0.00 < 0.05) Performance has improved. Found 6 outliers among 100 measurements (6.00%) 2 (2.00%) low mild 3 (3.00%) high mild 1 (1.00%) high severe singular_values_200x200 time: [3.0055 ms 3.0063 ms 3.0075 ms] change: [-0.0668% -0.0002% +0.0545%] (p = 0.99 > 0.05) No change in performance detected. Found 7 outliers among 100 measurements (7.00%) 3 (3.00%) low mild 2 (2.00%) high mild 2 (2.00%) high severe pseudo_inverse_4x4 time: [767.27 ns 767.63 ns 768.12 ns] change: [-20.375% -20.322% -20.267%] (p = 0.00 < 0.05) Performance has improved. Found 14 outliers among 100 measurements (14.00%) 5 (5.00%) high mild 9 (9.00%) high severe pseudo_inverse_10x10 time: [6.1395 µs 6.1415 µs 6.1440 µs] change: [+2.0662% +2.1284% +2.1866%] (p = 0.00 < 0.05) Performance has regressed. Found 6 outliers among 100 measurements (6.00%) 3 (3.00%) low mild 1 (1.00%) high mild 2 (2.00%) high severe Benchmarking pseudo_inverse_100x100: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 8.1s, enable flat sampling, or reduce sample count to 50. pseudo_inverse_100x100 time: [1.6018 ms 1.6035 ms 1.6050 ms] change: [-0.0386% +0.0840% +0.2218%] (p = 0.21 > 0.05) No change in performance detected. 
Found 23 outliers among 100 measurements (23.00%) 13 (13.00%) low severe 3 (3.00%) high mild 7 (7.00%) high severe pseudo_inverse_200x200 time: [11.989 ms 11.997 ms 12.006 ms] change: [-0.7602% -0.5368% -0.3351%] (p = 0.00 < 0.05) Change within noise threshold. symmetric_eigen_decompose_4x4 time: [453.98 ns 454.22 ns 454.53 ns] change: [-10.475% -10.350% -10.207%] (p = 0.00 < 0.05) Performance has improved. Found 18 outliers among 100 measurements (18.00%) 1 (1.00%) low severe 1 (1.00%) low mild 5 (5.00%) high mild 11 (11.00%) high severe symmetric_eigen_decompose_10x10 time: [3.6767 µs 3.6782 µs 3.6800 µs] change: [-1.7404% -1.6869% -1.6361%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low mild 1 (1.00%) high mild 6 (6.00%) high severe symmetric_eigen_decompose_100x100 time: [767.36 µs 768.36 µs 769.56 µs] change: [-7.0024% -6.9023% -6.7865%] (p = 0.00 < 0.05) Performance has improved. Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high severe symmetric_eigen_decompose_200x200 time: [5.2143 ms 5.2218 ms 5.2350 ms] change: [-9.1662% -8.8429% -8.5229%] (p = 0.00 < 0.05) Performance has improved. Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) low severe 1 (1.00%) low mild 6 (6.00%) high severe Benchmarked on AMD Ryzen 9 5950X with rustc 1.89.0 (29483883e 2025-08-04) (Fedora 1.89.0-2.fc42) binary: rustc commit-hash: 29483883eed69d5fb4db01964cdf2af4d86e9cb2 commit-date: 2025-08-04 host: x86_64-unknown-linux-gnu release: 1.89.0 LLVM version: 20.1.8
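Each `change` figure above is relative to the previous run, so the earlier estimate can be approximated from the reported time and percentage. The sketch below assumes the percentage is a plain relative difference of the two point estimates; criterion derives it from bootstrapped statistics, so the result is only approximate. `baseline_from_change` is a hypothetical helper written for this illustration.

```rust
/// Approximate the pre-change time from the post-change time and the
/// reported relative change in percent (hypothetical helper, not part
/// of criterion or nalgebra).
fn baseline_from_change(after: f64, change_pct: f64) -> f64 {
    // after = before * (1 + change / 100), solved for `before`.
    after / (1.0 + change_pct / 100.0)
}

/// Middle estimates copied from the results above:
/// (time after in ns, change in percent).
fn sample_rows() -> [(&'static str, f64, f64); 2] {
    [
        ("lu_determinant_10x10", 2.5970, 21.192),
        ("eigenvalues_4x4", 855.47, -33.514),
    ]
}

/// Format one line per sample row with the approximate earlier time.
fn describe_rows() -> Vec<String> {
    sample_rows()
        .iter()
        .map(|(name, after, pct)| {
            let before = baseline_from_change(*after, *pct);
            format!("{name}: about {before:.2} ns before, {after:.2} ns after")
        })
        .collect()
}
```

Under that assumption, `lu_determinant_10x10` went from roughly 2.14 ns to 2.60 ns, and `eigenvalues_4x4` from roughly 1287 ns to 855 ns.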
1 parent 955b0ca commit 9d1c4ef

13 files changed, +184 -158 lines changed

benches/common/macros.rs

Lines changed: 12 additions & 4 deletions
@@ -4,12 +4,14 @@ macro_rules! bench_binop(
     ($name: ident, $t1: ty, $t2: ty, $binop: ident) => {
         fn $name(bh: &mut criterion::Criterion) {
             use rand::SeedableRng;
+            use std::hint::black_box;
+
             let mut rng = IsaacRng::seed_from_u64(0);
             let a = rng.random::<$t1>();
             let b = rng.random::<$t2>();

             bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                a.$binop(b)
+                black_box(&a).$binop(black_box(b))
             }));
         }
     }
@@ -19,12 +21,14 @@ macro_rules! bench_binop_ref(
     ($name: ident, $t1: ty, $t2: ty, $binop: ident) => {
         fn $name(bh: &mut criterion::Criterion) {
             use rand::SeedableRng;
+            use std::hint::black_box;
+
             let mut rng = IsaacRng::seed_from_u64(0);
             let a = rng.random::<$t1>();
             let b = rng.random::<$t2>();

             bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                a.$binop(&b)
+                black_box(&a).$binop(black_box(&b))
             }));
         }
     }
@@ -34,12 +38,14 @@ macro_rules! bench_binop_fn(
     ($name: ident, $t1: ty, $t2: ty, $binop: path) => {
         fn $name(bh: &mut criterion::Criterion) {
             use rand::SeedableRng;
+            use std::hint::black_box;
+
             let mut rng = IsaacRng::seed_from_u64(0);
             let a = rng.random::<$t1>();
             let b = rng.random::<$t2>();

             bh.bench_function(stringify!($name), move |bh| bh.iter(|| {
-                $binop(&a, &b)
+                $binop(black_box(&a), black_box(&b))
             }));
         }
     }
@@ -51,6 +57,8 @@ macro_rules! bench_unop(
             const LEN: usize = 1 << 13;

             use rand::SeedableRng;
+            use std::hint::black_box;
+
             let mut rng = IsaacRng::seed_from_u64(0);

             let mut elems: Vec<$t> = (0usize .. LEN).map(|_| rng.random::<$t>()).collect();
@@ -60,7 +68,7 @@ macro_rules! bench_unop(
                 i = (i + 1) & (LEN - 1);

                 unsafe {
-                    std::hint::black_box(elems.get_unchecked_mut(i).$unop())
+                    black_box(elems.get_unchecked_mut(i)).$unop()
                 }
             }));
         }
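
As a rough illustration of why the macros now wrap their inputs, here is a minimal standalone sketch that uses only the standard library. `mul2x2` and `compare_timings` are hypothetical names invented for this example and are not part of nalgebra or its benchmarks; the real suite runs under criterion rather than a hand-written timing loop.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the operation under test: a 2x2 matrix product.
fn mul2x2(a: &[[f64; 2]; 2], b: &[[f64; 2]; 2]) -> [[f64; 2]; 2] {
    let mut out = [[0.0; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            out[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j];
        }
    }
    out
}

/// Time `iters` products twice: once with inputs the optimizer can see,
/// once with inputs hidden behind `black_box`.
/// Returns (visible_inputs, opaque_inputs).
fn compare_timings(iters: u32) -> (Duration, Duration) {
    let a = [[1.0, 2.0], [3.0, 4.0]];
    let b = [[5.0, 6.0], [7.0, 8.0]];

    // Only the result is black-boxed. Because `a` and `b` are known
    // constants, an optimizing build may fold or hoist the product.
    let start = Instant::now();
    for _ in 0..iters {
        black_box(mul2x2(&a, &b));
    }
    let visible = start.elapsed();

    // Inputs are black-boxed as well, so the product has to be
    // recomputed from values the optimizer cannot assume.
    let start = Instant::now();
    for _ in 0..iters {
        black_box(mul2x2(black_box(&a), black_box(&b)));
    }
    let opaque = start.elapsed();

    (visible, opaque)
}
```

Calling `compare_timings(1_000_000)` from a `main` in a release build may show the second loop taking longer. Whether it does, and by how much, depends on the compiler version and optimization level, so treat any numbers from this sketch as indicative only.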

benches/core/matrix.rs

Lines changed: 57 additions & 22 deletions
@@ -1,6 +1,7 @@
 use na::{DMatrix, DVector, Matrix2, Matrix3, Matrix4, OMatrix, U10, Vector2, Vector3, Vector4};
 use rand::Rng;
 use rand_isaac::IsaacRng;
+use std::hint::black_box;
 use std::ops::{Add, Div, Mul, Sub};

 #[path = "../common/macros.rs"]
@@ -52,7 +53,9 @@ fn mat_div_scalar(b: &mut criterion::Criterion) {

     b.bench_function("mat_div_scalar", move |bh| {
         bh.iter(|| {
-            let mut aa = a.clone();
+            let mut aa = black_box(a.clone());
+            let n = black_box(n);
+
             let mut b = aa.view_mut((0, 0), (1000, 1000));
             b /= n
         })
@@ -63,86 +66,108 @@ fn mat100_add_mat100(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(100, 100);
     let b = DMatrix::<f64>::new_random(100, 100);

-    bench.bench_function("mat100_add_mat100", move |bh| bh.iter(|| &a + &b));
+    bench.bench_function("mat100_add_mat100", move |bh| {
+        bh.iter(|| black_box(&a) + black_box(&b))
+    });
 }

 fn mat4_mul_mat4(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(4, 4);
     let b = DMatrix::<f64>::new_random(4, 4);

-    bench.bench_function("mat4_mul_mat4", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat4_mul_mat4", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat5_mul_mat5(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(5, 5);
     let b = DMatrix::<f64>::new_random(5, 5);

-    bench.bench_function("mat5_mul_mat5", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat5_mul_mat5", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat6_mul_mat6(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(6, 6);
     let b = DMatrix::<f64>::new_random(6, 6);

-    bench.bench_function("mat6_mul_mat6", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat6_mul_mat6", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat7_mul_mat7(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(7, 7);
     let b = DMatrix::<f64>::new_random(7, 7);

-    bench.bench_function("mat7_mul_mat7", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat7_mul_mat7", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat8_mul_mat8(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(8, 8);
     let b = DMatrix::<f64>::new_random(8, 8);

-    bench.bench_function("mat8_mul_mat8", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat8_mul_mat8", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat9_mul_mat9(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(9, 9);
     let b = DMatrix::<f64>::new_random(9, 9);

-    bench.bench_function("mat9_mul_mat9", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat9_mul_mat9", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat10_mul_mat10(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(10, 10);
     let b = DMatrix::<f64>::new_random(10, 10);

-    bench.bench_function("mat10_mul_mat10", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat10_mul_mat10", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat10_mul_mat10_static(bench: &mut criterion::Criterion) {
     let a = OMatrix::<f64, U10, U10>::new_random();
     let b = OMatrix::<f64, U10, U10>::new_random();

-    bench.bench_function("mat10_mul_mat10_static", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat10_mul_mat10_static", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat100_mul_mat100(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(100, 100);
     let b = DMatrix::<f64>::new_random(100, 100);

-    bench.bench_function("mat100_mul_mat100", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat100_mul_mat100", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn mat500_mul_mat500(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::from_element(500, 500, 5f64);
     let b = DMatrix::<f64>::from_element(500, 500, 6f64);

-    bench.bench_function("mat500_mul_mat500", move |bh| bh.iter(|| &a * &b));
+    bench.bench_function("mat500_mul_mat500", move |bh| {
+        bh.iter(|| black_box(&a) * black_box(&b))
+    });
 }

 fn iter(bench: &mut criterion::Criterion) {
     let a = DMatrix::<f64>::new_random(1000, 1000);

     bench.bench_function("iter", move |bh| {
         bh.iter(|| {
-            for value in a.iter() {
-                std::hint::black_box(value);
+            for value in black_box(&a).iter() {
+                black_box(value);
             }
         })
     });
@@ -153,8 +178,8 @@ fn iter_rev(bench: &mut criterion::Criterion) {

     bench.bench_function("iter_rev", move |bh| {
         bh.iter(|| {
-            for value in a.iter().rev() {
-                std::hint::black_box(value);
+            for value in black_box(&a).iter().rev() {
+                black_box(value);
             }
         })
     });
@@ -166,7 +191,7 @@ fn copy_from(bench: &mut criterion::Criterion) {

     bench.bench_function("copy_from", move |bh| {
         bh.iter(|| {
-            b.copy_from(&a);
+            black_box(&mut b).copy_from(black_box(&a));
         })
     });
 }
@@ -178,7 +203,7 @@ fn axpy(bench: &mut criterion::Criterion) {

     bench.bench_function("axpy", move |bh| {
         bh.iter(|| {
-            y.axpy(a, &x, 1.0);
+            black_box(&mut y).axpy(black_box(a), black_box(&x), 1.0);
         })
     });
 }
@@ -188,7 +213,9 @@ fn tr_mul_to(bench: &mut criterion::Criterion) {
     let b = DVector::<f64>::new_random(1000);
     let mut c = DVector::from_element(1000, 0.0);

-    bench.bench_function("tr_mul_to", move |bh| bh.iter(|| a.tr_mul_to(&b, &mut c)));
+    bench.bench_function("tr_mul_to", move |bh| {
+        bh.iter(|| black_box(&a).tr_mul_to(black_box(&b), black_box(&mut c)))
+    });
 }

 fn mat_mul_mat(bench: &mut criterion::Criterion) {
@@ -198,20 +225,28 @@ fn mat_mul_mat(bench: &mut criterion::Criterion) {

     bench.bench_function("mat_mul_mat", move |bh| {
         bh.iter(|| {
-            std::hint::black_box(a.mul_to(&b, &mut ab));
+            std::hint::black_box(black_box(&a).mul_to(black_box(&b), black_box(&mut ab)));
         })
     });
 }

 fn mat100_from_fn(bench: &mut criterion::Criterion) {
     bench.bench_function("mat100_from_fn", move |bh| {
-        bh.iter(|| DMatrix::from_fn(100, 100, |a, b| a + b))
+        bh.iter(|| {
+            DMatrix::from_fn(black_box(100), black_box(100), |a, b| {
+                black_box(a) + black_box(b)
+            })
+        })
     });
 }

 fn mat500_from_fn(bench: &mut criterion::Criterion) {
     bench.bench_function("mat500_from_fn", move |bh| {
-        bh.iter(|| DMatrix::from_fn(500, 500, |a, b| a + b))
+        bh.iter(|| {
+            DMatrix::from_fn(black_box(500), black_box(500), |a, b| {
+                black_box(a) + black_box(b)
+            })
+        })
     });
 }
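
The in-place benchmarks in this file (`copy_from`, `axpy`, `tr_mul_to`, `mat_mul_mat`) pass the output buffer through `black_box` as well as the inputs. A minimal standard-library sketch of that pattern follows; `axpy_slice` and `run_axpy` are hypothetical stand-ins written for this illustration and are not nalgebra functions.

```rust
use std::hint::black_box;

/// Hypothetical stand-in for an in-place update `y = a * x + b * y`.
fn axpy_slice(y: &mut [f64], a: f64, x: &[f64], b: f64) {
    for (yi, xi) in y.iter_mut().zip(x) {
        *yi = a * *xi + b * *yi;
    }
}

/// Run the update `iters` times on small fixed buffers and return `y`.
fn run_axpy(iters: usize) -> Vec<f64> {
    let x = vec![1.0; 8];
    let mut y = vec![0.0; 8];

    for _ in 0..iters {
        // Output buffer, scalar, and input all go through `black_box`,
        // so the optimizer cannot assume anything about their contents.
        axpy_slice(
            black_box(y.as_mut_slice()),
            black_box(2.0),
            black_box(x.as_slice()),
            1.0,
        );
    }
    y
}
```

Because the update is in place, `y` accumulates across iterations (here each element grows by 2 per call), just as the output vector does across iterations of the in-place benchmarks in the diff.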

benches/core/vector.rs

Lines changed: 9 additions & 8 deletions
@@ -1,6 +1,7 @@
 use na::{DVector, SVector, Vector2, Vector3, Vector4};
 use rand::Rng;
 use rand_isaac::IsaacRng;
+use std::hint::black_box;
 use std::ops::{Add, Div, Mul, Sub};

 #[path = "../common/macros.rs"]
@@ -55,7 +56,7 @@ fn vec10000_axpy_f64(bh: &mut criterion::Criterion) {
     let n = rng.random::<f64>();

     bh.bench_function("vec10000_axpy_f64", move |bh| {
-        bh.iter(|| a.axpy(n, &b, 1.0))
+        bh.iter(|| black_box(&mut a).axpy(black_box(n), black_box(&b), 1.0))
     });
 }

@@ -68,7 +69,7 @@ fn vec10000_axpy_beta_f64(bh: &mut criterion::Criterion) {
     let beta = rng.random::<f64>();

     bh.bench_function("vec10000_axpy_beta_f64", move |bh| {
-        bh.iter(|| a.axpy(n, &b, beta))
+        bh.iter(|| black_box(&mut a).axpy(black_box(n), black_box(&b), black_box(beta)))
     });
 }

@@ -81,10 +82,10 @@ fn vec10000_axpy_f64_slice(bh: &mut criterion::Criterion) {

     bh.bench_function("vec10000_axpy_f64_slice", move |bh| {
         bh.iter(|| {
-            let mut a = a.fixed_rows_mut::<10000>(0);
-            let b = b.fixed_rows::<10000>(0);
+            let mut a = black_box(&mut a).fixed_rows_mut::<10000>(0);
+            let b = black_box(&b).fixed_rows::<10000>(0);

-            a.axpy(n, &b, 1.0)
+            a.axpy(black_box(n), &b, 1.0)
         })
     });
 }
@@ -98,7 +99,7 @@ fn vec10000_axpy_f64_static(bh: &mut criterion::Criterion) {

     // NOTE: for some reasons, it is much faster if the argument are boxed (Box::new(OVector...)).
     bh.bench_function("vec10000_axpy_f64_static", move |bh| {
-        bh.iter(|| a.axpy(n, &b, 1.0))
+        bh.iter(|| black_box(&mut a).axpy(black_box(n), black_box(&b), 1.0))
     });
 }

@@ -110,7 +111,7 @@ fn vec10000_axpy_f32(bh: &mut criterion::Criterion) {
     let n = rng.random::<f32>();

     bh.bench_function("vec10000_axpy_f32", move |bh| {
-        bh.iter(|| a.axpy(n, &b, 1.0))
+        bh.iter(|| black_box(&mut a).axpy(black_box(n), black_box(&b), 1.0))
     });
 }

@@ -123,7 +124,7 @@ fn vec10000_axpy_beta_f32(bh: &mut criterion::Criterion) {
     let beta = rng.random::<f32>();

     bh.bench_function("vec10000_axpy_beta_f32", move |bh| {
-        bh.iter(|| a.axpy(n, &b, beta))
+        bh.iter(|| black_box(&mut a).axpy(black_box(n), black_box(&b), black_box(beta)))
     });
 }
