
feat: option to skip fused kernels #128


Open
wants to merge 3 commits into base: master

Conversation

MilesCranmer
Member

Contributor

Benchmark Results (Julia v1)

Time benchmarks
| Benchmark | master | 12a851b... | master / 12a851b... |
|---|---|---|---|
| eval/ComplexF32/evaluation | 7.28 ± 0.57 ms | 7.27 ± 0.57 ms | 1 ± 0.11 |
| eval/ComplexF64/evaluation | 10.8 ± 0.84 ms | 10.9 ± 0.75 ms | 0.989 ± 0.1 |
| eval/Float32/derivative | 11.5 ± 0.59 ms | 11.5 ± 0.61 ms | 1 ± 0.074 |
| eval/Float32/derivative_turbo | 11.5 ± 0.58 ms | 11.4 ± 0.5 ms | 1.01 ± 0.068 |
| eval/Float32/evaluation | 2.73 ± 0.28 ms | 2.77 ± 0.29 ms | 0.988 ± 0.14 |
| eval/Float32/evaluation_bumper | 0.59 ± 0.017 ms | 0.603 ± 0.017 ms | 0.979 ± 0.039 |
| eval/Float32/evaluation_turbo | 0.506 ± 0.034 ms | 0.516 ± 0.033 ms | 0.982 ± 0.09 |
| eval/Float32/evaluation_turbo_bumper | 0.59 ± 0.016 ms | 0.601 ± 0.017 ms | 0.981 ± 0.038 |
| eval/Float64/derivative | 15 ± 1.4 ms | 15.6 ± 1.5 ms | 0.964 ± 0.13 |
| eval/Float64/derivative_turbo | 15 ± 1.3 ms | 15.3 ± 1.4 ms | 0.978 ± 0.12 |
| eval/Float64/evaluation | 3.09 ± 0.32 ms | 3.11 ± 0.35 ms | 0.994 ± 0.15 |
| eval/Float64/evaluation_bumper | 1.24 ± 0.045 ms | 1.26 ± 0.044 ms | 0.987 ± 0.05 |
| eval/Float64/evaluation_turbo | 0.988 ± 0.076 ms | 1.01 ± 0.074 ms | 0.98 ± 0.1 |
| eval/Float64/evaluation_turbo_bumper | 1.24 ± 0.044 ms | 1.26 ± 0.046 ms | 0.985 ± 0.05 |
| utils/combine_operators/break_sharing | 0.0391 ± 0.00053 ms | 0.0395 ± 0.00069 ms | 0.988 ± 0.022 |
| utils/convert/break_sharing | 26.2 ± 1.3 μs | 26.9 ± 2 μs | 0.973 ± 0.089 |
| utils/convert/preserve_sharing | 0.0986 ± 0.0058 ms | 0.0984 ± 0.0059 ms | 1 ± 0.084 |
| utils/copy/break_sharing | 27.4 ± 1.5 μs | 28 ± 1.9 μs | 0.978 ± 0.084 |
| utils/copy/preserve_sharing | 0.0981 ± 0.0054 ms | 0.0988 ± 0.0056 ms | 0.993 ± 0.078 |
| utils/count_constant_nodes/break_sharing | 12.2 ± 0.4 μs | 12.3 ± 0.7 μs | 0.993 ± 0.065 |
| utils/count_constant_nodes/preserve_sharing | 0.0842 ± 0.0045 ms | 0.0863 ± 0.0053 ms | 0.975 ± 0.08 |
| utils/count_depth/break_sharing | 13.1 ± 0.43 μs | 12.6 ± 0.47 μs | 1.04 ± 0.052 |
| utils/count_nodes/break_sharing | 11.5 ± 0.4 μs | 11.6 ± 0.5 μs | 0.996 ± 0.055 |
| utils/count_nodes/preserve_sharing | 0.0851 ± 0.0046 ms | 0.0846 ± 0.0049 ms | 1.01 ± 0.08 |
| utils/get_set_constants!/break_sharing | 0.0341 ± 0.0015 ms | 0.0334 ± 0.0022 ms | 1.02 ± 0.08 |
| utils/get_set_constants!/preserve_sharing | 0.178 ± 0.0092 ms | 0.177 ± 0.0095 ms | 1.01 ± 0.075 |
| utils/get_set_constants_parametric | 0.0444 ± 0.0021 ms | 0.0455 ± 0.0025 ms | 0.974 ± 0.07 |
| utils/has_constants/break_sharing | 6.74 ± 0.58 μs | 6.71 ± 0.56 μs | 1 ± 0.12 |
| utils/has_operators/break_sharing | 2.52 ± 0.19 μs | 2.54 ± 0.21 μs | 0.996 ± 0.11 |
| utils/hash/break_sharing | 23.1 ± 0.79 μs | 23.4 ± 0.65 μs | 0.989 ± 0.044 |
| utils/hash/preserve_sharing | 0.0978 ± 0.0047 ms | 0.0967 ± 0.0049 ms | 1.01 ± 0.071 |
| utils/index_constant_nodes/break_sharing | 24.7 ± 0.76 μs | 24.8 ± 1.1 μs | 0.998 ± 0.054 |
| utils/index_constant_nodes/preserve_sharing | 0.0987 ± 0.0052 ms | 0.0974 ± 0.0056 ms | 1.01 ± 0.079 |
| utils/is_constant/break_sharing | 7.15 ± 0.52 μs | 7.34 ± 0.53 μs | 0.974 ± 0.1 |
| utils/simplify_tree/break_sharing | 0.162 ± 0.0029 ms | 0.177 ± 0.0039 ms | 0.916 ± 0.026 |
| utils/simplify_tree/preserve_sharing | 0.221 ± 0.0091 ms | 0.244 ± 0.01 ms | 0.908 ± 0.053 |
| utils/string_tree/break_sharing | 0.478 ± 0.026 ms | 0.484 ± 0.024 ms | 0.987 ± 0.074 |
| utils/string_tree/preserve_sharing | 0.591 ± 0.029 ms | 0.593 ± 0.028 ms | 0.996 ± 0.068 |
| time_to_load | 0.215 ± 0.0073 s | 0.215 ± 0.0021 s | 1 ± 0.035 |
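
The last column of the time benchmarks appears to be the master time divided by the 12a851b... time, with an uncertainty consistent with first-order error propagation for a quotient. This is an inference from the displayed numbers, not something the benchmark comment states. A minimal Python sketch, using the `utils/simplify_tree/break_sharing` row (the helper name `ratio_with_uncertainty` is hypothetical):

```python
import math


def ratio_with_uncertainty(a, sigma_a, b, sigma_b):
    """Return a / b and its uncertainty, assuming independent errors.

    Uses first-order propagation: the relative error of a quotient is the
    quadrature sum of the relative errors of numerator and denominator.
    """
    ratio = a / b
    relative_error = math.hypot(sigma_a / a, sigma_b / b)
    return ratio, ratio * relative_error


# Rounded values copied from the utils/simplify_tree/break_sharing row (ms).
master_time, master_err = 0.162, 0.0029
pr_time, pr_err = 0.177, 0.0039

ratio, ratio_err = ratio_with_uncertainty(master_time, master_err, pr_time, pr_err)
print(f"{ratio:.3f} ± {ratio_err:.3f}")
```

With the rounded values shown in the table this gives roughly 0.915 ± 0.026, against the reported 0.916 ± 0.026. The small difference in the ratio is presumably because the table was computed from unrounded timings. Under this reading, a ratio below 1 means the 12a851b... build took longer than master on that benchmark.
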
Memory benchmarks
| Benchmark | master | 12a851b... | master / 12a851b... |
|---|---|---|---|
| eval/ComplexF32/evaluation | 0.963 k allocs: 2.46 MB | 0.978 k allocs: 2.5 MB | 0.985 |
| eval/ComplexF64/evaluation | 0.996 k allocs: 5.07 MB | 0.996 k allocs: 5.07 MB | 1 |
| eval/Float32/derivative | 4.65 k allocs: 17.5 MB | 4.6 k allocs: 17.3 MB | 1.01 |
| eval/Float32/derivative_turbo | 4.62 k allocs: 17.4 MB | 4.65 k allocs: 17.5 MB | 0.993 |
| eval/Float32/evaluation | 0.984 k allocs: 1.28 MB | 0.984 k allocs: 1.28 MB | 1 |
| eval/Float32/evaluation_bumper | 0.303 k allocs: 0.393 MB | 0.303 k allocs: 0.393 MB | 1 |
| eval/Float32/evaluation_turbo | 0.96 k allocs: 1.25 MB | 0.93 k allocs: 1.21 MB | 1.03 |
| eval/Float32/evaluation_turbo_bumper | 0.303 k allocs: 0.393 MB | 0.303 k allocs: 0.393 MB | 1 |
| eval/Float64/derivative | 4.77 k allocs: 0.0349 GB | 4.78 k allocs: 0.035 GB | 0.997 |
| eval/Float64/derivative_turbo | 4.81 k allocs: 0.0352 GB | 4.8 k allocs: 0.0351 GB | 1 |
| eval/Float64/evaluation | 1 k allocs: 2.57 MB | 1.01 k allocs: 2.6 MB | 0.988 |
| eval/Float64/evaluation_bumper | 0.303 k allocs: 0.771 MB | 0.303 k allocs: 0.771 MB | 1 |
| eval/Float64/evaluation_turbo | 0.996 k allocs: 2.55 MB | 0.999 k allocs: 2.56 MB | 0.997 |
| eval/Float64/evaluation_turbo_bumper | 0.303 k allocs: 0.771 MB | 0.303 k allocs: 0.771 MB | 1 |
| utils/combine_operators/break_sharing | 4 allocs: 0.953 kB | 4 allocs: 0.953 kB | 1 |
| utils/convert/break_sharing | 2 k allocs: 0.0924 MB | 2 k allocs: 0.0924 MB | 1 |
| utils/convert/preserve_sharing | 2.4 k allocs: 0.161 MB | 2.4 k allocs: 0.161 MB | 1 |
| utils/copy/break_sharing | 2 k allocs: 0.0924 MB | 2 k allocs: 0.0924 MB | 1 |
| utils/copy/preserve_sharing | 2.4 k allocs: 0.161 MB | 2.4 k allocs: 0.161 MB | 1 |
| utils/count_constant_nodes/break_sharing | 4 allocs: 0.953 kB | 4 allocs: 0.953 kB | 1 |
| utils/count_constant_nodes/preserve_sharing | 0.404 k allocs: 0.0696 MB | 0.404 k allocs: 0.0696 MB | 1 |
| utils/count_depth/break_sharing | 4 allocs: 0.953 kB | 4 allocs: 0.953 kB | 1 |
| utils/count_nodes/break_sharing | 4 allocs: 0.953 kB | 4 allocs: 0.953 kB | 1 |
| utils/count_nodes/preserve_sharing | 0.404 k allocs: 0.0696 MB | 0.404 k allocs: 0.0696 MB | 1 |
| utils/get_set_constants!/break_sharing | 0.898 k allocs: 25.2 kB | 0.898 k allocs: 25.2 kB | 1 |
| utils/get_set_constants!/preserve_sharing | 1.7 k allocs: 0.138 MB | 1.7 k allocs: 0.138 MB | 1 |
| utils/get_set_constants_parametric | 1.42 k allocs: 0.0663 MB | 1.42 k allocs: 0.0663 MB | 1 |
| utils/has_constants/break_sharing | 4 allocs: 0.203 kB | 4 allocs: 0.203 kB | 1 |
| utils/has_operators/break_sharing | 4 allocs: 0.203 kB | 4 allocs: 0.203 kB | 1 |
| utils/hash/break_sharing | 0.104 k allocs: 2.52 kB | 0.104 k allocs: 2.52 kB | 1 |
| utils/hash/preserve_sharing | 0.504 k allocs: 0.0711 MB | 0.504 k allocs: 0.0711 MB | 1 |
| utils/index_constant_nodes/break_sharing | 1.67 k allocs: 0.0501 MB | 1.67 k allocs: 0.0501 MB | 1 |
| utils/index_constant_nodes/preserve_sharing | 2.07 k allocs: 0.119 MB | 2.07 k allocs: 0.119 MB | 1 |
| utils/is_constant/break_sharing | 4 allocs: 0.203 kB | 4 allocs: 0.203 kB | 1 |
| utils/simplify_tree/break_sharing | 1.33 k allocs: 0.0436 MB | 1.33 k allocs: 0.0436 MB | 1 |
| utils/simplify_tree/preserve_sharing | 1.58 k allocs: 0.101 MB | 1.58 k allocs: 0.101 MB | 1 |
| utils/string_tree/break_sharing | 11.8 k allocs: 1.04 MB | 11.8 k allocs: 1.04 MB | 1 |
| utils/string_tree/preserve_sharing | 12.2 k allocs: 1.11 MB | 12.2 k allocs: 1.11 MB | 1 |
| time_to_load | 0.159 k allocs: 11.2 kB | 0.159 k allocs: 11.2 kB | 1 |

SymbolicML deleted a comment from github-actions bot on May 26, 2025
@coveralls

Pull Request Test Coverage Report for Build 15258008479

Details

  • 10 of 10 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.007%) to 95.586%

Totals
Change from base Build 15258002007: 0.007%
Covered Lines: 2577
Relevant Lines: 2696
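
The reported overall coverage is consistent with covered lines divided by relevant lines. A quick check in Python, using only the totals quoted above (the variable names are illustrative):

```python
# Totals copied from the Coveralls report above.
covered_lines = 2577
relevant_lines = 2696

# Coverage as a percentage of relevant lines.
coverage_pct = 100 * covered_lines / relevant_lines
print(f"{coverage_pct:.3f}%")  # 95.586%
```
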

💛 - Coveralls

Labels
None yet
Projects
None yet

2 participants