charleskawczynski

Inspired by , this PR attempts to reduce the number of compiled method instances (and hopefully overall compilation time) by using Base.Cartesian utilities:

using Revise; include("perf/jet.jl")
using MethodAnalysis
import ClimaTimeSteppers as CTS
length(methodinstances(CTS.update_stage!))
length(methodinstances(CTS.unrolled_foreach))

Main

julia> using Revise; include("perf/jet.jl")
Precompiling ClimaTimeSteppers
  1 dependency successfully precompiled in 3 seconds. 180 already precompiled.
step_allocs = 12768
Test Summary:     | Pass  Total  Time
JET / allocations |    2      2  3.1s
Test.DefaultTestSet("JET / allocations", Any[], 2, false, false, true, 1.728926011863406e9, 1.728926014999377e9, false, "/Users/charliekawczynski/Dropbox/Caltech/work/dev/CliMA/ClimaTimeSteppers.jl/perf/jet.jl")

julia> using MethodAnalysis

julia> import ClimaTimeSteppers as CTS

julia> length(methodinstances(CTS.update_stage!))
5

julia> length(methodinstances(CTS.unrolled_foreach))
10

This branch

julia> using Revise; include("perf/jet.jl")
Precompiling ClimaTimeSteppers
  1 dependency successfully precompiled in 5 seconds. 180 already precompiled.
step_allocs = 12768
Test Summary:     | Pass  Total  Time
JET / allocations |    2      2  3.7s
Test.DefaultTestSet("JET / allocations", Any[], 2, false, false, true, 1.728925925530902e9, 1.728925929182172e9, false, "/Users/charliekawczynski/Dropbox/Caltech/work/dev/CliMA/ClimaTimeSteppers.jl/perf/jet.jl")
julia> using MethodAnalysis

julia> import ClimaTimeSteppers as CTS

julia> length(methodinstances(CTS.update_stage!))
2

julia> length(methodinstances(CTS.unrolled_foreach))
1

I think I should use SnoopCompile to measure how much this matters for compile timings in real-world cases.
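Before a full SnoopCompile run, a crude first check is to time the first call of a function (which includes inference and code generation) against a second call (run time only). The sketch below uses only Base; `work` and `args` are hypothetical stand-ins for a real time-stepper call, and this is not SnoopCompile's API.

```julia
# Hypothetical workload standing in for a real time-stepper call.
work(t::Tuple) = sum(x -> x * x, t)

args = (1, 2.0, 3.0f0)

# The first call pays for inference and code generation as well as execution.
t_first = @elapsed work(args)

# The second call reuses the compiled code, so it measures run time only.
t_second = @elapsed work(args)

println("first call: ", t_first, " s, second call: ", t_second, " s")
```

The gap between the two numbers is only a rough proxy for compile latency; it does not attribute time to individual method instances, which is what a profiler such as SnoopCompile would be needed for.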

@dennisYatunin

Can we just use UnrolledUtilities here, instead of redefining its internal functions? I think all the issues with that package have been hashed out and it is ready for our wider codebase.

@charleskawczynski

charleskawczynski commented Nov 19, 2024

> Can we just use UnrolledUtilities here, instead of redefining its internal functions? I think all the issues with that package have been hashed out and it is ready for our wider codebase.

I think UnrolledUtilities is very valuable to us for things like in-kernel FD operators in ClimaCore (especially the highly nested MatrixFields operators), since those functions need specialization in order to be inlined and constant-propagated. They cannot compile without it.

In this case, however, this is about top-level inference, which would be nice to make type-stable. We don't need specialization on every element of unrolled_foreach. I think the difference here is that there's one large specialization instead of a bunch of smaller ones. I don't intend to merge this until we do some measurements to understand the impact.
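To make the "one large specialization instead of a bunch of smaller ones" distinction concrete, here is a minimal sketch of the two unrolling styles. The names `recursive_foreach` and `cartesian_foreach` are hypothetical and are not the ClimaTimeSteppers or UnrolledUtilities implementations.

```julia
# Hypothetical names for illustration only.

# Recursive unrolling: each call peels off one element, so a tuple of length N
# typically produces one method instance per tail (N, N-1, ..., 0 elements).
recursive_foreach(f, ::Tuple{}) = nothing
function recursive_foreach(f::F, t::Tuple) where {F}
    f(first(t))
    return recursive_foreach(f, Base.tail(t))
end

# Base.Cartesian unrolling: a generated function expands every call into one
# body, so each concrete tuple type gets a single (larger) method instance.
@generated function cartesian_foreach(f::F, t::Tuple) where {F}
    N = fieldcount(t)  # inside the generator, `t` is the tuple *type*
    return quote
        Base.Cartesian.@nexprs $N i -> f(t[i])
        return nothing
    end
end

seen_recursive = Any[]
seen_cartesian = Any[]
stages = (1, 2.0, "three")  # heterogeneous, like a tuple of distinct stage objects

recursive_foreach(x -> push!(seen_recursive, x), stages)
cartesian_foreach(x -> push!(seen_cartesian, x), stages)
```

Both calls visit the same elements in the same order. With MethodAnalysis loaded, `length(methodinstances(recursive_foreach))` and `length(methodinstances(cartesian_foreach))` can be compared in the same way as the counts in the PR description.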

Ultimately, we know that Julia overspecializes, and this PR was kind of a tracer bullet to explore reducing specializations and alleviating compile times.

Let me convert it to a draft PR.

@charleskawczynski charleskawczynski marked this pull request as draft November 19, 2024 16:24