1 change: 1 addition & 0 deletions .buildkite/Project.toml
@@ -12,6 +12,7 @@ Dates = "ade2ca70-3891-5945-98fb-dc099432e06a"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
ExplicitImports = "7d51a73a-1435-4ff3-83d9-f097790105c7"
JuliaFormatter = "98e50ef6-434e-11e9-1051-2b60c6c9e899"
LazyBroadcast = "9dccce8e-a116-406d-9fcc-a88ed4f510c8"
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
NCDatasets = "85f8d34a-cbdd-5861-8df4-14fed0d494ab"
Profile = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79"
33 changes: 33 additions & 0 deletions NEWS.md
@@ -1,4 +1,5 @@
# NEWS

v0.2.13
-------

@@ -13,6 +14,37 @@ the interval `[0.0, 10.0]`. If one knows that the data represents a time
average, then the time of `10.0` is the time average over the interval
`[0.0, 10.0]`.

### Support for `lazy`

Starting with version `0.2.13`, `ClimaDiagnostics` supports diagnostic variables
specified with un-evaluated expressions (as provided by
[LazyBroadcast.jl](https://github.com/CliMA/LazyBroadcast.jl)).

Instead of
```julia
function compute_ta!(out, state, cache, time)
if isnothing(out)
return state.ta .- 273.15
else
out .= state.ta .- 273.15
end
end
```
You can now write
```julia
import LazyBroadcast: lazy

function compute_ta(state, cache, time)
return lazy.(state.ta .- 273.15)
end
```
Or, for `Field`s
```julia
function compute_ta(state, cache, time)
return state.ta
end
```
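
To make "un-evaluated expression" concrete, here is a minimal sketch that uses only Julia's `Base.Broadcast`, with no dependency on `LazyBroadcast` or `ClimaDiagnostics`. It assumes that a plain `Vector` can stand in for a `Field` and that `Base.Broadcast.broadcasted` builds an object of the same kind that `lazy.(...)` returns. The names `compute_ta_eager` and `compute_ta_lazy` are hypothetical.

```julia
import Base.Broadcast: Broadcasted, broadcasted, materialize

# Plain vector standing in for a `Field` (assumption for illustration only).
state = (; ta = [280.0, 290.0, 300.0])

# Eager version: the subtraction runs and a new array is allocated right away.
compute_ta_eager(state, cache, time) = state.ta .- 273.15

# Lazy version: only records the function and its arguments, nothing is computed yet.
compute_ta_lazy(state, cache, time) = broadcasted(-, state.ta, 273.15)

expr = compute_ta_lazy(state, nothing, 0.0)

# The computation happens only when the expression is materialized.
celsius = materialize(expr)
```

The caller decides when and where the result is written, which is what lets a consumer reuse its own buffer instead of receiving a freshly allocated array on every call.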

v0.2.12
-------

@@ -130,6 +162,7 @@ v0.2.4

- Add `EveryCalendarDtSchedule` for schedules with calendar periods.


v0.2.3
-------

4 changes: 3 additions & 1 deletion Project.toml
@@ -24,6 +24,7 @@ ClimaUtilities = "0.1.22"
Dates = "1"
Documenter = "1"
ExplicitImports = "1.6"
LazyBroadcast = "1"
JuliaFormatter = "1"
NCDatasets = "0.14"
OrderedCollections = "1.4"
@@ -41,10 +42,11 @@ ClimaTimeSteppers = "595c0a79-7f3d-439a-bc5a-b232dc3bde79"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
ExplicitImports = "7d51a73a-1435-4ff3-83d9-f097790105c7"
JuliaFormatter = "98e50ef6-434e-11e9-1051-2b60c6c9e899"
LazyBroadcast = "9dccce8e-a116-406d-9fcc-a88ed4f510c8"
Profile = "9abbd945-dff8-562f-b5e8-e1ebf5ef1b79"
ProfileCanvas = "efd6af41-a80b-495e-886c-e51b0c7d77a3"
SafeTestsets = "1bc83da4-3b8d-516f-aca4-4fe02f6d838f"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"

[targets]
test = ["Aqua", "BenchmarkTools", "ClimaTimeSteppers", "Documenter", "ExplicitImports", "JuliaFormatter", "Profile", "ProfileCanvas", "SafeTestsets", "Test"]
test = ["Aqua", "BenchmarkTools", "ClimaTimeSteppers", "Documenter", "ExplicitImports", "JuliaFormatter", "LazyBroadcast", "Profile", "ProfileCanvas", "SafeTestsets", "Test"]
70 changes: 39 additions & 31 deletions docs/src/developer_guide.md
@@ -105,7 +105,7 @@ Let us see the simplest example to accomplish this
import ClimaDiagnostics: DiagnosticVariable, ScheduledDiagnostic
import ClimaDiagnostics.Writers: DictWriter

myvar = DiagnosticVariable(; compute! = (out, u, p, t) -> u.var1)
myvar = DiagnosticVariable(; compute = (u, p, t) -> u.var1)

myschedule = (integrator) -> maximum(integrator.u.var2) > 10.0

@@ -148,7 +148,7 @@ const ALL_DIAGNOSTICS = Dict{String, DiagnosticVariable}()
standard_name,
units,
description,
compute!)
compute)


Add a new variable to the `ALL_DIAGNOSTICS` dictionary (this function mutates the state of
@@ -173,27 +173,25 @@ Keyword arguments
- `comments`: More verbose explanation of what the variable is, or comments related to how
it is defined or computed.

- `compute!`: Function that compute the diagnostic variable from the state. It has to take
two arguments: the `integrator`, and a pre-allocated area of memory where to
write the result of the computation. It the no pre-allocated area is
available, a new one will be allocated. To avoid extra allocations, this
function should perform the calculation in-place (i.e., using `.=`).

- `compute`: Function that computes the diagnostic variable from the state, cache, and time. The function
should return a `Field` or a `Base.Broadcast.Broadcasted` expression. It should not allocate
a new `Field`: if you find yourself using a dot, that is a good indication you should be using
`lazy`.
"""
function add_diagnostic_variable!(;
short_name,
long_name,
standard_name = "",
units,
comments = "",
compute!,
compute,
)
haskey(ALL_DIAGNOSTICS, short_name) && @warn(
"overwriting diagnostic `$short_name` entry containing fields\n" *
"$(map(
field -> "$(getfield(ALL_DIAGNOSTICS[short_name], field))",
# We cannot really compare functions...
filter(field -> field != :compute!, fieldnames(DiagnosticVariable)),
filter(field -> !(field in (:compute!, :compute)), fieldnames(DiagnosticVariable)),
))"
)

@@ -203,7 +201,7 @@ function add_diagnostic_variable!(;
standard_name,
units,
comments,
compute!,
compute,
)

"""
@@ -236,15 +234,30 @@ add_diagnostic_variable!(
long_name = "Air Density",
standard_name = "air_density",
units = "kg m^-3",
compute! = (out, state, cache, time) -> begin
if isnothing(out)
return state.c.ρ
else
out .= state.c.ρ
end
end,
compute = (state, cache, time) -> state.c.ρ,
)
```

When writing compute functions, make them lazy with
[LazyBroadcast.jl](https://github.com/CliMA/LazyBroadcast.jl) to improve
performance and avoid intermediate allocations. To do that, add `LazyBroadcast`
to your dependencies and import `lazy`. A slight variation of the previous
example would look like

```julia
###
# Density (3d)
###
add_diagnostic_variable!(
short_name = "rhoa",
long_name = "Air Density",
standard_name = "air_density",
units = "kg m^-3",
compute = (state, cache, time) -> lazy.(1000 .* state.c.ρ),
)
```
Here, the factor of `1000` stands in for a more complex expression. Without
`lazy`, the diagnostic would allocate an intermediate `Field`, severely hurting performance.
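
The following sketch illustrates why a lazy expression avoids the intermediate allocation. It relies only on `Base.Broadcast`, and it assumes that a plain `Vector` stands in for `state.c.ρ` and that `broadcasted` produces the same kind of object as `lazy.(...)`. How `ClimaDiagnostics` consumes the expression internally is not shown here.

```julia
import Base.Broadcast: broadcasted, materialize!

# Plain vector standing in for the density `Field` (assumption for illustration only).
ρ = [1.2, 1.1, 1.0]

# Buffer owned by the consumer, allocated once and reused at every output step.
out = zeros(length(ρ))

# Unevaluated expression: describes `1000 .* ρ` without computing it.
expr = broadcasted(*, 1000, ρ)

# Evaluate the expression directly into `out`; no temporary array is created.
materialize!(out, expr)
```

With the eager form `1000 .* ρ`, a temporary array would be built first and then copied into `out`. With the lazy form, the multiplication is fused into the write.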

It is a good idea to put safeguards in place to ensure that your users will not
be allowed to call diagnostics that do not make sense for the simulation they
@@ -254,26 +267,21 @@ can dispatch over that and return an error. A simple example might be
###
# Specific Humidity
###
compute_hus!(out, state, cache, time) =
compute_hus!(out, state, cache, time, cache.atmos.moisture_model)
compute_hus(state, cache, time) =
compute_hus(state, cache, time, cache.atmos.moisture_model)

compute_hus!(out, state, cache, time) =
compute_hus!(out, state, cache, time, cache.model.moisture_model)
compute_hus!(_, _, _, _, model::T) where {T} =
compute_hus(state, cache, time) =
compute_hus!(state, cache, time, cache.model.moisture_model)
compute_hus(_, _, _, model::T) where {T} =
Comment on lines +270 to +275

🛠️ Refactor suggestion

⚠️ Potential issue

Potential Ambiguity in compute_hus Overloads

There are now two definitions for compute_hus(state, cache, time):

  • One dispatches with cache.atmos.moisture_model.
  • The other dispatches with cache.model.moisture_model (using the legacy compute_hus! in the call).

This duality may be confusing and could lead to ambiguity if both cache.atmos and cache.model exist.

Recommendation: Consider consolidating these definitions or providing clearer dispatch conditions (or renaming one of the methods) to avoid potential conflicts in method resolution. Documentation clarifying when each branch is chosen would also be beneficial.

error("Cannot compute hus with $model")

function compute_hus!(
out,
function compute_hus(
state,
cache,
time,
moisture_model::T,
) where {T <: Union{EquilMoistModel, NonEquilMoistModel}}
if isnothing(out)
return state.c.ρq_tot ./ state.c.ρ
else
out .= state.c.ρq_tot ./ state.c.ρ
end
return lazy.(state.c.ρq_tot ./ state.c.ρ)
end

add_diagnostic_variable!(
@@ -282,7 +290,7 @@ add_diagnostic_variable!(
standard_name = "specific_humidity",
units = "kg kg^-1",
comments = "Mass of all water phases per mass of air",
compute! = compute_hus!,
compute = compute_hus,
)
```
This relies on dispatching over `moisture_model`. If `model` is not in
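
As a self-contained illustration of the dispatch pattern with a single entry point, consider the sketch below. All types and the `cache.moisture_model` layout are hypothetical stand-ins, not the definitions used by any CliMA package, and plain vectors replace `Field`s. `Base.Broadcast.broadcasted` is used in place of `lazy` to keep the example dependency-free.

```julia
import Base.Broadcast: broadcasted, materialize

# Hypothetical model types, defined here only so the example runs on its own.
abstract type AbstractMoistureModel end
struct DryModel <: AbstractMoistureModel end
struct EquilMoistModel <: AbstractMoistureModel end
struct NonEquilMoistModel <: AbstractMoistureModel end

# Single three-argument entry point that forwards the model for dispatch.
compute_hus(state, cache, time) =
    compute_hus(state, cache, time, cache.moisture_model)

# Fallback: any model without a dedicated method raises an error.
compute_hus(_, _, _, model) = error("Cannot compute hus with $(typeof(model))")

# Supported models return an unevaluated expression.
compute_hus(state, cache, time, ::Union{EquilMoistModel, NonEquilMoistModel}) =
    broadcasted(/, state.ρq_tot, state.ρ)

state = (; ρ = [1.0, 2.0], ρq_tot = [0.01, 0.04])
hus = materialize(compute_hus(state, (; moisture_model = EquilMoistModel()), 0.0))
```

Keeping exactly one three-argument method means there is a single place where the model is looked up, so the fallback is reached for every unsupported model.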
3 changes: 2 additions & 1 deletion docs/src/index.md
@@ -24,5 +24,6 @@ the Developer guide page.
- Allow users to define arbitrary new diagnostics;
- Trigger diagnostics on arbitrary conditions;
- Save output to HDF5 or NetCDF files, or a dictionary in memory;

- Work with lazy expressions (such as the ones produced by
[LazyBroadcast.jl](https://github.com/CliMA/LazyBroadcast.jl)).

71 changes: 54 additions & 17 deletions docs/src/user_guide.md
@@ -37,12 +37,8 @@ Let us see how we would define a `DiagnosticVariable`
```julia
import ClimaDiagnostics: DiagnosticVariable

function compute_ta!(out, state, cache, time)
if isnothing(out)
return state.ta
else
out .= state.ta
end
function compute_ta(state, cache, time)
return state.ta
end

var = DiagnosticVariable(;
@@ -51,25 +47,66 @@ var = DiagnosticVariable(;
standard_name = "air_temperature",
comments = "Measured assuming that the air is in quantum equilibrium with the metaverse",
units = "K",
compute! = compute_ta!
compute = compute_ta
)
```

`compute_ta!` is the key function here. It determines how the variable should be
`compute_ta` is the key function here. It determines how the variable should be
computed from the `state`, `cache`, and `time` of the simulation. Typically,
these are packaged within an `integrator` object (e.g., `state = integrator.u`
or `integrator.Y`).

`compute_ta!` takes another argument, `out`. `out` is an area of memory managed
by `ClimaDiagnostics` that is used to reduce the number of allocations needed
when working with diagnostics. The first time the diagnostic is called, an area
of memory is allocated and filled with the value (this is when `out` is
`nothing`). All the subsequent times, the same space is overwritten, leading to
much better performance. You should follow this pattern in all your diagnostics.
!!! compat "ClimaDiagnostics 0.2.13"

Support for `compute` was introduced in version `0.2.13`. Prior to this
version, the in-place `compute!` had to be provided. In this case, `compute!`
has to take an extra argument, `out`. `out` is an area of memory managed by
`ClimaDiagnostics` that is used to reduce the number of allocations needed
when working with diagnostics. The first time the diagnostic is called, an
area of memory is allocated and filled with the value (this is when `out` is
`nothing`). All the subsequent times, the same space is overwritten, leading
to much better performance. You should follow this pattern in all your
diagnostics. This is left to the developer to implement, so `compute_ta!` would
look like

```julia
function compute_ta!(out, state, cache, time)
if isnothing(out)
return state.ta
else
out .= state.ta
end
end
```

In general, we do not recommend implementing `compute!`, unless required for
backward compatibility.

> Note, in the future, we hope to improve this rather clumsy way to write
> diagnostics. Hopefully, at some point you will just have to write something like
> `state.ta` and not worry about the `out` at all.
When the expression is anything more complicated than just returning a `Field`,
it is best to return an unevaluated expression represented by a
`Base.Broadcast.Broadcasted` object (such as the ones produced with
[LazyBroadcast.jl](https://github.com/CliMA/LazyBroadcast.jl)). Consider the
following example where we want to shift the temperature to Celsius:
```julia
function compute_ta(state, cache, time)
return state.ta .- 273.15
end
```

This `compute` function is inefficient because it allocates an entire `Field`
before returning it. Instead, we can return just a recipe for how the diagnostic
should be computed. Using `LazyBroadcast.jl`, the snippet above can be rewritten
as
```julia
import LazyBroadcast: lazy

function compute_ta(state, cache, time)
return lazy.(state.ta .- 273.15)
end
```
The return value of `compute_ta` is a `Base.Broadcast.Broadcasted` object and
`ClimaDiagnostics` knows how to handle it efficiently, avoiding the intermediate
allocations.
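
To illustrate how a consumer might treat both kinds of return value uniformly, here is a minimal sketch. The helper `store!` is hypothetical and is not the `ClimaDiagnostics` implementation. It assumes plain vectors in place of `Field`s and uses `Base.Broadcast.broadcasted` as a stand-in for `lazy`. It mirrors the "allocate on the first call, overwrite afterwards" behavior described above.

```julia
import Base.Broadcast: Broadcasted, broadcasted, materialize

# Hypothetical helper (not part of ClimaDiagnostics): store the result of a
# `compute` function, allocating only when no buffer exists yet.
function store!(out, result)
    if isnothing(out)
        # First call: create the buffer. A lazy expression is evaluated into a
        # new array, while an existing array is copied so the state is not aliased.
        return result isa Broadcasted ? materialize(result) : copy(result)
    else
        # Later calls: overwrite the existing buffer in place. This works both
        # for arrays and for unevaluated broadcast expressions.
        out .= result
        return out
    end
end

state = (; ta = [280.0, 290.0])

# A `compute` that returns the stored array itself.
stored = store!(nothing, state.ta)

# A `compute` that returns an unevaluated expression.
buf = store!(nothing, broadcasted(-, state.ta, 273.15))
again = store!(buf, broadcasted(-, state.ta, 270.0))
```

The second call with `buf` reuses the same memory, so repeated outputs do not allocate a new array per step.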

A `DiagnosticVariable` defines what a variable is and how to compute it, but
does not specify when to compute/output it. For that, we need
2 changes: 1 addition & 1 deletion docs/src/writers.md
@@ -41,7 +41,7 @@ linearly. For the vertical dimension, the behavior can be customized by passing
the `z_sampling_method` variable. When `z_sampling_method =
ClimaDiagnostics.Writers.LevelMethod()`, points are evaluated on the grid levels
(and the provided number of points is ignored); when `z_sampling_method =
ClimaDiagnostics.Writers.FakePressureLevelMethod()`, points are sampled
ClimaDiagnostics.Writers.FakePressureLevelsMethod()`, points are sampled
uniformly in a simplified hydrostatic atmospheric model.

The output in the `NetCDFWriter` roughly follows the CF conventions.