Skip to content

Conversation

@imreddyTeja
Copy link
Member

@imreddyTeja imreddyTeja commented Oct 23, 2025

Purpose

Split up a high register pressure broadcast.

closes #4064

To-do

  • measure sypd in more robust way
  • decide if adding to scratch is OK, or if another scratch field could be used (while changing eltype of field)

Content

It appears that just the broadcast is nearly seven times as fast after the change.

Sypd of numerics_sphere_he30ze63.yml and longrun_aquaplanet_allsky_1M.yml:

before: 0.429
after: 0.476

~11% speedup

BEFORE:
Screenshot 2025-10-24 at 11 13 43 AM

AFTER:
Screenshot 2025-10-24 at 11 13 58 AM


  • I have read and checked the items on the review checklist.

Comment on lines 572 to 578
@. p.scratch.ᶠtemp_scalar = ClimaAtmos.ᶠinterp(ᶜρ * ᶜJ) / ᶠJ
@. p.scratch.ᶜtemp_scalar_5 = -(ᶜwₚ) / ᶜρ
@. ∂ᶜρχₚ_err_∂ᶜρχₚ =
dtγ * -(ᶜprecipdivᵥ_matrix())
DiagonalMatrixRow(ᶠinterp(ᶜρ * ᶜJ) / ᶠJ)
DiagonalMatrixRow(p.scratch.ᶠtemp_scalar)
ᶠright_bias_matrix()
DiagonalMatrixRow(-Geometry.WVector(ᶜwₚ) / ᶜρ) - (I,)
DiagonalMatrixRow(Geometry.WVector(p.scratch.ᶜtemp_scalar_5)) - (I,)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@. p.scratch.ᶠtemp_scalar = ClimaAtmos.ᶠinterp(ᶜρ * ᶜJ) / ᶠJ
@. p.scratch.ᶜtemp_scalar_5 = -(ᶜwₚ) / ᶜρ
@. ∂ᶜρχₚ_err_∂ᶜρχₚ =
dtγ * -(ᶜprecipdivᵥ_matrix())
DiagonalMatrixRow(ᶠinterp(ᶜρ * ᶜJ) / ᶠJ)
DiagonalMatrixRow(p.scratch.ᶠtemp_scalar)
ᶠright_bias_matrix()
DiagonalMatrixRow(-Geometry.WVector(ᶜwₚ) / ᶜρ) - (I,)
DiagonalMatrixRow(Geometry.WVector(p.scratch.ᶜtemp_scalar_5)) - (I,)
ᶠρ = @. p.scratch.ᶠtemp_scalar = ᶠinterp(ᶜρ * ᶜJ) / ᶠJ
ᶜminus_wₚ_div_ρ = @. p.scratch.ᶜtemp_scalar_5 = -(ᶜwₚ) / ᶜρ
@. ∂ᶜρχₚ_err_∂ᶜρχₚ =
dtγ * -(ᶜprecipdivᵥ_matrix())
DiagonalMatrixRow(ᶠρ)
ᶠright_bias_matrix()
DiagonalMatrixRow(Geometry.WVector(ᶜminus_wₚ_div_ρ)) - (I,)

I think giving the temp fields reasonable names improves readability.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the intermediate computations to this:

 @. p.scratch.ᶜbidiagonal_adjoint_matrix_c3 = -(ClimaAtmos.ᶜprecipdivᵥ_matrix())   DiagonalMatrixRow(ClimaAtmos.ᶠinterp(ᶜρ * ᶜJ) / ᶠJ)
@. p.scratch.ᶠband_matrix_wvec = ClimaAtmos.ᶠright_bias_matrix()   DiagonalMatrixRow(ClimaCore.Geometry.WVector(-(ᶜwₚ) / ᶜρ))

Do you have a suggestion for names that make sense here?

Split up a high register pressure broadcast.

It appears that just the broadcast is nearly seven times as fast after the
change.

SYPD for 1m with 30 he improves by ~11%
@imreddyTeja imreddyTeja marked this pull request as ready for review November 6, 2025 16:58
Copy link
Member

@szy21 szy21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there would be a good name for these temporary fields...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Improve performance of tracer ∂ᶜρχₚ_err_∂ᶜρχₚ update

4 participants