Skip to content

Conversation

@petebachant
Copy link
Member

@petebachant petebachant commented Dec 22, 2025

This is a ~2% performance improvement on 1 GPU in UV mode. Not huge, but maybe worth it if it's still correct.

Towards #4147

TODO

  • If valuable, rebase and remove env changes.
  • Verify that omitting vertical components of strain rate does not impact horizontal diffusion.

@petebachant petebachant requested a review from haakon-e December 22, 2025 23:30
Copy link
Member

@haakon-e haakon-e left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for testing this!

We should make sure the conditional evaluation remains numerically equivalent, though I think few of them are covered by CI. So may need to run some tests manually (or add to ci).

- This variant only computes horizontal gradients, reducing register pressure and local memory usage
- Recommended for use when only horizontal diffusion is needed
"""
function compute_strain_rate_center_horizontal!(ᶜε, ᶜu)
Copy link
Member

@haakon-e haakon-e Dec 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these (face/center) need to be separate functions? They should be identical.
Horizontal operations don't change spaces from centers to faces and vice versa, so only having one function for compute_strain_rate_horizontal!(...) should be sufficient.

Also, because these only do horizontal operations, they could be lazy instead of writing memory twice.

Comment on lines 69 to 74
else
# Horizontal-only variants take two arguments:
# centers use ᶜu; faces use ᶠu
compute_strain_rate_center_horizontal!(ᶜS, ᶜu)
compute_strain_rate_face_horizontal!(ᶠS, ᶠu)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This misses the case if we run SmagorinskyLilly{W} (only apply smagorinsky diffusion in the vertical direction).

Though in the case of SmagorinskyLilly{UV}, I think this may lead to different result in the tendency computation. Could you check the values you get when you do e.g.:

ᶜτ_smag = @. lazy(-2 * ᶜνₜ_h * ᶜS)
# ...
@. Yₜ.c.uₕ -= C12(wdivₕ(ᶜρ * ᶜτ_smag) / ᶜρ)

while using the UV-configuration

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also be problematic if any of the cached vertical components like ᶜS_norm_v are used elsewhere.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a TODO item to verify this by actually running code, but it looks like the vertical terms should be zeroed out by the projection in strain_rate_norm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reran baseline and modified code with the latest main merged in and they're tough to compare because the baseline is all NaN at 2048 steps, and the modified code is not. Not sure if there's a bug in the current version on main or if it's just numerical stability: https://github.com/petebachant/clima-horizontal-diffusion/blob/e2f44a52f3a7c546fcb3b0d2a4c23ed5d2fbd30b/results/compare_smag_uh.log

uuid = "ae029012-a4dd-5104-9daa-d747884805df"
version = "1.3.1"

[[deps.Revise]]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added Revise in a separate PR (#4187), so you can remove this here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it's in the PR's TODO list to revert the env changes if the rest looks good.

lilly_stratification_correction(p, ᶜS)
Return a lazy representation of the Lilly stratification correction factor
Return a lazy representation of the Lilly stratification correction factor
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these unrelated changes are a bit distracting. Is there any way you could disable them?

@petebachant petebachant marked this pull request as draft January 5, 2026 18:03
@petebachant
Copy link
Member Author

Tagging you for review @imreddyTeja in case you see anything else in the Smagorinsky implementation that looks like low-hanging fruit. So far, this doesn't seem worth it for ~2% gain, especially since it will drop to nearly zeo if Smagorinsky will typically be used in a horizontal--vertical coupled fashion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants