-
Notifications
You must be signed in to change notification settings - Fork 69
Refactor barotropic time stepping and OBC code #878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor barotropic time stepping and OBC code #878
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## dev/gfdl #878 +/- ##
============================================
+ Coverage 38.01% 38.03% +0.02%
============================================
Files 299 299
Lines 88063 88148 +85
Branches 16547 16578 +31
============================================
+ Hits 33475 33525 +50
- Misses 48513 48531 +18
- Partials 6075 6092 +17 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
b8aa6fd
to
bed5ba7
Compare
This PR has passed the CEFI regression test suite (for both NWA12 and NEP10). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many changes here. The new OBC masks seem to be a good addition and I appreciate the correction of spelling errors. Since these changes do not change answers with ot without OBCs I think they should be accepted.
c3a6b5b
to
70ec893
Compare
Static build error:
|
I think |
70ec893
to
c82a146
Compare
Added code to avoid a separate OBC block within btloop_update_u and btloop_update_v. This was done by adding static CS%OBCmask_[uv] arrays that are 1 except at OBC points where they are 0, and using them to set BT_force_[uv], IdxCu. IdyCv, f_4_u and f_4_v to Conversely bt_rem_u and bt_rem_v are being set to 1 at OBC points. Because all the changes occur by masking 2-d arrays during the barotropic setup, there should be a net increase in model speed (if anything). Also ensured that all allocates for arrays in the MOM_barotropic control structure are paired with appropriate deallocates in barotropic_end. As a part of this, 7 arrays (lin_drag_u, lin_drag_v, ubt_IC, vbt_IC, D_u_Cor, D_v_cor and q_D) that are only allocated and used conditionally were converted from potentially static memory arrays into arrays that are always allocatable. The comments describing the nature and purpose of the Rayleigh_u and Rayleigh_v arrays were expanded, and commented out code that might use a new answer date to avoid some unnecessary extra divisions in btstep_find_Cor was also added. All answers are bitwise identical and no public interfaces are changed.
Moved the code calculating the barotropic pressure gradients into the new subroutine btloop_find_PF. This simplifies the code in btloop_update_u and btloop_update_v, which should make it more likely that these are inlined. All answers are bitwise identical and no public interfaces are changed.
Simplified btloop_eta_predictor by moving several optional loops out into the main body of btstep_timeloop, to increase the chances that this routine will become a good candidate for inlining, and to take steps toward the reuse of the code calculating the barotropic fluxes in the predictor and corrector steps. In addition, several unused variables were eliminated, and the arguments to btloop_eta_predictor are now documented in the order in which they appear in the argument list. All answers are bitwise identical and no public interfaces are changed.
Use two separate calls to btloop_find_PF an eta argument that depends on BT_project_velocity, thereby eliminating the need for the eta_PF_BT pointer to the eta array that will be used for the barotropic pressure force. All answers are bitwise identical and no public interfaces are changed.
Refacted the code implementing the barotropic open boundary conditions to replace the direct references to the segments with simple 2-d arrays in the BT_OBC type for everything inside of the performance-critical routine btstep_timeloop. There are several new or renamed elements inside of the BT_OBC_type and three new integer parameters enumerating the types of open boundary conditions that are to be applied at each point, with the sign of integers in the new u_OBC_type and v_OBC_type arrays indicating the direction of the OBCs, and 0 indicating no OBC at a cell face. There is a new routine, initialize_BT_OBC, that stores the static information about the position and nature of the various OBCs for use within the barotropic time stepping; this routine is called from barotropic_init. Several spelling errors in comments were also corrected. All answers are bitwise identical and no public interfaces are changed.
Split the loops applying western and eastern open boundary conditions in apply_u_velocity_OBCs and southern and northern OBCs in apply_v_velocity_OBCs. These usually cover very different loop ranges, and often there is only one of the pair of OBCs on any processor, so it should be more efficient to only apply the OBCs on within the loop ranges where they apply (which can be determined during initialization). The code still works for arbitrarily located OBC segments, and all answers are bitwise identical.
Cleaned up the OBC-related memory allocation and moved it into initialize_BT_OBC. Also combined OBC-related halo passes to reduce latency during the set-up phase of btstep. Also simplified the code setting the various gtot values around OBC points. The OBC argument to btstep_timeloop was replaced with a new logical variable in the barotropic control structure that is set during initialization. All answers are bitwise identical and no public interfaces are changed.
Moved the code to calculate the dynamic pressure into the new routine btloop_add_dyn_PF to simplify btloop_find_PF. Also revised btcalc to use the BT_OBC_type to streamline the application of open boundary conditions in the calculation of frhatu and frhatv. All answers are bitwise identical and no public interfaces are changed.
Refactored btcalc to include the application of open boundary conditions in the calculation of CS%frhatu and CS%frhatv within the same loops as the non-OBC calculations, and also to reduce the amount of duplicated code. This should increase the model efficiency with open boundary conditions, and have minimal performance impacts otherwise. All answers are bitwise identical and no public interfaces are changed.
c82a146
to
32d1ded
Compare
This PR has passed pipeline testing at https://gitlab.gfdl.noaa.gov/ogrp/mom6ci/MOM6/-/pipelines/27196. |
This series of 9 commits extensively refactors the code in MOM_barotropic for simplicity and efficiency, with a particular emphasis on reducing the cost of the open boundary condition code and using simpler subsidiary routines within the barotropic time stepping loops.
The open boundary condition code that is called from inside of
btstep_timeloop()
no longer uses the indirectly referenced segments in the OBC type to determine the location, direction and kind of open boundary conditions, instead using two new arrays (u_OBC_type
andv_OBC_type
) in the localBT_OBC_type
to specify the kind of OBC point with their absolute value while their signs indicate the direction. The OBC code in the barotropic timestepping is only called now if there is an OBC segment on the PE, and the loop ranges over which the OBCs are applied are closely tailored to cover the ranges over which eastern, western, northern or southern boundary conditions are applied on a PE. These restricted loop ranges take advantage of the fact that often there are only open boundary condition points at the edges of the domain, while still allowing for the more general case where open boundary condition segments can be anywhere. In other places, the multiplication ofBT_force_[uv]
by the new OBC masking arraysOBCmask_[uv]
and a reset ofbt_rem_[uv]
during the setup phase eliminate the need for separate OBC-specific loops during the barotropic time stepping. Also revisedbtcalc()
to usetheBT_OBC_type
to streamline the application of open boundary conditions in the calculation offrhatu
andfrhatv
. All of the extra logic related to the OBCs can be determined during initialization (specifically in the new routineinitialize_BT_OBC()
), so it should have a negligible impact on runtime, whereas the restructuring of the OBC code should reduce its computational burden during the time stepping itself.Other changes are separate from the updates to the OBC code, with a rearrangement of the code that makes up
btstep_timeloop()
. The pressure gradient force calculation is now in the new subroutinebtloop_find_PF()
, which allows for the elimination of theeta_PF_BT
pointer and a substantial reduction in the number of arguments tobtloop_update_u()
andbtloop_update_v()
. The 8 separate pseudo-Coriolis arrays ([abcd]zon
and[abcd]mer
) were combined into the new 3-d arraysf_4_u
andf_4_v
, with the new 4-point fastest varying index covering the influence of the 4 nearest v-velocity anomalies on the Coriolis acceleration at a u-point. The scope ofbtloop_eta_predictor()
was reduced to just the predicting eta, with the calculation ofp_surf_dyn
,eta_sum
andeta_PF
occurring outside or in different routines. The code to calculate the dynamic pressure was moved into the new routinebtloop_add_dyn_PF()
.Several spelling errors in comments were also corrected, and some comments describing variables were expanded for clarity.
All answers are bitwise identical and no public interfaces are changed. The specific commit in this PR include: