Skip to content

Conversation

@hkershaw-brown
Copy link
Member

Description:

Removes the initialization of close_ind array to -99 in get_close for 3D location_mods.
See #910 for the cost of this initialization.
The output array of close_ind is only valid from 1:num_close

Fixes issue

fixes #910

Types of changes

  • Bug (Performance) fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Documentation changes needed?

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.

Tests

Please describe any tests you ran to verify your changes.
MPAS case from profiling in #910, and get_close test harness also given in #910

Checklist for merging

  • Updated changelog entry
  • Documentation updated
  • Update conf.py

Checklist for release

  • Merge into main
  • Create release from the main branch with appropriate tag
  • Delete feature-branch

Testing Datasets

  • Dataset needed for testing available upon request
  • Dataset download instructions included
  • No dataset needed

@mgharamti
Copy link
Contributor

Performance check (ocean models)

Experiment Application Type State Vector Size Assimilated Obs MPI Processes Time (sec)
ROMS (original code) Regional 645,759,164 571 8000 54
ROMS (no initialization of close_ind) Regional 645,759,164 571 8000 57
MOM6 (original code) Global 77,760,000 258,669 1280 87
MOM6 (no initialization of close_ind) Global 77,760,000 258,669 1280 85

Interpretation: Omitting close_ind initialization shows negligible runtime impact in both regional and global ocean configurations. ROMS (regional): +3 s (~5%). MOM6 (global): −2 s (~2%).

@hkershaw-brown
Copy link
Member Author

Thanks! I wouldn’t necessarily expect to see run time improvements in these cases, with most of the time being in IO (obs sequence or writing the large state) -> the things that don't scale to the large core count.
Did you happen to run with output_timestamps = .true? It would be interesting to see if these timings are mostly IO.

The gotcha for this pull request is if someone is relying on testing for the -99 value in their code.

@mgharamti
Copy link
Contributor

output_timestamps

Helen, unfortunately I didn't turn on output_timestamps. I could rerun filter if you think it's important.

@mgharamti
Copy link
Contributor

Thanks! I wouldn’t necessarily expect to see run time improvements in these cases, with most of the time being in IO (obs sequence or writing the large state) -> the things that don't scale to the large core count. Did you happen to run with output_timestamps = .true? It would be interesting to see if these timings are mostly IO.

The gotcha for this pull request is if someone is relying on testing for the -99 value in their code.

I did run with timestamps turned on. Here is a summary:

  • MOM6: The localization part accounted for ~35% of the total run time with the original code. For the code without initialization of close_ind, the localization accounted for ~31% of the run time. So, there is a small gain.
  • ROMS-Rutgers: Localization in both runs accounted for less than 2% of the run time. IO is the major player in this case.

For future purposes, I placed the 2 ocean DA test cases, with the necessary files, here:

  • MOM6: /glade/campaign/cisl/dares/large_dart_tests/ocean/mom6_global
  • ROMS-Rutgers: /glade/campaign/cisl/dares/large_dart_tests/ocean/roms_rutgers_regional

@hkershaw-brown
Copy link
Member Author

awesome thanks Moha

@mjs2369
Copy link
Contributor

mjs2369 commented Dec 17, 2025

@hkershaw-brown this has been hanging out for a minute... is there a hold on this PR?

@hkershaw-brown
Copy link
Member Author

@hkershaw-brown this has been hanging out for a minute... is there a hold on this PR?

@mjs2369 no just needs 'approve' clicked

Copy link
Contributor

@mjs2369 mjs2369 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's get this is in!

@hkershaw-brown hkershaw-brown added the release! bundle with next release label Dec 18, 2025
@hkershaw-brown hkershaw-brown merged commit 80e978c into main Dec 31, 2025
4 checks passed
@hkershaw-brown hkershaw-brown deleted the perf-get-close branch December 31, 2025 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release! bundle with next release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

get_close initialization cost

4 participants