bug-fix: perfect_model_obs prints #916


Open · wants to merge 4 commits into main

Conversation

mjs2369 (Contributor) commented Jul 10, 2025

Description:

Adds a conditional to the "Processing observation X of Y" print statements in pmo so that the error_handler is called with E_ALLMSG when pmo is run on more than one MPI proc.

This fixes the issue where only the obs on task 0 were being printed, which made it appear that only those obs were being processed.

Fixes issue #905

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Documentation changes needed?

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.

Tests

Please describe any tests you ran to verify your changes.

Ran pmo for the lorenz models and MOM6 with various numbers of MPI procs and values of print_every_nth_obs, and ensured the prints were correct.

Checklist for merging

  • Updated changelog entry
  • Documentation updated
  • Update conf.py

Checklist for release

  • Merge into main
  • Create release from the main branch with appropriate tag
  • Delete feature-branch

Testing Datasets

  • Dataset needed for testing available upon request
  • Dataset download instructions included
  • No dataset needed

Commit: …s called with E_ALLMSG if pmo is being run on more than 1 mpi proc
mjs2369 (Contributor, Author) commented Jul 10, 2025

A few extra notes:

  1. The observations are out of order while being printed (especially when using lots of MPI procs) - currently the prints look like this with MOM6 on 256 procs:
     [screenshot: interleaved "Processing observation" prints from MOM6 on 256 procs]

I could force the pmo prints to go in order, but I would have to sync across tasks, so I don't think we want to go for that...
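
For reference, forcing ordered output would mean serializing the prints, something like the hypothetical sketch below (a rank loop plus a barrier per turn, using the task_count()/my_task_id()/task_sync() routines from mpi_utilities_mod). Every pass costs task_count() synchronizations, and it still only orders output by task, not by observation number:

   ! hypothetical sketch, not proposed for merge: each task waits its turn,
   ! so every round of printing pays task_count() barriers
   do rank = 0, task_count() - 1
      if (my_task_id() == rank) then
         write(msgstring, '(A,1x,I8,1x,A,I8)') 'Processing observation ', &
            global_obs_num, ' of ', num_obs_in_set
         call error_handler(E_ALLMSG, 'perfect_model_obs:', trim(msgstring))
      endif
      call task_sync()   ! barrier so task rank+1 prints only after task rank
   enddo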

  2. The print statements are slightly mismatched in how many spaces there are on task 0 vs. all the other tasks. Here's an example with lorenz_63:
     [screenshot: misaligned "PE 0:" / "PE    n:" prefixes with lorenz_63]

We could fix this by adding the same number of spaces to line 780 as there are on line 782 in the error_handler subroutine:

   if (task_number == 0) then
      write(taskstr, '(a)' ) "PE 0: "
   else
      write(taskstr, '(a,i5,a)' ) "PE ", task_number, ": "

i.e.

   if (task_number == 0) then
      write(taskstr, '(a)' ) "PE     0: "
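
With that padding, the prefixes would line up, roughly like this (spacing illustrative):

   PE     0: Processing observation       10 of      40
   PE     2: Processing observation       20 of      40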

mjs2369 requested a review from hkershaw-brown, July 10, 2025 20:36
hkershaw-brown (Member) commented:

> A few extra notes:
>
>   1. The observations are out of order while being printed (especially when using lots of MPI procs) - currently the prints look like this with MOM6 on 256 procs
>      I could force the pmo prints to go in order, but I would have to sync across tasks, so I don't think we want to go for that...

Good call, no need to waste core-hours syncing.

>   2. The print statements are slightly mismatched in how many spaces there are on task 0 vs. all the other tasks. Here's an example with lorenz_63
>
> We could fix this by adding the same number of spaces to line 780 as there are on line 782 in the error_handler subroutine:
>
>    if (task_number == 0) then
>       write(taskstr, '(a)' ) "PE 0: "
>    else
>       write(taskstr, '(a,i5,a)' ) "PE ", task_number, ": "
>
> i.e.
>
>    if (task_number == 0) then
>       write(taskstr, '(a)' ) "PE     0: "

Gut feeling is to leave this alone so the PE 0: prints stand out.

I think most of the time people will set either -1 (to not have printouts) or some n comparable to the total obs, e.g. for 50,000 obs, print every 5000 obs. It's a sanity check to see that PMO is doing something, or to see where it got to when a job ran out of wall clock.
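
For instance, a typical sanity-check setting in the pmo namelist might be (value illustrative):

   &perfect_model_obs_nml
      print_every_nth_obs = 5000   ! one progress line per 5000 obs; -1 for none
      /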

Comment on lines 497 to 505
   if(task_count() == 1) then
      write(msgstring, '(A,1x,I8,1x,A,I8)') 'Processing observation ', j, &
         ' of ', num_obs_in_set
      call trace_message(msgstring, 'perfect_model_obs:', -1)
   else
      write(msgstring, '(A,1x,I8,1x,A,I8)') 'Processing observation ', global_obs_num, &
         ' of ', num_obs_in_set
      call error_handler(E_ALLMSG, 'perfect_model_obs:', trim(msgstring))
   endif
hkershaw-brown (Member) commented:

This is neat.

I was wondering if you could drop the if statement and use E_ALLMSG for both cases, since j == global_obs_num for task_count() == 1.

But I have not looked through why trace_message is used rather than error_handler(E_MSG, ...).
Is the -1 there to stop silence=.true. from suppressing the message?

Side note: there is a bunch of replicated code (trace_message, set_trace) between pmo, filter, and other modules.

mjs2369 (Contributor, Author) replied:


> I was wondering if you could drop the if statement and use E_ALLMSG for both cases, since j == global_obs_num for task_count() == 1.
>
> But I have not looked through why trace_message is used rather than error_handler(E_MSG, ...). Is the -1 there to stop silence=.true. from suppressing the message?

That's a good point. I think it would be better to use E_ALLMSG in both cases. I'll make this change.
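
Something like this, I think, once pmo uses global_obs_num for the single-task case as well (sketch of the merged version):

   write(msgstring, '(A,1x,I8,1x,A,I8)') 'Processing observation ', global_obs_num, &
      ' of ', num_obs_in_set
   call error_handler(E_ALLMSG, 'perfect_model_obs:', trim(msgstring))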

I was also wondering about the general need for trace_message given that we have the error_handler. trace_message and error_handler are basically doing the same thing in slightly different ways, so we could consider switching the places that use trace_message over to the error_handler altogether.

But like you said, trace_message is replicated in pmo and filter_mod.f90 (and models/coamps_nest/coamps_util_mod.f90, though I don't think that one sees a lot of use). So at the very least I think we should move trace_message to utilities_mod. Should I go ahead and make an issue about this and its overlap with the error_handler?

Finally, the -1 in call trace_message(msgstring, 'perfect_model_obs:', -1) is simply there to force the message to print even if silence is set to .true. in the namelist.

hkershaw-brown (Member) commented Jul 15, 2025:


> Should I go ahead and make an issue about this and its overlap with the error_handler?

Yup, make an issue. I wouldn't go ahead and make any code changes to the various trace message calls yet; let's collect all the various trace message stuff first. E.g. there is a totally unused routine, progress:

   subroutine progress(msg, when, dotime, label, threshold, sync) ! trace plus timestamp

which seems to be yet another trace/timestamp message routine.

hkershaw-brown (Member) commented:

The window printing is broken too:

#905 (comment)

mjs2369 added 2 commits, July 16, 2025 12:22
Commit: …rints: using global_obs_num regardless of number of tasks since j==global_obs_num for task_count()==1. NOTE: now that we are not using trace_message, these prints will NOT execute anymore if the namelist option silence is set to true
hkershaw-brown (Member) commented:

nice double bug fix Marlee!
