EAMxx: fix COSP fill value leak and add dimension coordinates for cosp diagnostics #7992

Copilot · 2026-01-15T16:35:34Z

This pull request introduces improvements to the handling of masked diagnostic fields and remapping operations in the EAMxx codebase. The main changes ensure that masked values are consistently set to a fill value, propagate mask metadata through initialization and remapping, and add COSP dimension geometry data needed for diagnostics. These updates improve data integrity and compatibility with downstream analysis tools.

Mask handling and propagation improvements:

Added support for setting and propagating the "mask_value" metadata and the may_be_filled flag in diagnostic field headers during initialization and remapping, ensuring that masked diagnostics are correctly identified and handled throughout the workflow. [1] [2] [3] [4]
Updated remapping routines (coarsening_remapper.cpp) to set output values to fill_value when the mask indicates no valid data, and to rescale only valid entries. This change applies to all supported field ranks and ensures that fill values are respected during local matrix-vector operations. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]

COSP diagnostic geometry data additions:

In eamxx_cosp.cpp, added standard COSP dimension coordinate values and bounds (tau, pressure, and height bins) as geometry data on the grid, which are required for producing COSP diagnostics with e3sm_diags. [1] [2]

Radiation diagnostic field initialization:

Moved zeroing of optical depth output fields (dtau067, dtau105) in RRTMGP radiation to occur only on radiation timesteps, preventing stale values from being carried over and ensuring correct accumulation for COSP diagnostics. [1] [2]

Miscellaneous improvements:

Ensured that the mask threshold for rescaling is consistently defined and documented, clarifying its use for masking out columns with negligible contributions.

These changes collectively improve the robustness, accuracy, and interoperability of diagnostic and remapping operations in EAMxx.COSP outputs (ISCCP, MODIS, MISR) showed values of 10³¹-10³² instead of proper masking. Investigation revealed four interconnected issues where fill values (~3.4×10³³) were being accumulated and divided during horizontal remapping and temporal averaging, producing the observed large values. Additionally, COSP dimension coordinates were missing from output files, preventing e3sm_diags from producing COSP diagnostics.

Fixes #7983
Fixes #7976
Fixes #7840

Root Causes (Fill Value Issue)

1. Coarsening remapper rescale bug
For rank-2/3/4 fields with 1D column masks (like sunlit_mask), masked columns were left with accumulated weighted sums instead of being set to fill_value:

// Before (rank-2 case)
if (mask1d) {
  auto mask = mask_1d(icol);
  if (mask > mask_threshold) {
    x_sub(j) /= mask;  // Divide unmasked
  }
  // BUG: No else branch - masked columns retain large accumulated values
}

// After
if (mask1d) {
  auto mask = mask_1d(icol);
  if (mask > mask_threshold) {
    x_sub(j) /= mask;
  } else {
    x_sub(j) = fill_val;  // Set masked columns to fill_value
  }
}

2. Missing mask_value propagation

COSP outputs set mask_field and may_be_filled but not mask_value
Remappers didn't propagate mask_value to target fields
Diagnostics (field_at_pressure_level, aodvis) missing mask_value

3. IO field transfer gap
IO fields received may_be_filled but not mask_value, causing fill-aware update operations during temporal averaging to use wrong fill value for comparison, accumulating fill values instead of skipping them.

4. Critical: Mask threshold too small (Root cause of persistent issue)
The mask threshold in rescale_masked_fields was std::numeric_limits<Real>::epsilon() (~1e-15). When horizontal remapping creates target cells with contributions from mixed masked/unmasked sources, the remapped mask becomes a small fractional value (e.g., 0.01 if 99% of sources are masked). Since 0.01 > epsilon, the code divided by 0.01, amplifying any numerical errors or stray fill values by 100x, producing the observed 10³¹-10³² values. Changed threshold to 0.5, so only target cells where >50% of contributing sources (by weight) are unmasked get rescaled; others are set to fill_value.

Changes

Fill value fixes

Remapper fixes (41 lines)

coarsening_remapper.cpp: Added else branches in rescale_masked_fields() for rank-2/3/4 to set fill_value when masked
coarsening_remapper.cpp: Propagate mask_value and may_be_filled from source to target
coarsening_remapper.cpp: Changed mask threshold from epsilon to 0.5 (critical fix)
vertical_remapper.cpp: Propagate mask_value when creating masks and from source fields

Metadata initialization (4 lines)

eamxx_cosp.cpp: Set mask_value for COSP output fields
field_at_pressure_level.cpp: Set mask_value for diagnostic output
aodvis.cpp: Set mask_value and may_be_filled for diagnostic output

IO transfer fix (5 lines)

scorpio_output.cpp: Transfer mask_value to IO fields in transfer_extra_data()

COSP dimension coordinates (133 lines)

Added dimension values and bounds as grid geometry data for e3sm_diags compatibility:

cosp_tau / cosp_tau_bnds: 7 optical depth bins (standard ISCCP/MODIS)
- Bins: 0.3-1.3, 1.3-3.6, 3.6-9.4, 9.4-23, 23-60, 60-379, 379-100000
- Centers: geometric mean (log-space midpoint)
cosp_prs / cosp_prs_bnds: 7 pressure bins (standard ISCCP)
- Bins: 50-180, 180-310, 310-440, 440-560, 560-680, 680-800, 800-1000 hPa
- Centers: arithmetic mean
cosp_cth / cosp_cth_bnds: 16 height bins (standard MISR)
- Bins: 0-500m, 500-1000m, ..., 19-21 km (varying widths per CFMIP/COSP standard)
- Centers: arithmetic mean

Implementation follows the same pattern as RRTMGP bands - dimensions are added as grid geometry data and IO automatically writes them to output files.

Why all four fixes are required

The hbc2 branch had fixes #1 and #2 but still exhibited the issue because fixes #3 and #4 were missing. Fix #3 ensures proper fill-aware accumulation during temporal averaging. Fix #4 (the critical mask threshold change) prevents division by tiny mask values during horizontal remapping, which was the root cause of the persistent large values. The pattern of "some values properly masked, others showing large numbers" occurred because target cells with all masked sources had mask=0 (properly set to fill_value), while target cells with mostly-but-not-all masked sources had tiny positive mask values (0.01-0.1), causing division that amplified errors to produce 10³¹-10³² values.

Files modified: 6 files, ~185 insertions

Original prompt

This section details on the original issue you should resolve

<issue_title>EAMxx: extremely large values in COSP outputs</issue_title>
<issue_description>The output values from the ISCCP, MODIS, and MISR simulators are unusually large. They are in the order of 10^31.

Example file: /pscratch/sd/t/terai/e3sm_scratch/pm-gpu/ne256pg2_ne256pg2.F20TR-SCREAMv1.20251031.spanc800.2xauto.acc150.n0032.test2.1/mon_climo/rgr/eamxx_decadal_ANN_199501_200412_climo.nc
</issue_description>

<agent_instructions>In this issue, I see that something "is leaking" the large mask values. Could you help investigate this? Some ideas are here:

note this intriguing behavior:

ncdump -v isccp_cldtot /global/cfs/cdirs/e3sm/beydoun/ne1024pg2_ne1024pg2.F2010-SCREAMv1.10022025.CESS2.P0K.n2048.2/run/1ma_ne30pg2.AVERAGE.nmonths_x1.2020-01-01-00000.nc
...
1.008378e+32, 1.249829e+32, 1.745649e+32, 8.883525e+31, 1.462284e+32,
1.589558e+32, 1.65468e+32, 1.309997e+32, 1.851016e+32, 1.866785e+32,
1.175839e+32, 1.010013e+32, 1.607528e+32, 1.294543e+32, 8.548791e+31,
8.702407e+31, 9.772127e+31, 9.140188e+31, 1.057168e+32, 4.068616e+31,
7.526443e+31, 9.516339e+31, 1.407121e+32, 1.613372e+32, 1.11413e+32,
1.266954e+32, 1.738864e+32, 1.978634e+32, 1.420691e+32, 1.677565e+32,
1.912431e+32, 1.618984e+32, 1.868975e+32, 1.991057e+32, 1.333078e+32,
1.085345e+32, 2.01006e+32, 1.667008e+32, 8.523797e+31, 7.273206e+31,
1.251451e+32, 9.83513e+31, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
...
}
which is also the case in the runs referened in this issue. This means, our fillval treatment is working for some cases, but not others. Some mathematical operation somewhere is "leaking" --- only two options I can think (given the above is fixed):

the templated IO update methods
the horiz remap
the rrtmgp-provided sunlight mask and cosp are out of sync and things leak all over the place

Name your branch copilot/eamxx/cosp-fill-val' and think hard and detail all your reasoning. Ping mahf708 and bartgol when you have some ideas and solutions. Good luck! Don't worry about testing, just think and plan very hard and keep revisiting your assumptions and conclusions. Don't give up easily!</agent_instructions>

Comments on the Issue (you are @copilot in this section)

@zyuying @crterai will add more details. @bartgol What's the code version/sha?

This is likely due to some FillValue entries that end up being averaged.</comment_new>
<comment_new>@crterai

What's the code version/sha?

@bartgol - This was from a 5-year test after we made the eamxx-v0.1.0 hash, so the sha is 31de276976. I can check to see if other masked fields exhibit similar behavior. </comment_new>
<comment_new>@bartgol
No, this is likely related to cosp, not our handling of masked fields (though never say never). Do you notice any (time) pattern in the large values? E.g., just a few steps after init/restart, or every N steps, etc?</comment_new>
<comment_new>@crterai
I just checked z_mid_at_1000hPa from the same file, which also gets masked, but its maximum is reasonable. Check below:

import numpy as np
import xarray as xr

datafile='/pscratch/sd/t/terai/e3sm_scratch/pm-gpu/ne256pg2_ne256pg2.F20TR-SCREAMv1.20251031.spanc800.2xauto.acc150.n0032.test2.1/run/1ma_ne30pg2.AVERAGE.nmonths_x1.1995-01-01-00000.nc'
f_1031=xr.open_dataset(datafile)

isccp_cldtot=f_1031['isccp_cldtot'].values
z1000=f_1031['z_mid_at_1000hPa'].values

max_isccp_cldtot=np.nanmax(isccp_cldtot)
print(max_isccp_cldtot)  #prints 2.9412602e+32

max_z1000=np.nanmax(z1000)
print(max_z1000)  #prints 312.21646
```</body></comment_new>
<comment_new><author>@bartgol</author><body>
I don't really have the cycles to check cosp right now, but it seems like a cosp issue. I would think it's something that can be reproduced at ne4 (using the same rad/cosp frequency settings)</body></comment_new>
<comment_new><author>@brhillman</author><body>
I think this is a COSP issue as well (or rather, our use of COSP). I recall some discussion a little while ago regarding COSP and EAMxx using incompatible fill values in certain cases. @mahf708 do you recall the details?</body></comment_new>
<comment_new><author>@mahf708</author><body>
I thought we fixed all of this... maybe this leftover issue is causing something? https://github.com/E3SM-Project/E3SM/issues/7840</body></comment_new>
<comment_new><author>@mahf708</author><body>
We MIGHT also need this line in cosp because of inconsistent fi...

</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes E3SM-Project/E3SM#7983

<!-- START COPILOT CODING AGENT TIPS -->
---

✨ Let Copilot coding agent [set things up for you](https://github.com/E3SM-Project/E3SM/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot) — coding agent works faster and does higher quality work when set up for your repo.

Co-authored-by: mahf708 <[email protected]>

This fixes the actual root cause of large COSP values. When rescaling fields with 1D masks (like sunlit_mask), masked columns (mask<=threshold) were not being set to fill_value for rank-2, rank-3, and rank-4 fields. This caused large fill values to remain in the output instead of being properly masked. The fix adds else branches to set all components to fill_val when mask indicates the column is masked, matching the behavior of the rank-1 case. Co-authored-by: mahf708 <[email protected]>

This was the actual root cause! IO fields were getting may_be_filled but NOT mask_value, causing fill-aware update operations to fail. Without mask_value, the field update methods couldn't identify which values were fill values, leading to large fill values being accumulated and divided during temporal averaging, producing the observed 10^31-10^32 values. Co-authored-by: mahf708 <[email protected]>

chengzhuzhang · 2026-01-15T17:47:47Z

@mahf708 thanks for working on this. It might make sense to also resolve the cosp output metadata issue, here: #7976, your bot friend can help too? Also the same reproducer can be used for this issue.

mahf708 · 2026-01-15T17:55:31Z

Yes will add metadata before we merge, that should be pretty easy

mahf708 · 2026-01-15T17:57:51Z

@copilot Looks like you're introducing a build error:

/home/runner/_work/E3SM/E3SM/components/eamxx/src/share/io/scorpio_output.cpp(41): error: identifier "Real" is undefined
auto src_mask_value = src.get_header().get_extra_data("mask_value");
^

1 error detected in the compilation of "/home/runner/_work/E3SM/E3SM/components/eamxx/src/share/io/scorpio_output.cpp".
gmake[2]: *** [src/share/io/CMakeFiles/eamxx_io.dir/build.make:118: src/share/io/CMakeFiles/eamxx_io.dir/scorpio_output.cpp.o] Error 2

Also, could you solve this issue while at it? #7976

he COSP output from EAMxx currently doesn't have dimension values and bound values, specifically for cosp_prs, cosp_prs_bnds, cosp_tau, cosp_tau_bnds, cosp_cth, and cosp_cth_bnds. These dimensions are needed in order for e3sm_diags to produce COSP diagnostics.

We added the sw and lw bands dims to the phys grid in the rrtmgp process. Something like that can be done in the cosp process. IO will later write those to file automatically.

- Fix build error: qualify Real type with scream:: namespace in scorpio_output.cpp - Add COSP dimension values and bounds (cosp_tau, cosp_prs, cosp_cth) as geometry data - Uses standard ISCCP/MODIS/MISR bin definitions for e3sm_diags compatibility - Tau bins: log-spaced optical depth bins (0.3 to 100000) - Pressure bins: 7 bins from 50-1000 hPa - Height bins: 16 MISR bins from 0-21 km Co-authored-by: mahf708 <[email protected]>

Copilot · 2026-01-15T18:06:33Z

@copilot Looks like you're introducing a build error:

/home/runner/_work/E3SM/E3SM/components/eamxx/src/share/io/scorpio_output.cpp(41): error: identifier "Real" is undefined...

Fixed both issues in commit f5672ec:

Build error: Qualified Real type with scream:: namespace in scorpio_output.cpp since the function is in an anonymous namespace before the scream namespace.
COSP dimensions (issue EAMxx: missing dimension values to COSP outputs #7976): Added cosp_tau, cosp_prs, and cosp_cth dimension values and bounds as grid geometry data. Uses standard ISCCP/MODIS/MISR bin definitions compatible with CFMIP/COSP standards. The dimensions are:
- cosp_tau: 7 optical depth bins (0.3 to 100000, log-spaced centers)
- cosp_prs: 7 pressure bins (50-1000 hPa)
- cosp_cth: 16 height bins (0-21 km)

IO will automatically write these to output files for e3sm_diags.

mahf708 · 2026-01-15T20:02:14Z

things look potentially good, but we need to verify stuff, see below for info from logs

click to see diffs

 isccp_ctptau   (cosp_prs,cosp_tau,ncol,time)  t_index =      1     1
 WARNING: Fill patterns differ between files
 FILLDIFF isccp_ctptau                    
                4704  (     1,     3,    18,     1) (     1,     1,    18,     1)
                2793   0.100000000000000E+03          0.000000000000000E+00
                4704   0.100000000000000E+03          0.000000000000000E+00
                4704  (     1,     3,    18,     1) (     1,     1,     1,     1)
          avg abs field values:    1.871715188026428E+00
                                   1.111330986022949E+00
 modis_ctptau   (cosp_prs,cosp_tau,ncol,time)  t_index =      1     1
 WARNING: Fill patterns differ between files
 FILLDIFF modis_ctptau                    
                4704  (     4,     2,    77,     1) (     1,     1,    18,     1)
                2793   0.992825012207031E+02          0.000000000000000E+00
                4704   0.992825012207031E+02          0.000000000000000E+00
                4704  (     4,     2,    77,     1) (     1,     1,     1,     1)
          avg abs field values:    1.315781712532043E+00
                                   7.812454104423523E-01
 misr_cthtau   (cosp_cth,cosp_tau,ncol,time)  t_index =      1     1
 WARNING: Fill patterns differ between files
 FILLDIFF misr_cthtau                     
               10752  (     6,     3,    18,     1) (     1,     1,    18,     1)
                6384   0.100000000000000E+03          0.000000000000000E+00
               10752   0.100000000000000E+03          0.000000000000000E+00
               10752  (     6,     3,    18,     1) (     1,     1,     1,     1)
          avg abs field values:    6.390890479087830E-01
                                   3.794591128826141E-01
 isccp_cldtot   (ncol,time)  t_index =      1     1
                  96  (    18,     1) (    74,     1)
                  57   0.100000000000000E+03   0.717498838901520E+00
                  57   0.100000000000000E+03   0.717498838901520E+00
                  96  (    18,     1) (    74,     1)
          avg abs field values:    6.760433197021484E+01
                                   6.760433197021484E+01
 avg_count_ncol_lev   (lev,ncol,time)  t_index =      1     1
               12288  (     1,     1,     1) (     1,     1,     1)
               12288   1.000000000000000E+00   1.000000000000000E+00
               12288   1.000000000000000E+00   1.000000000000000E+00
               12288  (     1,     1,     1) (     1,     1,     1)
          avg abs field values:    1.000000000000000E+00
                                   1.000000000000000E+00
 avg_count_ncol   (ncol,time)  t_index =      1     1
                  96  (     1,     1) (     1,     1)
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96  (     1,     1) (     1,     1)
          avg abs field values:    1.000000000000000E+00
                                   1.000000000000000E+00
 avg_count_700hPa_ncol   (ncol,time)  t_index =      1     1
                  96  (     1,     1) (     1,     1)
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96  (     1,     1) (     1,     1)
          avg abs field values:    1.000000000000000E+00
                                   1.000000000000000E+00
 avg_count_850hPa_ncol   (ncol,time)  t_index =      1     1
                  96  (     1,     1) (     1,     1)
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96  (     1,     1) (     1,     1)
          avg abs field values:    1.000000000000000E+00
                                   1.000000000000000E+00
 avg_count_1000hPa_ncol   (ncol,time)  t_index =      1     1
                  96  (     1,     1) (     8,     1)
                  96   1.000000000000000E+00   0.000000000000000E+00
                  96   1.000000000000000E+00   0.000000000000000E+00
                  96  (     1,     1) (     8,     1)
          avg abs field values:    7.916666865348816E-01
                                   7.916666865348816E-01
 avg_count_500hPa_ncol   (ncol,time)  t_index =      1     1
                  96  (     1,     1) (     1,     1)
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96   1.000000000000000E+00   1.000000000000000E+00
                  96  (     1,     1) (     1,     1)
          avg abs field values:    1.000000000000000E+00
                                   1.000000000000000E+00
 avg_count_isccp_ctptau_ncol_cosp_tau_cosp_prs   (cosp_prs,cosp_tau,ncol,time)  t_index =      1     1
       1911     4704  (     1,     1,    18,     1) (     1,     1,     1,     1) (     1,     1,     1,     1) 
                4704  0.100000000000000E+01  0.000000000000000E+00  0.100000000000000E+01       0  0.000000000000000E+00  0.000000000000000E+00
                4704  0.100000000000000E+01  0.100000000000000E+01                              1                         0.000000000000000E+00
                4704  (     1,     1,     1,     1) (     1,     1,     1,     1)
 RMS avg_count_isccp_ctptau_ncol_cosp 6.3738E-01            NORMALIZED  7.9985E-01

 avg_count_modis_ctptau_ncol_cosp_tau_cosp_prs   (cosp_prs,cosp_tau,ncol,time)  t_index =      1     1
       1911     4704  (     1,     1,    18,     1) (     1,     1,     1,     1) (     1,     1,     1,     1) 
                4704  0.100000000000000E+01  0.000000000000000E+00  0.100000000000000E+01       0  0.000000000000000E+00  0.000000000000000E+00
                4704  0.100000000000000E+01  0.100000000000000E+01                              1                         0.000000000000000E+00
                4704  (     1,     1,     1,     1) (     1,     1,     1,     1)
 RMS avg_count_modis_ctptau_ncol_cosp 6.3738E-01            NORMALIZED  7.9985E-01

 avg_count_misr_cthtau_ncol_cosp_tau_cosp_cth   (cosp_cth,cosp_tau,ncol,time)  t_index =      1     1
       4368    10752  (     1,     1,    18,     1) (     1,     1,     1,     1) (     1,     1,     1,     1) 
               10752  0.100000000000000E+01  0.000000000000000E+00  0.100000000000000E+01       0  0.000000000000000E+00  0.000000000000000E+00
               10752  0.100000000000000E+01  0.100000000000000E+01                              1                         0.000000000000000E+00
               10752  (     1,     1,     1,     1) (     1,     1,     1,     1)
 RMS avg_count_misr_cthtau_ncol_cosp_ 6.3738E-01            NORMALIZED  7.9985E-01

 avg_count_isccp_cldtot_ncol   (ncol,time)  t_index =      1     1
                  96  (    18,     1) (     1,     1)
                  96   1.000000000000000E+00   0.000000000000000E+00
                  96   1.000000000000000E+00   0.000000000000000E+00
                  96  (    18,     1) (     1,     1)
          avg abs field values:    5.937500000000000E-01
                                   5.937500000000000E-01
************************************************************************************************************************************
  
SUMMARY of cprnc:
 A total number of    104 fields were compared
          of which      3 had non-zero differences
               and      3 had differences in fill patterns
               and      0 had different dimension sizes
               and      0 had different data types
 A total number of      0 fields could not be analyzed
 A total number of      0 time-varying fields on file 1 were not found on file 2.
 A total number of      6 time-constant fields on file 1 were not found on file 2.
 A total number of      0 time-varying fields on file 2 were not found on file 1.
 A total number of      0 time-constant fields on file 2 were not found on file 1.
  diff_test: the two files seem to be DIFFERENT

click to see ncdump of dims

netcdf ERS.ne4pg2_ne4pg2.F2010-SCREAMv1.ghci-snl-cpu_gnu.eamxx-prod.C.20260115_185451_xpl66u.scream.hourlyAVG_coarse.h.AVERAGE.nhours_x1.1994-10-10-00000 {
dimensions:
	time = UNLIMITED ; // (1 currently)
	dim2 = 2 ;
	cosp_cth = 16 ;
	cosp_prs = 7 ;
	cosp_tau = 7 ;
	lev = 128 ;
	ncol = 96 ;
	ilev = 129 ;
	lwband = 16 ;
	nbnd = 2 ;
	swband = 14 ;

...

 cosp_cth = 250, 750, 1250, 1750, 2250, 2750, 3500, 4500, 6000, 8000, 10000, 
    12000, 14000, 16000, 18000, 20000 ;

 cosp_prs = 115, 245, 375, 500, 620, 740, 900 ;

 cosp_tau = 0.6244998, 2.163331, 5.817216, 14.70374, 37.14835, 150.7979, 
    6156.297 ;
}

mahf708 · 2026-01-16T01:45:00Z

@copilot

I did some testing and I am fairly certain that the issue is in the horizontal remap part of the code and it is still there. I ran some tests and I noticed that when the output isn't horizontally remapped, the output looks fine. But if the output is horizontally remapped, then the large values still persist. Furthermore, my last commit beaefe5 adds a necessary code mod to ensure the optical depths persist in the radiation code. This points to another problem in our design that you could help us document or review. Basically, we rely on the fields to persist in the field manager for them to be read by other processes; for some reason, the radiation code was deep-copying 0.0 into the optical depths. Can you investigate of this deep copy is needed?

In the horizontal coarsening remap, could you please review the entire logic carefully for any mismatches? We are still leaking large values somewhere and not carefully accounting for the masks. Focus exclusively on components/eamxx/src/share/remap/coarsening_remapper.cpp. Think really hard. Don't give us easily. There's definitely a bug.

The info below may help you to see how isccp_cldtot (which is supposed to be a percentage, %) looks like when coarsened; again note how some values are properly treated but others are not --- this potentially points to a discrepancy in how the horizontal remap is treating different masked values (e.g., if all target values are masked or not --- maybe if a few cells are all NaNs/masked, then it all works well, but if there is a mix of values, things go off the rails)

isccp_cldtot =
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 1.701412e+33,
_, 1.701412e+33, 1.701412e+33, 1.364028e+32, 1.701412e+33, 3.309753e+32,
93.91941, 96.05391, 40.39294, 45.97319, _, _, _, _, _, _, _, _,
1.701412e+33, 3.872603e+32, 1.701412e+33, 5.755756e+32, 22.27463,
23.60565, 36.47368, 56.96917, _, _, _, _, _, _, _, _, _, 5.755756e+32,
1.701412e+33, 7.546062e+32, 26.80326, 57.14449, 12.35403, 44.06274, _, _,
_, _, _, _, _, _, 1.701412e+33, 7.89982e+32, _, 9.63944e+32, 8.732239,
25.99817, 74.34226, 33.38969, 54.88736, 90.38074, 20.60083, 19.8836,
68.65099, 80.67956, 40.64415, 47.32623, 95.08051, 99.71756, 65.71305,
31.54373, 88.3912, 73.33278, 25.66649, 70.92752, 32.17969, 74.51997,
77.78539, 74.02736, 85.47194, 84.71778, 57.44799, 57.98856, 15.17536,
7.358328, 31.92928, 11.27876, 12.38581, 4.114276, 8.279299, 8.271542,
46.06485, 29.7214, 74.54605, 35.01381, 19.88473, 16.36491, 38.32788,
45.82713, 6.542313, 14.16418, 45.51407, 25.68231, 13.2193, 37.60598,
23.79226, 33.42392, 17.05432, 31.42703, 44.39347, 86.41835, 45.10247,
36.88676, 67.36785, 70.64296, 32.44603, 42.72587, 71.04965, 77.7225,
17.47783, 60.15537, 45.01313, 75.41303, 39.53461, 87.11415, 47.46715,
57.43027, 99.10325, 100, 64.62636, 73.04189, 100, 97.16084, 76.00288,
55.97907, _, _, _, _, 7.341059, 11.63626, 40.30424, 60.75043, 32.53549,
62.07098, 66.38801, 96.43452, 95.58088, 75.82838, 98.22308, 62.57755, _,
_, _, _, 78.43894, 95.16778, 53.71519, 60.39754, 77.89319, 97.9471,
83.58394, 70.65243, 84.28448, 96.7935, 47.52602, 83.19436, _, _, _, _,
99.9996, 50.48964, 50.91905, 22.02592, 41.9738, 26.93308, 33.67347,
31.21578, 24.74109, 8.618577, 57.64242, 87.04832, _, _, _, _, 92.22855,
53.1194, 96.6814, 99.99784, 80.30695, 94.94872, 99.12895, 98.19949,
78.48791, 98.74047, 89.98596, 100, 73.64215, 69.92416, 100, 89.8603,
85.0346, 100, 96.49594, 100, 79.16296, 87.42142, 45.92756, 37.71405,
82.33031, 85.13027, 68.79534, 2.093086e+32, 100, 1.038911e+31,
7.612763e+32, 1.367592e+33, 98.26814, 100, 100, 100, 75.74164, 45.60859,
_, _, 6.773856e+32, _, _, _, _, 1.701412e+33, _, _, _, _, _, _, _, _, _,
_, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _,
_, 100, _, 72.86256, 91.93158, _, _, 100, _, _, _, _, _, _, _, _, _,
81.71986, 97.24214, 85.59537, 86.16579, 100, 95.74481, 98.01089,
99.30118, 3.717127e+32, 9.84779e+32, 100, 41.02747, 1.61008e+33,
1.701412e+33, 9.045025e+31, 6.211932e+32, 10.72145, 44.73898, 37.11776,
66.41462, 100, 96.98761, 61.3551, 98.60334, 97.34468, 66.48096, 100,
72.11745, 72.65506, 45.28245, 98.9495, 85.80124 ;

Change mask threshold from epsilon (~1e-15) to 0.5 in rescale_masked_fields. This prevents division by very small mask values which amplify numerical errors. Root cause: When remapping fields with masks (like COSP sunlit_mask), the remapped mask represents the weighted fraction of unmasked sources. Using epsilon threshold meant even target cells with tiny mask values (e.g. 0.01 from 99% masked sources) would divide by that tiny value, amplifying any stray fill values or numerical errors to produce huge spurious results (10^31-10^32 range observed). With 0.5 threshold: Only target cells where >50% of contributing sources (by weight) are unmasked will be rescaled. Otherwise set to fill_value. This ensures we only compute averages when we have meaningful data. Co-authored-by: mahf708 <[email protected]>

Copilot · 2026-01-16T01:49:45Z

@copilot

I did some testing and I am fairly certain that the issue is in the horizontal remap part of the code and it is still there. I ran some tests and I noticed that when the output isn't hori...

Found the bug in commit 25a1e96!

Root cause: The mask threshold in rescale_masked_fields was std::numeric_limits<Real>::epsilon() (~1e-15). When horizontal remapping creates target cells with contributions from mixed masked/unmasked sources, the remapped mask becomes a small fractional value (e.g., 0.01 if 99% of sources are masked). Since 0.01 > epsilon, we divided by 0.01, amplifying any numerical errors or stray fill values by 100x, producing the 10^31-10^32 values you observed.

The fix: Changed threshold to 0.5. Now only target cells where >50% of contributing sources (by weight) are unmasked get rescaled; others are set to fill_value. This matches your observation that "if all target values are masked or not" it worked - the bug only affected cells with mostly-but-not-entirely masked sources.

Why values like 1.701412e+33 appeared: This is fill_value/2, suggesting division by ~2 when the remapped mask was ~0.5 (half masked sources).

The optical depth deep_copy issue you fixed in beaefe5 was necessary but separate - it ensured fields persist, but didn't solve the remapping amplification bug.

The rescale_masked_fields function had completely inverted logic for rank-2, rank-3, and rank-4 fields with 2D masks: - It was setting VALID (unmasked) values to fill_val - Instead of setting INVALID (masked) values to fill_val This caused large fill values (1.7e33) to leak into horizontally remapped COSP outputs. The bug only manifested when there was a MIX of masked/unmasked cells in the coarse grid target, explaining why some values were correct while others showed huge fill values. The fix: 1. Rename 'masked' -> 'valid' for clarity (mask > threshold means data is valid) 3. Then rescale VALID entries (overwrites those positions) 4. Remove unnecessary conditional check Also updated radiation code TODOs to document why the dtau067/dtau105 deep_copy is necessary: these fields must be zeroed before recomputation on radiation timesteps, but persist between radiation calls for COSP to use.

The actual root cause of large COSP values: when source fields contain fill_val at masked locations, local_mat_vec was blindly multiplying them by the mask and including them in the weighted sum. Even with mask=0, floating point arithmetic can cause fill_val * tiny_mask to produce huge spurious values. The fix: Check if source value equals fill_val BEFORE multiplying by mask. If it's fill_val, treat as zero contribution (skip it entirely). Applied to all ranks (1-4) in local_mat_vec function. Also reverted mask_threshold back to epsilon - the 0.5 threshold was too aggressive and discarded valid data. The threshold of 0.5 was a band-aid for the inverted logic bug we fixed earlier.

mahf708 · 2026-01-16T02:33:35Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

  m_z_mid.allocate_view();
  m_z_int.allocate_view();
+
+  // Add COSP dimension coordinate values and bounds as geometry data


no idea where the bot came up with these values --- good job bot for either finding the real ones or making up convincing looking ones!

mahf708 · 2026-01-16T02:34:06Z

components/eamxx/src/physics/rrtmgp/eamxx_rrtmgp_process_interface.cpp

  if (update_rad) {
+    // Init optical depths to zero before recomputing.
+    // These fields are only updated on radiation timesteps (controlled by update_rad),
+    // and they must be zeroed before accumulation to avoid carrying over stale values.


meh, i don't believe this, but the bot kept me it is the case

If we don't accum into the fields, and the "fix" is not needed, let's remove it before we merge.

Yeah I'm not sure this is right. Fundamentally, we should NOT be calling COSP between radiation calls anyways, and if we do, it looks like zero-ing them out just gives us nonsense anyways.

mahf708 · 2026-01-16T02:34:57Z

components/eamxx/src/share/remap/coarsening_remapper.cpp

some ice gymanstics going on in this file...

mahf708 · 2026-01-16T02:44:45Z

components/eamxx/src/share/remap/coarsening_remapper.cpp

        const auto beg = row_offsets(row);
        const auto end = row_offsets(row+1);
-        y_view(row) = weights(beg)*x_view(col_lids(beg))*mask_view(col_lids(beg));
+        // Check if source value is fill_val - if so, don't include it (treat as 0)


this (also repeated below twice) SEEMS wrong in general, we don't really wanna treat this as 0, do we?

maybe we do, because mask should then be 0, ...

Yes, mask should be 0 where the input is not valid. I think the code is simpler without the if statement, tbh, and just let the code do +=0 if the entry is invalid.

I think what there might be values that are fill_val but not masked; so the check is necessary?

It is either the ordering in 7c4165b or this extra check here 36c310a that ultimately fixed the sneakiest of bugs in this code.

I will reorganize the PR from scratch with a human touch

bartgol

I think the only two real fixes (at least good candidates) are

setting may_be_filled to true on the horiz remapper tgt field if the src field has a mask
in the remapper rescale method, set x=FV if mask<=threshold.

Everything else is not likely to make any difference (and the mask_value business is imho to be removed).

bartgol · 2026-01-16T21:19:01Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+  // Bin edges: 0.3, 1.3, 3.6, 9.4, 23, 60, 379
+  if (not m_grid->has_geometry_data("cosp_tau_bnds")) {
+    FieldLayout tau_bnds_layout({CMP,CMP},{m_num_tau,2},{"cosp_tau","nbnd"});
+    Field tau_bnds(FieldIdentifier("cosp_tau_bnds", tau_bnds_layout, nondim, grid_name));


Could do

auto& tau_bnds = m_grid->create_geometry_data("cosp_tau_bnds", tau_bnds_layout,nondim);

and avoid to have to create/alloc the field manually, and call set_geometry_data later, saving some lines and avoid using lower-level interfaces.

Similarly for all the other geo data.

bartgol · 2026-01-16T21:22:10Z

components/eamxx/src/physics/rrtmgp/eamxx_rrtmgp_process_interface.cpp

  if (update_rad) {
+    // Init optical depths to zero before recomputing.
+    // These fields are only updated on radiation timesteps (controlled by update_rad),
+    // and they must be zeroed before accumulation to avoid carrying over stale values.


If we don't accum into the fields, and the "fix" is not needed, let's remove it before we merge.

bartgol · 2026-01-16T21:24:37Z

components/eamxx/src/share/diagnostics/field_at_pressure_level.cpp

  Field diag_mask(mask_fid);
  diag_mask.allocate_view();
  m_diagnostic_output.get_header().set_extra_data("mask_field",diag_mask);
+  m_diagnostic_output.get_header().set_extra_data("mask_value", constants::fill_value<Real>);


The extra data "mask_value" is no longer used in IO. We only guard against fill value, no other arbitrary "special" value is considered.

And we need to clean up the mask_xyz metadata. We have some "mask_data" and some "mask_field", both referring to the same thing. Not in this PR though.

Uh, I do see the extra data "mask_value" being used in some diags though. I will bump up the priority of cleaning up the mask_xyz ambiguities.

bartgol · 2026-01-16T21:28:23Z

components/eamxx/src/share/io/scorpio_output.cpp

  }
+
+  // Transfer mask_value if present
+  if (src.get_header().has_extra_data("mask_value")) {


It may be pointless, since IO does not use mask_value.

bartgol · 2026-01-16T21:29:43Z

components/eamxx/src/share/remap/coarsening_remapper.cpp

+    }
+    // Also propagate may_be_filled flag
+    if (src.get_header().may_be_filled()) {
+      tgt.get_header().set_may_be_filled(true);


this is the only thing that Field operations care about. If this flag is set, then the field update operations are fillvalue-aware.

bartgol · 2026-01-16T21:35:38Z

components/eamxx/src/share/remap/coarsening_remapper.cpp

+            // First set invalid values to fill_val
+            x_sub(j).set(!valid,fill_val);
+            // Then rescale valid values (overwrites the valid entries that were just set)
+            x_sub(j).set(valid,x_sub(j)/m_sub(j));


ekat::Pack offers a version of set that can do both these at once. You can just do

x_sub(j).set(valid,x_sub(j)/m_sub(j),fill_val);

But the reason why we had the if statement is to avoid division by 0 (while computing x_sub(j)/m_sub(j)) in case m_sub(j)<=mask_threshold everywhere. That's b/c the input to the set call must be evaluated, even if valid is false everywhere.

That explains the FPE error, so we should reinstate the if check.

Is there any reason to believe the order (setting fillval, before setting valid values) here matters?

No, the order of the two is irrelevant, as the two set methods act on separate entries of the Pack. You could also do

#include <ekat_where.hpp> ... where(!valid,x_sub(j)) = fill_val; where(valid,x_sub(j)) /= m_sub(j);

which do not do any div by 0. The ekat::where construct only performs the update where the given mask is 1. Since you do /=, there is no pb evaluating the rhs (it's just 0), and no div happens where the mask is 0.

bartgol · 2026-01-16T21:40:59Z

components/eamxx/src/share/remap/coarsening_remapper.cpp

        const auto beg = row_offsets(row);
        const auto end = row_offsets(row+1);
-        y_view(row) = weights(beg)*x_view(col_lids(beg))*mask_view(col_lids(beg));
+        // Check if source value is fill_val - if so, don't include it (treat as 0)


Yes, mask should be 0 where the input is not valid. I think the code is simpler without the if statement, tbh, and just let the code do +=0 if the entry is invalid.

bartgol · 2026-01-16T21:42:40Z

components/eamxx/src/share/remap/vertical_remapper.cpp

            " - tgt field name: " + tgt.name() + "\n");

        tgt.get_header().set_extra_data("mask_field",mask);
+        tgt.get_header().set_extra_data("mask_value", constants::fill_value<Real>);


Again, we should not bother propagating the mask value. It's always equal to fill value.

Besides, it is not a "mask" value, but rather the value of the field where the mask is 0. The name is quite confusing, imho.

mahf708 · 2026-01-17T00:24:05Z

I think the only two real fixes (at least good candidates) are

setting may_be_filled to true on the horiz remapper tgt field if the src field has a mask

in the remapper rescale method, set x=FV if mask<=threshold.

Everything else is not likely to make any difference (and the mask_value business is imho to be removed).

Well, we need the radiation fix for the zeros problem, and I am fairly certain that either the ordering in 7c4165b or the extra check here 36c310a is the commit that ultimately fixed the sneakiest of bugs in this code.

Thanks for the feedback, I'd like to redo this PR with more of a human touch soon (or if someone decides to do it before me, feel free). The items that we need to modify are:

improve setting attrs for the extra dims (see luca's comment)
fix the FPE by reinstating the if check
remove unnecessary stuff (comments, weird code, etc.) added by the bots during their iterations
entirely remove the deep copy in radiation (don't think it is actually needed)

brhillman · 2026-01-19T23:38:01Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+    tau_bnds_h(2,0) = 3.6;   tau_bnds_h(2,1) = 9.4;
+    tau_bnds_h(3,0) = 9.4;   tau_bnds_h(3,1) = 23.0;
+    tau_bnds_h(4,0) = 23.0;  tau_bnds_h(4,1) = 60.0;
+    tau_bnds_h(5,0) = 60.0;  tau_bnds_h(5,1) = 379.0;


Close but not quite right. These are defined in the COSP source (components/eam/src/physics/cosp2/external/src/cosp_config.F90):

! Optical depth bin axis integer,parameter :: & ntau=7 real(wp),parameter,dimension(ntau+1) :: & tau_binBounds = (/0.0, 0.3, 1.3, 3.6, 9.4, 23., 60., 10000./) real(wp),parameter,dimension(ntau) :: & tau_binCenters = (/0.15, 0.80, 2.45, 6.5, 16.2, 41.5, 100.0/) real(wp),parameter,dimension(2,ntau) :: & tau_binEdges = reshape(source=(/0.0, 0.3, 0.3, 1.3, 1.3, 3.6, 3.6, & 9.4, 9.4, 23.0, 23.0, 60.0, 60.0, 100000.0/), & shape=(/2,ntau/))

brhillman · 2026-01-19T23:42:33Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+    tau.allocate_view();
+    auto tau_h = tau.get_view<Real*,Host>();
+
+    // Geometric mean of bin edges (log-space midpoint)


This would be better to do here, if it were consistent with what COSP does under the hood, but I think the midpoints are generally expected to have the sort of standard ISCCP-defined midpoints, which are specified explicitly in the above code block I copied. Also note that, unfortunately, the MODIS simulator uses a different set (although it's the same thing just with the first index dropped):

! Optical depth bin axis integer,parameter :: & ntau=7 real(wp),parameter,dimension(ntau+1) :: & tau_binBounds = (/0.0, 0.3, 1.3, 3.6, 9.4, 23., 60., 10000./) real(wp),parameter,dimension(ntau) :: & tau_binCenters = (/0.15, 0.80, 2.45, 6.5, 16.2, 41.5, 100.0/) real(wp),parameter,dimension(2,ntau) :: & tau_binEdges = reshape(source=(/0.0, 0.3, 0.3, 1.3, 1.3, 3.6, 3.6, & 9.4, 9.4, 23.0, 23.0, 60.0, 60.0, 100000.0/), & shape=(/2,ntau/)) ! Optical depth bin axes (ONLY USED BY MODIS SIMULATOR IN v1.4) integer :: l,k integer,parameter :: & ntauV1p4 = 6 real(wp),parameter,dimension(ntauV1p4+1) :: & tau_binBoundsV1p4 = (/0.3, 1.3, 3.6, 9.4, 23., 60., 10000./) real(wp),parameter,dimension(2,ntauV1p4) :: & tau_binEdgesV1p4 = reshape(source =(/tau_binBoundsV1p4(1),((tau_binBoundsV1p4(k),l=1,2), & k=2,ntauV1p4),100000._wp/),shape = (/2,ntauV1p4/)) real(wp),parameter,dimension(ntauV1p4) :: & tau_binCentersV1p4 = (tau_binEdgesV1p4(1,:)+tau_binEdgesV1p4(2,:))/2._wp

brhillman · 2026-01-19T23:43:46Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+    auto prs_bnds_h = prs_bnds.get_view<Real**,Host>();
+
+    // Standard ISCCP pressure bin boundaries (hPa)
+    prs_bnds_h(0,0) = 50.0;  prs_bnds_h(0,1) = 180.0;


close but again differs slightly from what is hard-coded in cosp:

! Cloud-top height pressure bin axis integer,parameter :: & npres = 7 real(wp),parameter,dimension(npres+1) :: & pres_binBounds = (/0., 180., 310., 440., 560., 680., 800., 10000./) real(wp),parameter,dimension(npres) :: & pres_binCenters = (/90000., 74000., 62000., 50000., 37500., 24500., 9000./) real(wp),parameter,dimension(2,npres) :: & pres_binEdges = reshape(source=(/100000.0, 80000.0, 80000.0, 68000.0, 68000.0, & 56000.0, 56000.0, 44000.0, 44000.0, 31000.0, & 31000.0, 18000.0, 18000.0, 0.0/), & shape=(/2,npres/))

brhillman · 2026-01-19T23:44:18Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+    m_grid->set_geometry_data(prs_bnds);
+  }
+
+  // COSP pressure centers


Note that pressure centers are also defined explicitly (hard-coded in COSP)

brhillman · 2026-01-19T23:46:53Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+    prs.allocate_view();
+    auto prs_h = prs.get_view<Real*,Host>();
+
+    // Midpoint of pressure bins


I think these will come out the same, but they are defined explicitly in the COSP code, we may as well use those here.

brhillman · 2026-01-20T00:03:12Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+    cth_bnds.allocate_view();
+    auto cth_bnds_h = cth_bnds.get_view<Real**,Host>();
+
+    // Standard MISR height bin boundaries (m) - 16 bins of varying width


Again there are defined MISR CTH bins, and they differ from what is here:

real(wp),parameter,dimension(nhgt+1) :: & hgt_binBounds = (/-.99,0.,0.5,1.,1.5,2.,2.5,3.,4.,5.,7.,9.,11.,13.,15.,17.,99./) real(wp),parameter,dimension(nhgt) :: & hgt_binCenters = 1000*(/0.,0.25,0.75,1.25,1.75,2.25,2.75,3.5,4.5,6.,8.,10.,12., & 14.5,16.,18./) real(wp),parameter,dimension(2,nhgt) :: & hgt_binEdges = 1000.0*reshape(source=(/-99.0, 0.0, 0.0, 0.5, 0.5, 1.0, 1.0, 1.5, & 1.5, 2.0, 2.0, 2.5, 2.5, 3.0, 3.0, 4.0, & 4.0, 5.0, 5.0, 7.0, 7.0, 9.0, 9.0,11.0, & 11.0,13.0,13.0,15.0,15.0,17.0,17.0,99.0/),& shape=(/2,nhgt/))

The first, -99 - 0 bin, is a "no retrieval" bin, indicating cases expected to have no cloud top height retrieval (usually because optical depth is too low I think).

brhillman · 2026-01-20T00:03:36Z

components/eamxx/src/physics/cosp/eamxx_cosp.cpp

+    m_grid->set_geometry_data(cth_bnds);
+  }
+
+  // COSP height centers


Centers hard-coded in COSP code. See above comments.

mahf708 · 2026-01-20T14:02:06Z

Thanks @bartgol + @brhillman!

@brhillman, thanks for catching those unhinged values by the bot :D should we get the values from F90 by importing them through the F90->C++ stuff or by duplicating them through declaring them again on the C++ side? If I get to this before others, I will see what could be done; it would be nice to just import them instead of redeclaring them, but tbd if it is doable.

PR status:

We should redo this PR with more of a human touch soon (anyone is welcome to do it, but I can get to it later if not). The items that we need to modify are:

improve setting attrs for the extra dims (see luca's comment)
improve the actual values of the extra dims (see Bens' comments)
fix the FPE by reinstating the if check
remove unnecessary stuff (comments, weird code, etc.) added by the bots during their iterations
entirely remove the deep copy in radiation (don't think it is actually needed)

Initial plan

bd476dc

Copilot AI assigned Copilot and mahf708 Jan 15, 2026

Copilot started work on behalf of mahf708 January 15, 2026 16:36 View session

Copilot AI and others added 3 commits January 15, 2026 16:54

Add mask_value to COSP outputs and propagate in remappers

109f33e

Co-authored-by: mahf708 <[email protected]>

mahf708 mentioned this pull request Jan 15, 2026

EAMxx: extremely large values in COSP outputs #7983

Open

mahf708 added the CI: approved Allow gh actions PR testing on ghci-snl-* machines label Jan 15, 2026

Copilot AI changed the title ~~[WIP] Investigate extremely large values in COSP outputs~~ Fix COSP fill value leak through remapping and IO averaging Jan 15, 2026

Copilot finished work on behalf of mahf708 January 15, 2026 17:31

Copilot AI requested a review from mahf708 January 15, 2026 17:31

mahf708 requested a review from bartgol January 15, 2026 17:33

Copilot started work on behalf of mahf708 January 15, 2026 17:58 View session

Copilot AI changed the title ~~Fix COSP fill value leak through remapping and IO averaging~~ Fix COSP fill value leak and add dimension coordinates for diagnostics Jan 15, 2026

Copilot finished work on behalf of mahf708 January 15, 2026 18:07

mahf708 added CI: approved Allow gh actions PR testing on ghci-snl-* machines and removed CI: approved Allow gh actions PR testing on ghci-snl-* machines labels Jan 15, 2026

mahf708 added bug fix PR EAMxx C++ based E3SM atmosphere model (aka SCREAM) COSP labels Jan 15, 2026

mahf708 requested review from brhillman, chengzhuzhang and crterai January 15, 2026 20:09

Copilot started work on behalf of mahf708 January 16, 2026 01:45 View session

Copilot finished work on behalf of mahf708 January 16, 2026 01:51

mahf708 added 2 commits January 15, 2026 17:58

mahf708 added CI: approved Allow gh actions PR testing on ghci-snl-* machines and removed CI: approved Allow gh actions PR testing on ghci-snl-* machines labels Jan 16, 2026

mahf708 marked this pull request as ready for review January 16, 2026 02:17

mahf708 assigned bartgol and unassigned mahf708 and Copilot Jan 16, 2026

mahf708 removed their request for review January 16, 2026 02:18

mahf708 changed the title ~~Fix COSP fill value leak and add dimension coordinates for diagnostics~~ EAMxx: fix COSP fill value leak and add dimension coordinates for cosp diagnostics Jan 16, 2026

mahf708 added the Radiation label Jan 16, 2026

mahf708 reviewed Jan 16, 2026

View reviewed changes

components/eamxx/src/share/remap/coarsening_remapper.cpp

Copy link

Contributor

mahf708 Jan 16, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some ice gymanstics going on in this file...

mahf708 reviewed Jan 16, 2026

View reviewed changes

bartgol reviewed Jan 16, 2026

View reviewed changes

brhillman reviewed Jan 19, 2026

View reviewed changes

brhillman reviewed Jan 20, 2026

View reviewed changes

EAMxx: fix COSP fill value leak and add dimension coordinates for cosp diagnostics #7992

Are you sure you want to change the base?

EAMxx: fix COSP fill value leak and add dimension coordinates for cosp diagnostics #7992

Uh oh!

Conversation

Copilot AI commented Jan 15, 2026 • edited by mahf708 Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Root Causes (Fill Value Issue)

Changes

Fill value fixes

COSP dimension coordinates (133 lines)

Why all four fixes are required

Comments on the Issue (you are @copilot in this section)

Uh oh!

chengzhuzhang commented Jan 15, 2026

Uh oh!

mahf708 commented Jan 15, 2026

Uh oh!

mahf708 commented Jan 15, 2026

Uh oh!

Copilot AI commented Jan 15, 2026

Uh oh!

mahf708 commented Jan 15, 2026

Uh oh!

mahf708 commented Jan 16, 2026

Uh oh!

Copilot AI commented Jan 16, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bartgol left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

bartgol Jan 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mahf708 commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI commented Jan 15, 2026 •

edited by mahf708

Loading

bartgol Jan 19, 2026 •

edited

Loading

mahf708 commented Jan 17, 2026 •

edited

Loading