Distribution fixes #788

Open: 10 commits to be merged into base: main

hkershaw-brown (Member) commented Dec 26, 2024

Description:

Fixes for distributions that have been hanging out on https://github.com/NCAR/DART/tree/beta_distribution_fix

  • beta distribution now supports only the standard beta (a previous generalization was only partially implemented and caused tests to fail; see bug: beta distribution not correct (failing tests) #717)
  • gamma distribution now supports only the standard gamma (a previous generalization was only partially implemented)
  • normal distribution now sets all distribution_params_type values (the unset values caused no known problems, but could cause problems in the future)

Forcing the gamma and beta qceff options (bounds) in qceff table to be:

  • GAMMA_DISTRIBUTION (lower bound at 0)
  • BETA_DISTRIBUTION (bound between 0 and 1)

test in developer_tests/qceff/test_force_bounds
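
A minimal Python sketch of the forcing described above, not DART's Fortran implementation: whatever bounds the QCEFF table supplies for GAMMA_DISTRIBUTION or BETA_DISTRIBUTION are overridden with the fixed values, while other distributions keep their table values. The names Bounds, FORCED_BOUNDS, and force_bounds are hypothetical, and MISSING is a stand-in for missing_r8 whose numeric value here is an assumption.

```python
from dataclasses import dataclass

# Stand-in for DART's missing_r8 marker (value assumed for illustration).
MISSING = -888888.0


@dataclass(frozen=True)
class Bounds:
    bounded_below: bool
    bounded_above: bool
    lower_bound: float
    upper_bound: float


# Fixed bounds for the distributions whose table values are ignored.
FORCED_BOUNDS = {
    # Standard gamma: bounded below at 0, unbounded above.
    "GAMMA_DISTRIBUTION": Bounds(True, False, 0.0, MISSING),
    # Standard beta: bounded between 0 and 1.
    "BETA_DISTRIBUTION": Bounds(True, True, 0.0, 1.0),
}


def force_bounds(dist_name: str, table_bounds: Bounds) -> Bounds:
    """Return the forced bounds for gamma and beta, else the table bounds."""
    return FORCED_BOUNDS.get(dist_name, table_bounds)


if __name__ == "__main__":
    from_table = Bounds(True, True, 5.0, 10.0)
    print(force_bounds("GAMMA_DISTRIBUTION", from_table))
    print(force_bounds("NORMAL_DISTRIBUTION", from_table))
```

Keeping the forced values in one lookup table means a test such as test_force_bounds only has to compare the result against that table.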

I can split this into 3 pull requests if that is easier to review.

Fixes issue

fixes #717 only supporting standard beta
fixes #786 only supporting standard gamma - note see issue, still have upper and lower bound
fixes #787 see notes - initialize distribution_param_type to UNSET?

There was an E_MSG about failing to converge. I switched it to E_ALLMSG, since the failure could occur on any task.

I think the documentation for the qceff available distributions probably needs updating to clarify that the GAMMA and BETA bounds are fixed (QCEFF table values are ignored). Edit: went ahead and changed the documentation.

DART/guide/qceff_probit.rst

Lines 126 to 135 in bddda57

Available distributions
------------------------
* NORMAL_DISTRIBUTION (default)
* BOUNDED_NORMAL_RH_DISTRIBUTION
* GAMMA_DISTRIBUTION
* BETA_DISTRIBUTION
* LOG_NORMAL_DISTRIBUTION
* UNIFORM_DISTRIBUTION
* KDE_DISTRIBUTION

Maybe this is sufficient, I dunno:

 * GAMMA_DISTRIBUTION (lower bound at 0)
 * BETA_DISTRIBUTION (bound between 0 and 1)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

Documentation changes needed?

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.

Tests

Please describe any tests you ran to verify your changes.
developer_tests/*_dist
developer_tests/qceff/test_force_bounds

Checklist for merging

  • Updated changelog entry
  • Documentation updated
  • Update conf.py

Checklist for release

  • Merge into main
  • Create release from the main branch with appropriate tag
  • Delete feature-branch

Testing Datasets

  • Dataset needed for testing available upon request
  • Dataset download instructions included
  • No dataset needed

Functions inv_beta_cdf_params and beta_cdf_params now include an error check to make sure that the
lower and upper bounds in the distribution_params_type have been set to 0 and 1, as other values are
not supported. HK note: it might be better to remove this and have pure functions with no side effects.

Subroutine set_beta_params_from_ens now takes the distribution_params_type as an intent out
argument and defines all six parameters correctly. It also sets the parameter type to
BETA_DISTRIBUTION.

fixes #717
…mma distributions.

jla commit message:
A comment now makes it clear that this module only supports the standard gamma
distribution that is bounded below by 0 and unbounded above. Subroutines gamma_cdf, inverse_gamma_cdf,
set_gamma_params_from_ens, and inv_gamma_first_guess no longer have bounded_below, bounded_above,
lower_bound, and upper_bound as arguments. inv_gamma_cdf now sets the bounded_below, bounded_above,
lower_bound and upper_bound parameters to the correct values for the gamma.

Functions inv_gamma_cdf_params and gamma_cdf_params now include an error check to make sure that the
lower and upper bounds in the distribution_params_type have been set to 0 and missing_r8 as other
values are not supported.

Subroutine set_gamma_params_from_ens now takes the distribution_params_type as an intent out
argument and defines all six parameters correctly. It also sets the parameter type to
GAMMA_DISTRIBUTION.

jla commit message:
Functions inv_std_normal_cdf and set_normal_params_from_ens now set the appropriate values for the
bounded_below, bounded_above, lower_bound and upper_bound components of the distribution_params_type.
The distribution_params_type was changed to intent out in set_normal_params_from_ens.

The magic number for the maximum delta in the inv_cdf root searching routine is now a named
parameter; its value of 1e-8 is unchanged. A comment notes that changing this parameter to
1e-9 allows all of Ian Groom’s KDE distribution tests to pass.
HK note: I assume this applies when inv_cdf from normal_distribution_mod is used rather than rootfinding mode
with KDE.
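
To illustrate how a maximum-delta threshold controls a root search for an inverse CDF, here is a generic Newton iteration for the standard normal in Python. It is a sketch under assumptions, not the inv_cdf routine from normal_distribution_mod: the names MAX_DELTA and inv_cdf_newton, the absolute (rather than relative) stopping test, and the first guess of 0 are all choices made for this example.

```python
import math

# Named constant instead of a magic number; 1e-8 matches the value
# quoted in the commit message.
MAX_DELTA = 1e-8


def std_normal_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def std_normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def inv_cdf_newton(q: float, max_delta: float = MAX_DELTA,
                   max_iter: int = 50) -> float:
    """Find x with std_normal_cdf(x) == q by Newton's method.

    Iteration stops once the size of the Newton step is at most max_delta.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("quantile must be strictly between 0 and 1")
    x = 0.0  # first guess: the median
    for _ in range(max_iter):
        delta = (std_normal_cdf(x) - q) / std_normal_pdf(x)
        x -= delta
        if abs(delta) <= max_delta:
            return x
    raise RuntimeError("inv_cdf_newton failed to converge")


if __name__ == "__main__":
    for q in (0.025, 0.5, 0.975):
        x = inv_cdf_newton(q)
        print(q, x, std_normal_cdf(x))
```

Tightening max_delta from 1e-8 to 1e-9 makes the loop take another step whenever the last step was between those two sizes, which is one way a smaller threshold can bring borderline test cases within tolerance.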

fixes #787
@hkershaw-brown hkershaw-brown added the QCEFF quantile conserving filters label Dec 31, 2024
Comment on lines +161 to +163
! The comment above is not consistent with the performance of these routines
! as validated by test_normal. There is no evidence that this is still
! required.

I am not following this comment.
The digits12 note ("'digits12' is reserved for real variables that MUST retain 64 bits of precision") applies when DART is built with r8=r4.

Are you running test_normal with r8=r4? Should all the distribution tests be run with r8=r4?

! real precision:
! TO RUN WITH REDUCED PRECISION REALS (and use correspondingly less memory)
! comment OUT the r8 definition below and use the second one:
integer, parameter :: r4 = SELECTED_REAL_KIND(6,30)
integer, parameter :: r8 = SELECTED_REAL_KIND(12) ! 8 byte reals
!integer, parameter :: r8 = r4 ! alias r8 to r4
! complex precision:
integer, parameter :: c4 = SELECTED_REAL_KIND(6,30)
integer, parameter :: c8 = SELECTED_REAL_KIND(12)
! 'digits12' is reserved for real variables that MUST retain 64 bits of
! precision. DO NOT CHANGE '12' to a smaller number. BAD BAD BAD things happen.
! This is a small subset of the variables. Changing this will ruin the ability
! to distinguish timesteps that are a few seconds apart, for instance.
integer, parameter :: digits12 = SELECTED_REAL_KIND(12)


[hkershaw:work](distribution-fixes) > ./test_normal_dist

 --------------------------------------
 Starting ... at YYYY MM DD HH MM SS = 
                 2025  1  6 10 43 42
 --------------------------------------

  set_nml_output Echo NML values to log file only
 Absolute value of differences should be less than 1e-15
 Matlab Comparison Tests: PASS    0.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000001E-10 for quantiles <   0.899999976    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000001E-10 for quantiles <   0.990000010    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000001E-10 for quantiles <   0.999000013    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000001E-10 for quantiles <   0.999899983    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000001E-10 for quantiles <   0.999989986    
 FAIL: Max inversion diff    17.0500488      > bound    9.99999972E-10 for quantiles <   0.999998987    
 FAIL: Max inversion diff    17.0500488      > bound    9.99999994E-09 for quantiles <   0.999999881    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000001E-07 for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000001E-07 for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    9.99999997E-07 for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    9.99999975E-06 for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    9.99999975E-05 for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000005E-03 for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    9.99999978E-03 for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound   0.100000001     for quantiles <    1.00000000    
 FAIL: Max inversion diff    17.0500488      > bound    1.00000000     for quantiles <    1.00000000    

 --------------------------------------
 Finished ... at YYYY MM DD HH MM SS = 
                 2025  1  6 10 43 43
 --------------------------------------
[hkershaw:work](distribution-fixes) > git diff          
diff --git a/assimilation_code/modules/utilities/types_mod.f90 b/assimilation_code/modules/utilities/types_mod.f90
index 8e4b376ee..ecf791cf7 100644
--- a/assimilation_code/modules/utilities/types_mod.f90
+++ b/assimilation_code/modules/utilities/types_mod.f90
@@ -81,8 +81,8 @@ integer, parameter :: i8 = SELECTED_INT_KIND(13)
 ! TO RUN WITH REDUCED PRECISION REALS (and use correspondingly less memory)
 ! comment OUT the r8 definition below and use the second one:
 integer, parameter :: r4 = SELECTED_REAL_KIND(6,30)
-integer, parameter :: r8 = SELECTED_REAL_KIND(12)   ! 8 byte reals
-!integer, parameter :: r8 = r4                      ! alias r8 to r4
+!integer, parameter :: r8 = SELECTED_REAL_KIND(12)   ! 8 byte reals
+integer, parameter :: r8 = r4                      ! alias r8 to r4
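
The printed values in the run above are consistent with single-precision rounding. The following Python sketch (standard library only; to_r4 is a hypothetical helper, not a DART routine) rounds 64-bit values to 32 bits to show that 0.9 and 1e-10 become the values seen in the log, and that a quantile within about 3e-8 of 1 becomes exactly 1.0. It does not attempt to explain the 17.05 inversion difference itself.

```python
import struct


def to_r4(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


if __name__ == "__main__":
    # Quantile limit and error bound as they would be stored with r8 = r4.
    print(to_r4(0.9))
    print(to_r4(1e-10))

    # A quantile this close to 1 cannot be represented in 32 bits:
    # it rounds to exactly 1.0.
    print(to_r4(1.0 - 1e-8) == 1.0)

    # Spacing between adjacent 32-bit floats just below 1.0.
    print(2.0 ** -24)
```

Since the spacing of 32-bit floats near 1 is about 6e-8, error bounds such as 1e-10 are below what a round trip through the CDF and its inverse can resolve when r8 is aliased to r4.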

qceff table bounds are ignored for these
issues #717 #786

Added developer test to check forced values are set correctly
Required changes to run_all.sh for qceff tests to deal with input.nml
@hkershaw-brown hkershaw-brown marked this pull request as ready for review January 7, 2025 20:17
@hkershaw-brown hkershaw-brown requested review from jlaucar and mjs2369 and removed request for mjs2369 and jlaucar January 9, 2025 19:03
Labels
QCEFF quantile conserving filters
1 participant