-
Notifications
You must be signed in to change notification settings - Fork 145
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
qceff dartlab #740
base: main
Are you sure you want to change the base?
qceff dartlab #740
Conversation
THis involved adding optional arguments to return information about the weights in the prior and posterior bins to obs_increment_rhf.m
…ne the pdf for a bounded normal rank histogram given the ensemble members and information about the tail normals.
distribution plot is updated whenever the buttons selecte a new distribution.
…instead of using hard coded bounds from oned_ensemble's axes.
…se where bnrh is bounded on the left at 0. oned_ensemble is believed to be working for all cases with the improved plotting. bounded_oned_ensemble is believed to be functioning correctly except for the case of applying inflation with the gamma or BNRH distributions.
in which the inflated prior mean value was not being correctly computed. This was also changed in oned_ensemble.m where it did not produce a wrong answer (since inflated mean is always the same as prior mean) but could have led to a misunderstanding when looking at the code.
These are directly modified versions of the fortran routines of the same names. They include slightly less error checking at the moment and should be further polished for use in actual matlab assimilations as opposed to graphical demonstrations. These have been tested for direct inversion and appear to be bitwise for the small cases tested.
…unction weighted_norm_inv.m out of obs_increment_rhf.m since it is needed for the recently added cdf and inverse cdf of the bnrh. All inflation variants now work in bounded_oned_ensemble.
exisiting guis can use it without modification.
an additional plot box for the transformed ensemble and new button for the unobserved variable transform.
that observed variable is nonnegative. Put in appropriate error handling if a negative value is clicked. Fixed the error messages for the observation and obs error sd specification.
… and updates in the ppi joint plot.
…ies. The PPI space plots are now appropriately cleared in the same fashion as the standard state space.
…rved variable. Added PPI transforms for the observed variable. Options limited to normal and unbounded RH since the observed variable is unbounded in this demonstration.
distribution. Made correct use of SCALAR function for transforming observed posterior with BNRH.
…nd the existing oned_ensemble.m and twod_ensemble.m
… from the update_ens callback. Updates the observation at each cycle and plots the updated obs, prior and posterior.
…ation of the continuous mean and sd.
and associated call backs to 'Cycle DA'
the dynamically changeable growth rate.
first become meaninful.
the continuous distributions with solid curves. Added approximate fitted continuous distributions for the EnKF for comparison.
so that just the 1D KF can be seen.
the continuous and ensemble pdfs.
messaging accordingly.
so that all text scales up or down ('normalized') when the plot size is changed.
10 steps that were found with matlab 2024a. There are still some problems if one runs a large screen and then reduces it to a small screen while cycling.
Removed unneeded output from private routine.
Matlab guis. Sections 1, 2, 4 and 5 contain revised versions of the same material as in the previous versions. Section 4 is new and describes QCEFF filters. The EnKF is no longer described although the GUIs still support it.
I can't review this because it is my pull request. Changes needed: guide/DART_LAB/DART_LAB.rst needs to have the new section 4. & probably the new tutorial exercises. |
still waiting for slides, closing for now. |
@jlaucar read this |
…tandards. Other minor corrections have been made. The DART_LAB.rst has been updated to include the new section titles and to list all of the new matlab main scripts.
I didn't get too far before stumbling on something that I need to resolve before continuing. Section 1, slide 4 ("1:4") brings up the issue that I'm often confused in discussions of conditional probability because people seem to read the notation P(A|S) in a variety of ways, some of which seem contradictory. It doesn't make sense to me to say "the probability of this space given that a specific event happened" I'm guessing that all of the math is correct, but the descriptions confuse me (and maybe others). |
Kevin,
It's fine for you to dig in just a little on this topic, but... Some of
the issues are related to the difference between discrete and continuous
distributions for conditional probabilities and Bayes. There is a note in
the slides that indicates that professional statisticians would be
concerned about how we discuss things in a discrete way while applying to
continuous. Doing this formally would drag into hours of basic probs and
stats, so all we really need is to get to Bayes in some format that makes
some sense. The formal distinction between an observational error
distribution and a likelihood is challenging to even the best DA folks. For
instance, Craig Bishop totally gums this up in his gamma/inverse gamma
papers. There is no generally accepted notation that easily handles both
views.
Feel free to suggest some alternative wording that makes more sense and we
can discuss. Jeff
…On Fri, Dec 13, 2024 at 7:49 AM kdraeder ***@***.***> wrote:
I didn't get too far before stumbling on something that I need to resolve
before continuing.
Section 1, slide 4 ("1:4") brings up the issue that I'm often confused in
discussions of conditional probability because people seem to read the
notation P(A|S) in a variety of ways, some of which seem contradictory.
I was taught to read that conditional probability as
"The probability of the measurement being T_O, given that the true temp is
T"
This is consistent with Freund's Modern Elementary Statistics; a
conditional probability P(A|S) is the
"probability of event A relative to the sample space S", aka the
"probability of A given S".
So the thing before the | is a specific event or value.
Its probability depends on the space in which it happens,
or "all of the values which it could theoretically have", which is listed
after the |.
It doesn't make sense to me to say "the probability of this space given
that a specific event happened"
I'm guessing that all of the math is correct, but the descriptions confuse
me (and maybe others).
If I'm not wrong I'll suggest alternatives.
—
Reply to this email directly, view it on GitHub
<#740 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ANDHUIRX4OVO4T2W353UWJT2FLXWPAVCNFSM6AAAAABO3N54YSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKNBRGYZDEOBSGY>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
I've worked through Section 1 and have a list of other suggestions and questions, I just want to dig into it enough to be able to interpret the figures In a description of "likelihood" I dug up (Wikipedia) it says that Later, in 1:20, things are labeled oppositely from this interpretation.
|
I'll stop trying to make all the notation consistent with my limited understanding. 1:2 It would fit the picture better to use a red line at 1 C to represent the ob, 1:3 I think the picture is a little misleading. I'd omit the star (or line), 1:4 The likelihood equation would be easier for me to read if it were 1:9 The notation here implies to me that this distribution is some kind of inversion of the likelihood. 1:12 Is the figure (still) showing T_O (now T_O,1) and its likelihood (or error) distribution? 1:18 Do students ever ask why independent random errors is sufficient to Why is denominator unaffected by this assumption? 1:19 Since T = T_True, the equation would be clearer to me with T replaced by T_True. 1;22 the T_O,2 likelihood looks identical to the T_O in previous slides. 1:24 Is this a definition of the normalization we want to use, 1:29 It looks like T_O is being redefined here to describe T_O,2, 1:44 Do you care about spaces in equations in 2.? sigma has more than T. 1:45 A nitpick; the subscripts look like zeros. Should they be capital o? 1:50 Notational consistency; the T subscripts in earlier equations have |
I'm suggesting a lot of notation changes. 2:2 "additional variable" ->? "unobserved additional variable" 2:32 first slide where 'h' appears, but the topic has moved beyond it. 2:38 Mixed notation convention here; some variables have subscript k to denote a time, 2:39 Obs. error covariance is used as the standard deviation (variance?) of the normal. 2:45 x in likelihood doesn't need time subscript? 2:47 "All zeros except a single 1 in the last number of obs columns." 2:48 This makes me think that the description (and in 2:47) should be 2:49 The x argument in the H is the extended state here (^), 2:50 Might want to highlight "Extended" to point out that the picture is not redundant. 2:51 (and 2:52) Does this picture want an h (or H)? 2:54 Would it be useful to have a picture (or exercize) showing some stage(s) 2:55 If we want to go the extra mile, this slide could have a different distribution 2:56 "next time with observations" ->? "next time having observations" |
I didn't test scripts and exercizes in this section as of 2024-12-20 3:4 "1/5 chance that this is in any given bin." -> 3:5 Nitpick; the rank histogram Count axis would be more accurate and less cluttered 3:9 "Rank histograms for good ensembles" ->? "Rank histograms for good assimilations" 3:10 Then "A biased ensemble" -> "Biased ensembles" 3:11 And so forth. 3:13 Locating the truth in a bin is OK for an idealized experiment, 3:21 In the Note; "inflation" -> "variance inflation" 3:34 "Weakly correlated observations" ->? "weak correlations" 3:39 "no weight is being given to state variables on the opposite side of the domain from an observation" |
4:6 This equation isn't used in the rest of the slides, so maybe it's not worth making the notation consistent. 4:15 1. 'an ensemble' -> 'a prior ensemble' would motivate the F^p notation in 2. 4:22 Nitpick; 'section 1' -> 'Section 1' 4:23 Is 'conjugate pairs' just an interesting aside for people who recognize it? 4:24-34 I didn't (can't) confirm the content, but I didn't see anything that looks like a typo 4:42 I don't know the U(0, 1) abbreviation. 4:49 I'm guessing that the numbers closest to the green stars are the number of members 4:53 'Correct distribution contours' -> 'Contours for correct distribution' 4:56 I don't follow what 'This is the probability integral transform' refers to. 4:57 Up to here quantiles have been discreet, taken from members representing a distribution. 4:59 Do you want a legend in the second figure? 4:64 What do the red arrows point to? |
Again, I'm suggesting notation changes. Tell me if I should implement any of them A motivation slide could be helpful; why and when is inflation needed? 5.2 'adaptive error tolerant filter' -> 'adaptive error-tolerant filter' throughout.
5.3 'expected separation' causes additional confusion here; to me, increasing 5.4 > The second comment (and prior usage) makes me think that 5.5 'Use Bayesian"... looks like magic. I assume it doesn't matter at this point. 5.6 If y_k is sufficient notation, how about simplifying Y_{t_k} to Y_k? (throughout) 5.7 > Is one intention of the top figure to show that variance "inflation" can reduce the variance? 5.13 'One option is to use Gaussian prior for lambda.' 5.15 This version would be easier for me to follow. 5.16 Labeling the x-axis would emphasize that the picture is not showing 5.18 3. would be clearer to me (and still correct?) as 5.19 Nitpick; spacing of p() is messed up. 5.20 It could be helpful to make the connection between the 'inflation mean' and 'value'; 5.26 2. Shouldn't there be the step of applying the inflation between b. and c.? |
Kevin,
Looks like you're done with all 5 sections. I'll be working on
incorporating your comments and suggestions moving forward and will write a
response to those that are not obvious or that are not implemented. Thanks.
…On Mon, Dec 23, 2024 at 8:52 AM kdraeder ***@***.***> wrote:
Again, I'm suggesting notation changes. Tell me if I should implement any
of them
and point me to the source files.
A motivation slide could be helpful; why and when is inflation needed?
5.2 'adaptive error tolerant filter' -> 'adaptive error-tolerant filter'
throughout.
This is what I think is shown here, but it took me a while to
(mis)understand it,
based on the current content. If the background assumptions are in the
verbal comments
near the beginning, then I probably would have followed it fine.
But if this should be a free-standing slide, then I think the figure
should be smaller and there should be more description.
I think that 'expected separation' brings up questions of what we expect
and why,
which don't need to be asked or answered here. If that term needs to be
kept,
then in the figure '4.714 SDs' should be changed to '4.7 ESs' (and in
following slides).
The text could be
1. Tiny probabilities at overlap; unlikely they describe the same thing.
2. The 'consistency threshhold' of 2 distributions is [sqrt(sigmi_p^2 + sigma_o^2)] = ~0.85
3. Actual separation is 4.7 times larger; one or both are poor representations of the truth
Model error? Obs error? Random chance?
5.3 'expected separation' causes additional confusion here; to me,
increasing
the expected separation means "I expect them to appear more separated",
while inflation makes them appear *less* separated. So
3. ->? 'Inflating increases the consistency threshhold (combined
uncertainty)
and makes them appear more consistent.'
5.4 > The second comment (and prior usage) makes me think that
'prior mean y to obs is' should be 'prior mean to obs y is'
but 'y' isn't used, so it could be omitted. How about
'Distance from prior mean to obs distribution; D = N(0, ...) == N(0,theta)'
> Then there's y_O or y_0. It's ambiguous and later slides make me think
that it's y_0 (zero) because of the y_k (time index) variables.
Can we omit the subscript here?
5.5 'Use Bayesian"... looks like magic. I assume it doesn't matter at this
point.
(Again) does the 'lambda,t_k' notation mean lambda at time t_k
and not 'the probability of t_k and lambda, given all the previous obs'?
5.6 If y_k is sufficient notation, how about simplifying Y_{t_k} to Y_k?
(throughout)
And use lambda_k instead of lambda,t_k?
5.7 > Is one intention of the top figure to show that variance "inflation"
can reduce the variance?
> It's not clear if this slide shows combining distributions or specific
probabilities.
The 2 stars in the second figure make me think "specific".
But plotting them on a probability graph is implies that the posterior has
0 probability,
which is probably not what's intended. And the prior and likelihood are
both 0.75,
which also doesn't make sense.
Ah! Reading ahead to 5.8 clarifies it somewhat. How about switching the
lambda values
in 5.8 and 5.7, so that it's easier to interpret what's happening?
And then the first "inflation" that readers see is actually inflating the
prior.
5.13 'One option is to use Gaussian prior for lambda.'
Wasn't this already assumed in 5.10 (5.6)? Should this be 'posterior'?
5.15 This version would be easier for me to follow.
1. Evaluate the numerator at mean (p_m == p(lambda_u^bar))
and second point (p_sigma == p(lambda_u^bar + sigma_{lambda,p})
2. Find sigma_{lambda,u}^2 so N(...) goes through p_m and p_sigma.
3. Compute as sigma.... where r = p_sigma/p_m
5.16 Labeling the x-axis would emphasize that the picture is not showing
the updated inflation distribution.
5.18 3. would be clearer to me (and still correct?) as
"Use inflated prior to compute posterior ensemble of estimated y."
5.19 Nitpick; spacing of p() is messed up.
5.20 It could be helpful to make the connection between the 'inflation
mean' and 'value';
'Minimum value of inflation (mean), often'...
5.26 2. Shouldn't there be the step of applying the inflation between b.
and c.?
—
Reply to this email directly, view it on GitHub
<#740 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ANDHUIXWGKEFM6CLQZZ36TT2HAWVFAVCNFSM6AAAAABO3N54YSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDKNJZHEZTSOJVGM>
.
You are receiving this because you modified the open/close state.Message
ID: ***@***.***>
|
Description:
DARTLAB qceff from Jceff
Fixes issue
Fixes #739
Note the slides are not in this pull request (yet!)
Types of changes
Documentation changes needed?
Tests
Run by a group of workshop attendees at AGOS 2024
Checklist for merging
Checklist for release
Testing Datasets