Default CI level back to 95%? 😱 #250
-
But some of the coolest papers use 89% HDI... ;-)
-
Of course, there is no reason against 95% (just as there is no reason against 89%). Computational stability is an argument from the Stan developers (and Kruschke) for using 90%, so where do we go now? Is 89 too narrow? Is 95 too imprecise? Is 90 OK (but if we break conventions, why not stick with 89)? I'm not dogmatic here and am open to any changes. One reason against 95% is, and I will cite (and translate) a German song text:
(not sure if this translation really works...?)
-
I thought some of the motivation was Kruschke's idea that for a 95% CI you need a lot of data, and so it should be no more than 90%?
-
"Computational stability" is indeed one of the strong arguments for lowering the percentage. But I'd be curious to see some figures showing the impact of the CI level on the results... does anyone know where to find that?
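For anyone who wants a quick look at the stability question, here is a rough base-R sketch (it is not taken from any of the sources mentioned in this thread). It assumes a standard normal stands in for the posterior and uses equal-tailed quantile intervals rather than HDIs; `upper_bound_sd` is a hypothetical helper name.

```r
# Sketch: Monte Carlo variability of an interval's upper limit at different CI levels.
# Assumptions: a standard normal stands in for the posterior, and equal-tailed
# (quantile) intervals are used instead of HDIs.
set.seed(123)

upper_bound_sd <- function(ci, n_draws = 1000, n_reps = 500) {
  upper <- replicate(n_reps, {
    draws <- rnorm(n_draws)                              # one simulated set of draws
    unname(quantile(draws, probs = 1 - (1 - ci) / 2))    # upper limit of the interval
  })
  sd(upper)                                              # spread of that limit across repetitions
}

stability <- sapply(c(0.89, 0.90, 0.95), upper_bound_sd)
names(stability) <- c("89%", "90%", "95%")
print(round(stability, 4))
```

Under these assumptions, the limit of the wider interval should fluctuate more from run to run, because it sits further out in the tail where draws are sparse.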
-
What about the books or papers this claim stems from?
-
I don't understand everything, but I think this post at least contributes a bit to your question... https://betanalpha.github.io/assets/case_studies/markov_chain_monte_carlo.html
-
I think I have become pro 95 tho 😅 but yeah I guess we've gone too far with 89 to make such a breaking change
-
Let's wait for the 2nd edition of Statistical Rethinking (should be in my office right now...) and if there's still the 89%, we will close this, else we'll make a breaking change to 95.
-
Ok, in the current (2nd) edition of Statistical Rethinking, it seems like McElreath still advises against the 95% CI (just to avoid falling into the trap of thinking of NHST when you hear "95%"), but he doesn't seem to say 89% is the number. So... We could a) stick to 89% should we run
-
and the winner is...

```r
table(sample(c("a", "b", "c"), 1000, replace = TRUE))
#>
#>   a   b   c
#> 343 331 326
```

Created on 2020-05-11 by the reprex package (v0.3.0)
-
lol I also prefer a or b (b a bit more).
-
I'm open to either a) or b)
-
the time that has passed has strongly reinforced my preference for c 😅 it seems like we are in a pickle... The argument of not doing something just because it reminds us of something not good is a bit weak imho. And au contraire, you could argue that 95% is great because it allows for comparison with previous results and, you know, reproducibility and all that kind of jazz. But aside from that, the real game changer for me is the correspondence between the 95% CI and the SD (or the SE - right @strengejacke :p)...
-
If being maintainer doubles my vote weight, we're still stuck between 2c vs. 2b 😁 also, I do remember you were more a brms than an rstanarm guy @strengejacke 🤔
-
What is the correspondence between 95% CI and SD? (And a reminder: the SD of the posterior is not the SE 😋)
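One possible reading of that correspondence, assuming an approximately normal posterior (an assumption, not something stated in this thread): the 95% equal-tailed interval is roughly the posterior mean ± 2 SD, whereas the other levels give less memorable multipliers. `sd_multiplier` is a hypothetical helper name.

```r
# Sketch, assuming a normal posterior: half-width of an equal-tailed interval
# expressed in units of the posterior SD.
sd_multiplier <- function(ci) qnorm(1 - (1 - ci) / 2)

multipliers <- sapply(c(0.89, 0.90, 0.95), sd_multiplier)
names(multipliers) <- c("89%", "90%", "95%")
print(round(multipliers, 2))
```

For skewed or otherwise non-normal posteriors this shortcut no longer holds, and it says nothing about HDIs specifically.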
-
😄 we should default
-
Maybe we can start by adding a startup message saying that it may change in a future version?
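A minimal sketch of what such a notice could look like. `.onAttach()` and `packageStartupMessage()` are standard R mechanisms, but the wording of the message is purely illustrative and not the text the package actually ships.

```r
# Sketch of an on-attach notice; this would live in the package's R/ directory
# (conventionally in a file such as zzz.R). The message text is hypothetical.
.onAttach <- function(libname, pkgname) {
  packageStartupMessage(
    "Note: the default CI width (currently `ci = 0.89`) might change in a future version. ",
    "To keep the current behaviour, set the `ci` argument explicitly."
  )
}
```

Because the notice is raised as a startup message, users who find it noisy can silence it by wrapping the `library()` call in `suppressPackageStartupMessages()`.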
-
👆 definitely this!
-
we may tag Richard McElreath, just to get another unbiased opinion here...? 😇
-
tho I reckon it might be a slightly biased opinion 😅
-
I think we should go ahead with this for the next release to finally remove this ugly on-attach message 😁
-
For the record, see here for some data related to the stability issue
-
Dear All, I am a user of your R packages. Thanks for all your contributions, they are super helpful!!! I would argue for keeping the 89% default CI level for educational purposes. To explain, I have a scientific background (plant pathology, agriculture), but limited formal education in statistics. I had no idea that R existed until about 5 years ago, and had never heard of Bayesian statistics until a bit over a year ago. Now, I use rstanarm together with bayestestR and modelbased as a replacement for the agricultural-industry standard ANOVA + LSD tests. If you had not had 89% as the default, I would never have researched the topic and realized that 95% is not set in stone.
-
There is still no consensus on which CI level to use. I think the latest update of bayestestR reverted the 89% CI and now defaults to 95% again. The dark side has won, but the return of the Jedi is possibly happening soon. ;-) At least, there's a new hope, namely the parameters package, which still uses the 89% CI for Bayesian models. But I also think this debate is still ongoing, and not closed.
-
Since they are used to summarise the posterior distribution, how about we report all of them: 89%, 90%, and 95% credible intervals? This can be used for comparison purposes and also delivers the message that the cut-off is arbitrary?🤔
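A rough base-R sketch of that idea. It assumes simulated draws stand in for a real posterior and uses equal-tailed quantile intervals rather than HDIs; the variable names are illustrative.

```r
# Sketch: report several equal-tailed credible intervals for the same draws.
# Assumption: `posterior` is simulated here; in practice it would hold MCMC draws.
set.seed(42)
posterior <- rnorm(4000, mean = 0.5, sd = 0.2)

ci_levels <- c(0.89, 0.90, 0.95)
intervals <- t(sapply(ci_levels, function(ci) {
  tail_prob <- (1 - ci) / 2                                    # probability left in each tail
  unname(quantile(posterior, probs = c(tail_prob, 1 - tail_prob)))
}))
dimnames(intervals) <- list(c("89%", "90%", "95%"), c("lower", "upper"))
print(round(intervals, 3))
```

Seeing the three rows side by side makes it plain how little separates 89% from 90%, and how much wider 95% is.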
-
The 89% credibility interval has this special property: knowing whether the true value is within or outside this interval amounts to almost exactly 0.5 shannons of uncertainty; 0.499916 Sh to be more exact. So that's about "half a binary uncertainty", so to speak, on the log-scale of Shannon information. Or almost exactly half the uncertainty of a 50% credibility interval. The 90% credibility interval amounts to 0.468996 Sh, and the 88% interval to 0.529361 Sh; both a bit farther away from 0.5. The interval corresponding to 0.5 Sh is 88.997%. Finally, the 95% interval corresponds to 0.286397 Sh. Just a curiosity, not an argument in favour of 89% :)
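These figures come from the binary entropy function, $H(p) = -p \log_2 p - (1-p) \log_2 (1-p)$, evaluated at the interval's probability mass. A short base-R sketch for checking them (the function and variable names are illustrative):

```r
# Binary entropy, in shannons (bits), of the event "the true value lies inside
# the interval" when that event has probability `p`.
binary_entropy <- function(p) -p * log2(p) - (1 - p) * log2(1 - p)

ci_levels <- c(0.50, 0.88, 0.89, 0.90, 0.95)
print(round(binary_entropy(ci_levels), 6))

# Interval mass whose entropy is exactly 0.5 Sh (searching above 50%).
half_bit <- uniroot(function(p) binary_entropy(p) - 0.5,
                    interval = c(0.5, 0.999), tol = 1e-10)$root
print(round(100 * half_bit, 3))
```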
-
TLDR for new readers
Currently, most of bayestestR's functions have `ci=0.89` by default, yielding 89% CIs. We are thinking of changing that (maybe to 95%). To avoid any surprises or breaks when you update the package, please set the `ci` argument explicitly. For example, to retain the current behaviour, just add `ci=0.89` to most of bayestestR's functions (or in functions related to it, such as `parameters::model_parameters()`).
Discussion
Might be throwing a stone in the water here, but it's good to rechallenge our certainties :)
When we switched from 95% to 89%, the main reason was that there was no objective reason to keep 95%, as it is a purely arbitrary value. And 89%, suggested by McElreath, at least has going for it that it is a prime number (not that it changes anything, but it might be nice [for some strange people such as @strengejacke 😁]). The real underlying motivation was also to shake up the common (and mindless) procedures, to make people actually think about what they are doing.
Now, I think I have some new arguments to throw in the pot.
All opinions are appreciated