Multiple tests correction for Bayesian models #249
Replies: 6 comments 2 replies
-
This is a very interesting question! I think we need to distinguish between:
- post-hoc tests
- multiple tests/comparisons
- conditional tests
(In practice these are overlapping categories, and in the frequentist framework the solution to all of them is similar.)

Bayes factors

BFs are the evidence ratio between two models, so in and of themselves they do not require any multiple comparisons correction: they are "just" the evidence, nothing more. The conclusions drawn from BFs, however, are prone to such problems (but see next section). The method of dealing with this problem is also by adjusting the priors, but here the adjusted priors are the prior odds. By adjusting the prior odds, the conclusions (usually equated with the posterior odds) are less affected by the BF itself, as they are weighted by both:

Posterior odds = BF * Prior odds

Tim de Jong's work covers several methods of doing such corrections on the prior odds in cases of dependent post-hoc tests (see above). In practice, I haven't seen anyone report posterior odds in a paper (maybe once), so it seems like there is little call for prior-odds/posterior-odds adjustments. Note that some of Tim's methods are available by default in JASP.

Do we care?

There are those who say that this is all hogwash: caring about error rates is soooo frequentist 👻! Also, even if a mistake was made, since in the Bayesian framework posteriors can become the new priors and be themselves updated, such mistakes will be corrected once more data are introduced (this is true for any Bayesian index, really), and applying any corrections only limits our ability to learn (this is analogous, in a sense, to a frequentist saying that p-value corrections limit the rate of true discovery). I tend to agree with this view, and think that post-hoc or multiple comparisons (see above) are not an issue in Bayesian inference.
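To make the posterior odds formula concrete, here is a minimal sketch with made-up Bayes factors (the divide-by-k shrinking of the prior odds is only a crude illustration, not one of the specific methods mentioned above):

# Hypothetical Bayes factors (BF10) for three post-hoc comparisons
bf <- c(4.2, 1.3, 9.8)

prior_odds <- 1            # unadjusted prior odds, P(H1)/P(H0)
k <- length(bf)

# A crude, Bonferroni-like shrinking of the prior odds (illustrative only)
adjusted_prior_odds <- prior_odds / k

# Posterior odds = BF * prior odds
posterior_odds <- bf * adjusted_prior_odds
posterior_odds
#> [1] 1.4000000 0.4333333 3.2666667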
It seems rather odd that priors would change to account for your analysis plan, no? Does your prior knowledge about the parameter space change because you're going to test two things instead of one?
On the Bayesian side, is there any work on multiple comparisons for, say, the ROPE?
-
Very interesting. Indeed, the initial dissociation between post-hoc tests, multiple tests, and conditional tests is important.
While I surely agree with this on a theoretical level, in practice people might still want to look into such adjustment procedures to control for false discoveries, especially in genetics or neuroimaging (fMRI), where you have tons of variables (voxels, genes, ...). Let's take the famous fish fMRI poster showing significant activations in a cluster of voxels in a dead fish: the same issue would probably have arisen with Bayesian indices, right? On the bright side, aside from adjusting a posteriori for false discoveries (such as correcting p-values), which wouldn't really make sense with Bayesian indices (aside, maybe, from increasing whatever thresholds one uses to consider and discuss the results as significant), one can also make a priori adjustments via the priors.
I don't know of any such work, but I don't really see how it would make sense 😕 But again, I've seen many people say "multiple testing is not a problem with Bayes". While that makes theoretical sense, it just seems a bit too magical. It's like saying "although we have numbers that are related to the p-value, we don't need to adjust them, because of reasons". So, since we support Bayesian multiple comparisons (via emmeans in estimate) and Bayesian correlation matrices (in correlation), we should at least think about this and potentially provide clear and informative guidance to users who will have this question :)
-
I think the basic difference is that in frequentist stats, decisions are based on a criterion that reflects the error rate! If the probability of data at least as extreme under H0 (i.e., the p-value) is smaller than your set error rate (alpha), then you consider the null rejected. Some more reading:
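As a quick numeric illustration of that error-rate logic (with an assumed alpha and number of tests): across several independent tests, the chance of at least one false positive grows quickly, which is what the frequentist corrections target.

alpha <- 0.05
k <- 10                  # number of independent tests
1 - (1 - alpha)^k        # family-wise probability of at least one false positive
#> [1] 0.4012631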
But... What about pd?

# A possible helper: convert the pd to a (one- or two-sided) p-value,
# adjust it with stats::p.adjust(), and convert back to the pd scale.
pd.adjust <- function(pd, method = p.adjust.methods, n = length(pd), one_sided = FALSE) {
  if (one_sided) {
    p <- 1 - pd
  } else {
    p <- 2 * (1 - pd)
  }
  p_ <- p.adjust(p, method = method, n = n)
  if (one_sided) {
    pd_ <- 1 - p_
  } else {
    pd_ <- 1 - (p_ / 2)
  }
  pd_
}

pd.adjust(c(0.87, 0.97), method = "fdr")
#> [1] 0.87 0.94

(What about p-sig or pMAP? Who knows.....)
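The same helper with a different method, just to see the effect (assumed pd values):

pd.adjust(c(0.999, 0.975, 0.87), method = "bonferroni")
#> [1] 0.997 0.925 0.610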
-
http://eointravers.com/post/hypothesis/#bayesian-multiple-comparisons
-
I have been computing contrasts from Bayesian models, looking at the pd and pMAP, and wanted to know whether a correction paradigm is implemented in the package that I could use when calling the describe_posterior function, or whether I should apply what you wrote above to the list of values afterwards?
-
Thank you for the documents. Since my data has a hierarchical structure and I'm already fitting a mixed model with a random effect, does that mean that I'm covered?
-
Working on correlation made this "issue" apparent, so I wanted to brainstorm.
In the frequentist framework, standard practice is to adjust the significance index, the p-value, for multiple tests/comparisons. This is particularly common when computing correlation matrices or testing contrasts.
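For reference, the frequentist adjustment itself is a one-liner (the p-values here are made up); if I recall correctly, this is also what the p_adjust argument in correlation() applies by default (Holm):

p <- c(0.001, 0.02, 0.04, 0.30)   # hypothetical p-values from a correlation matrix
p.adjust(p, method = "holm")
#> [1] 0.004 0.060 0.080 0.300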
As more and more people switch to the Bayesian framework, this issue will come up again, especially since some of the Bayesian indices (e.g., the pd) are, in the end, practically close to the p-value. This raises the question: if we must adjust the p-value, why not the pd?
Bayesians can respond that there is no need for post-hoc corrections, because we have priors. Hence, one can eventually just specify more precise priors. For correlations in correlation, since it relies on BayesFactor::correlationBF, this can be done via the rscale argument (see details; there is a small sketch below). But what about the case of contrast analysis? To my knowledge, the BFs, or the other indices, that we compute based on emmeans are based on the model's priors. Hence, any "adjustment" should be done prior to model fitting. Or is it something that we can do in emmeans (and thus an issue that belongs to estimate)? If not, we could think about an implementation for #238. For instance, if one is adding some factor with a lot of levels as a predictor with the idea of doing full pairwise contrast analyses, one might want to specify a more precise prior for it. Could we suggest an appropriate scale with something like informative_priors(model, multiple_correction = TRUE)? Or does this whole thing make no sense at all? What are your thoughts?
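To make the rscale idea concrete, a minimal sketch with simulated data (the narrower scale is only an illustrative value, not a recommended default):

library(BayesFactor)

set.seed(123)
x <- rnorm(50)
y <- 0.3 * x + rnorm(50)

# Default prior scale on the correlation
correlationBF(x, y, rscale = "medium")

# A narrower (more skeptical) prior scale, as a crude a priori "adjustment"
# when many correlations are planned
correlationBF(x, y, rscale = 1/10)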