-
-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Model-averaged estimates/intervals/distributions #768
Comments
Do you see us being able to "extract" info from objects produced by these two packages (probably easy), or do you imagine us writing "wrappers" around those two packages (more maintenance)? |
For both, they take lists of models and return results back, so wrapping isn't likely to be much effort. We already wrap around loo for a much of functions in performance, so we can probably create some unified code for easystats generally |
For Bayes, I wrote this function that does this exactly a while back (years???) https://easystats.github.io/bayestestR/reference/weighted_posteriors.html Currently used BFs for weights (personally, that's what makes the most sense to me as we are mixing the posteriors from the same data, but different priors), but should be expandable (: |
There's a Vehtari, Gelman, et al paper discussing that the "stacking" method used in {loo} by default apparently has some important advantages over traditional Bayesian model averaging |
what would be the model-average estimate? Here's an example for the intervals, I think it's easy to re-implement library(insight)
library(MATA)
set.seed(0)
n <- 20 # 'n' is assumed to be even
x1 <- c(rep(0, n / 2), rep(1, n / 2)) # two groups: x1=0, and x1=1
x2 <- rnorm(n, mean = 10, sd = 3)
y <- rnorm(n, mean = 3 * x1 + 0.1 * x2) # data generation
x1 <- factor(x1)
m1 <- glm(y ~ x1) # using 'glm' provides AIC values.
m2 <- glm(y ~ x1 + x2) # using 'lm' doesn't.
aic <- c(m1$aic, m2$aic)
delta.aic <- aic - min(aic)
model.weights <- exp(-0.5 * delta.aic) / sum(exp(-0.5 * delta.aic))
# see also
# performance::compare_performance(m1, m2)
residual.dfs <- c(insight::get_df(m1), insight::get_df(m2))
g1 <- insight::get_datagrid(m1)
g2 <- insight::get_datagrid(m2)
nd1 <- as.data.frame(lapply(g1, function(i) {
if (is.factor(i)) {
as.factor(levels(i)[1])
} else {
unique(i)[1]
}
}))
nd2 <- as.data.frame(lapply(g2, function(i) {
if (is.factor(i)) {
as.factor(levels(i)[1])
} else {
unique(i)[1]
}
}))
p1 <- get_predicted(m1, data = nd1, ci = .95)
p2 <- get_predicted(m2, data = nd2, , ci = .95)
theta.hats <- c(p1, p2)
se.theta.hats <- c(attributes(p1)$ci_data$SE, attributes(p2)$ci_data$SE)
# 95% MATA-Wald confidence interval for theta:
mata.wald(theta.hats, se.theta.hats, model.weights,
mata.t = TRUE,
residual.dfs = residual.dfs
)
#> [1] -0.3756852 1.5499646 Created on 2022-09-05 with reprex v2.0.2 |
I think it should be |
I have used {glmulti} for multimodel inference. I believe it implements frequentist model average by default (and by design???). It explores through millions of possible model formula specifications and retains the top ones of an arbitrary size, optionally using a genetic algorithm, which I have not seen provided in other packages. It also provides model-averaged model summary statistics including coef, SEs, CIs. MuMIn Package was mentioned along with {glmulti} in https://www.metafor-project.org/doku.php/tips:model_selection_with_glmulti_and_mumin and https://www.r-bloggers.com/2013/02/model-selection-and-multi-model-inference, but I have not tried it.
@vincentarelbundock Would be nice to have model-averaged marginal effects. It appears very few have done so, except this one https://www.ajordannafa.com/blog/2022/being-less-wrong where {marginaleffects} was useful. However, I am not sure how to implement model-averaged marginal effects if using frequentist and not Bayesian models. I assume it is to extract coefficient estimates and vcov() from model-averaged summary statistics, and supply it just like a single model to {marginaleffects}. Otherwise, many single models need to be supplied, and it sounds lots of predictions to do for {marginaleffects}. |
There's now a convenient package for computing frequentist model-averaged CIs/CDs https://cran.r-project.org/web/packages/MATA/index.html
As well as for Bayesian model-averaging/stacking https://mc-stan.org/loo/reference/loo_model_weights.html
We should have a function to compute average estimates/intervals/densities. It should permit choice of weighting method, interval method, and be passable to {see} for plotting.
For frequentist averaging, that's usually done based on information criteria, so this would entail a soft dependency of {parameters} on {performance}.
The text was updated successfully, but these errors were encountered: