Request for emmeans methods not to concatenate grouping variables into single column #661

Closed
qdread opened this issue Aug 2, 2024 · 3 comments · Fixed by #672
@qdread

qdread commented Aug 2, 2024

I think it is amazing that this package has methods for emmeans objects. One thing that would make it way more convenient and performant for me would be if the output of functions like describe_posterior() did not concatenate all the grouping variable columns into a single Parameter column. As it is now, I have to either write code to parse the Parameter column back out into separate columns, or pull the separate grouping variable columns from the emmeans object and cbind them back to the bayestestR output. I often need the columns separate for producing plots and tables. The workarounds I have come up with are okay but not easy to implement "at scale" if I have to fit many models on different datasets, with different grouping variables. So I'd like to request that the emmeans methods for functions like ci(), p_pointnull(), etc., retain the grouping variable columns from their input. Thanks for all your work on this package which has truly been a game changer for me!

Example:

library(brms)
library(emmeans)
library(bayestestR)

myfit <- brm(mpg ~ factor(gear) + factor(cyl), data = mtcars)
myemms <- emmeans(myfit, ~ gear + cyl)
mypost <- describe_posterior(myemms)

# Not ideal because the gear and cyl columns get squashed together, without indication of which is which
as.data.frame(mypost)

# My workaround
cbind(
  as.data.frame(myemms)[, c('gear', 'cyl')],
  mypost
)
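
A minimal sketch of the other workaround mentioned above, parsing the Parameter column back into separate columns. It uses only base R and a toy data frame in place of the real describe_posterior() output, and it assumes the labels look like "3, 4" (levels joined by a comma and a space, in the order of the emmeans() specification). The helper name split_parameter() and the grid_vars vector are hypothetical, not part of bayestestR.

```r
# Toy stand-in for as.data.frame(mypost); the label format "3, 4" is an assumption
mypost_df <- data.frame(
  Parameter = c("3, 4", "4, 4", "5, 4", "3, 6"),
  Median = c(25.5, 26.7, 27.0, 18.8)
)

# The variable names are not recoverable from the labels, so they are supplied by hand
grid_vars <- c("gear", "cyl")

# Hypothetical helper: split each label on the separator and bind the pieces as columns
split_parameter <- function(x, vars, sep = ", ") {
  parts <- strsplit(as.character(x$Parameter), sep, fixed = TRUE)
  # Fail early if any label does not split into one piece per grouping variable
  stopifnot(all(lengths(parts) == length(vars)))
  grid <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
  names(grid) <- vars
  cbind(grid, x[, setdiff(names(x), "Parameter"), drop = FALSE])
}

parsed <- split_parameter(mypost_df, grid_vars)
print(parsed)
```

The parsed grid columns come back as character vectors, and the approach breaks if a factor level itself contains the separator, which is part of why this is hard to apply across many models.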
@mattansb
Member

mattansb commented Aug 6, 2024

I tend to agree. We would need a function to get the grid info and then merge that with the results.

Here is a general solution (minus the formatting):

library(brms)
library(emmeans)
library(bayestestR)

myfit <- brm(mpg ~ factor(gear) + factor(cyl), data = mtcars)
myemms <- emmeans(myfit, pairwise ~ gear | cyl)

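# general function to pull grid info (the columns to the left of the estimate)
.get_emmeans_grid <- function(object) {
  s <- as.data.frame(object)
  # drop = FALSE keeps a data frame even when there is only one grid column
  s[, seq_len(which(colnames(s) == attr(s, "estName")) - 1), drop = FALSE]
}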

describe_posterior.emmGrid <- function(posterior, ...) {
  .grid <- .get_emmeans_grid(posterior)
  results <- bayestestR:::describe_posterior.emmGrid(posterior, ...)
  cbind(.grid, results[,-1])
}

describe_posterior.emmGrid(myemms)
#>    cyl gear      contrast     Median   CI    CI_low   CI_high      pd ROPE_CI ROPE_low ROPE_high ROPE_Percentage
#> 1    4    3             . 25.5062620 0.95 21.507828 29.310093 1.00000    0.95     -0.1       0.1      0.00000000
#> 2    4    4             . 26.7326123 0.95 24.450108 28.932544 1.00000    0.95     -0.1       0.1      0.00000000
#> 3    4    5             . 26.9537215 0.95 23.430046 30.718033 1.00000    0.95     -0.1       0.1      0.00000000
#> 4    6    3             . 18.8184019 0.95 15.042371 22.331300 1.00000    0.95     -0.1       0.1      0.00000000
#> 5    6    4             . 20.0190237 0.95 16.981356 23.146698 1.00000    0.95     -0.1       0.1      0.00000000
#> 6    6    5             . 20.2889764 0.95 16.429401 24.261156 1.00000    0.95     -0.1       0.1      0.00000000
#> 7    8    3             . 14.8589153 0.95 12.953114 16.780434 1.00000    0.95     -0.1       0.1      0.00000000
#> 8    8    4             . 16.0788715 0.95 11.963037 20.400845 1.00000    0.95     -0.1       0.1      0.00000000
#> 9    8    5             . 16.3595230 0.95 12.577941 20.074604 1.00000    0.95     -0.1       0.1      0.00000000
#> 10   4    . gear3 - gear4 -1.1957204 0.95 -5.254543  2.767766 0.74175    0.95     -0.1       0.1      0.03578947
#> 11   4    . gear3 - gear5 -1.5443451 0.95 -5.284180  2.427025 0.79000    0.95     -0.1       0.1      0.03447368
#> 12   4    . gear4 - gear5 -0.2152978 0.95 -4.316286  3.552688 0.54375    0.95     -0.1       0.1      0.04210526
#> 13   6    . gear3 - gear4 -1.1957204 0.95 -5.254543  2.767766 0.74175    0.95     -0.1       0.1      0.03578947
#> 14   6    . gear3 - gear5 -1.5443451 0.95 -5.284180  2.427025 0.79000    0.95     -0.1       0.1      0.03447368
#> 15   6    . gear4 - gear5 -0.2152978 0.95 -4.316286  3.552688 0.54375    0.95     -0.1       0.1      0.04210526
#> 16   8    . gear3 - gear4 -1.1957204 0.95 -5.254543  2.767766 0.74175    0.95     -0.1       0.1      0.03578947
#> 17   8    . gear3 - gear5 -1.5443451 0.95 -5.284180  2.427025 0.79000    0.95     -0.1       0.1      0.03447368
#> 18   8    . gear4 - gear5 -0.2152978 0.95 -4.316286  3.552688 0.54375    0.95     -0.1       0.1      0.04210526
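
The grid-extraction step can be tried without fitting a model. The sketch below is base R only and uses a toy data frame in place of as.data.frame() of an emmeans object. It assumes, as the code above does, that the grid columns sit to the left of the estimate column and that the "estName" attribute names that column. The function name get_grid_columns() and the toy values are hypothetical.

```r
# Toy stand-in for as.data.frame(<emmeans object>): grid columns first, then the estimate
s <- data.frame(
  cyl = c(4, 4, 6),
  gear = c(3, 4, 3),
  emmean = c(25.5, 26.7, 18.8)
)
attr(s, "estName") <- "emmean"

# Hypothetical helper: keep every column to the left of the estimate column
get_grid_columns <- function(s) {
  est_col <- match(attr(s, "estName"), colnames(s))
  # drop = FALSE returns a data frame even if only one grid column remains
  s[, seq_len(est_col - 1), drop = FALSE]
}

grid <- get_grid_columns(s)

# Single grouping variable: the result is still a one-column data frame
s1 <- data.frame(cyl = c(4, 6, 8), emmean = c(26.0, 19.0, 15.0))
attr(s1, "estName") <- "emmean"
one <- get_grid_columns(s1)

print(grid)
print(one)
```

Keeping the result a data frame in the single-variable case matters because cbind() would otherwise receive an unnamed vector and lose the column name.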

@mattansb mattansb self-assigned this Aug 6, 2024
@mattansb mattansb added the Enhancement 💥 Implemented features can be improved or revised label Aug 6, 2024
@strengejacke
Member

When we extract draws from Bayesian objects processed with emmeans, we use emmeans::as.mcmc.emmGrid() in insight::get_parameters(), and that is what creates these column names. I agree we should have an argument that adds the names as separate columns.

@mattansb
Member

mattansb commented Sep 3, 2024

See new output style in #672
