Simple model not compiling after 15 minutes, because code is not vectorized? #1791

@potash

I am trying to fit the following multilevel model using brms 2.22.0 and backend cmdstanr 2.36.0:

y ~ x_1 + ... + x_K + (x_1 + ... + x_K || g1 * g2)

However, when K = 32 the model has still not finished compiling after 15 minutes. Note that the following variations do compile quickly:

  • replacing || with | (although with K = 32 I do not want to fit a 32 x 32 covariance matrix)
  • setting K = 2
  • replacing g1 * g2 with just g1

Looking at the generated Stan code, when switching from | to ||, statements that were vectorized become unvectorized. For example, the priors on the random effects involve 32+1 separate calls to std_normal_lpdf.

Is there a way to have uncorrelated effects but still vectorized code?

Here's a script to generate fake data, print the model, and try to compile (you can change || to | to get it to work):

N = 100 # number of observations
K = 32 # number of predictors
J1 = 2 # number of groups g1
J2 = 2 # number of groups g2

df = as.data.frame(matrix(rnorm(N*K), nrow=N))
x_names = paste0("x_", 1:K)
colnames(df) = x_names

df$g1 = factor(sample(1:J1, N, replace=TRUE))
df$g2 = factor(sample(1:J2, N, replace=TRUE))

# y has no relationship to x and g, just fake data for compiling model
df$y = rnorm(N) 

sum_x_str = paste(x_names, collapse="+")
formula = as.formula(sprintf("y ~ %s + (%s||g1*g2)", sum_x_str, sum_x_str))

brms::make_stancode(formula, df, backend="cmdstanr", chains=0)
brms::brm(formula, df, backend="cmdstanr", chains=0)
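
To put a number on the unvectorized code described above, here is a minimal sketch that counts calls to std_normal_lpdf in the generated Stan code for | versus || on a small version of the model. The helper count_calls and the names K_small and small_df are hypothetical and only for illustration. The sketch assumes brms is installed and only generates Stan code, so nothing is compiled.

```r
# Hypothetical helper (not part of brms): count literal occurrences of a
# function name in generated Stan code.
count_calls <- function(stan_code, fn = "std_normal_lpdf") {
  code <- paste(as.character(stan_code), collapse = "\n")
  hits <- gregexpr(fn, code, fixed = TRUE)[[1]]
  if (hits[1] == -1L) 0L else length(hits)
}

# Compare | and || on a small model; K_small = 3 is an arbitrary choice.
if (requireNamespace("brms", quietly = TRUE)) {
  K_small <- 3
  small_df <- as.data.frame(matrix(rnorm(20 * K_small), nrow = 20))
  colnames(small_df) <- paste0("x_", 1:K_small)
  small_df$g1 <- factor(rep(1:2, times = 10))
  small_df$g2 <- factor(rep(1:2, each = 10))
  small_df$y <- rnorm(20)

  rhs <- paste(colnames(small_df)[1:K_small], collapse = "+")
  for (bar in c("|", "||")) {
    f <- as.formula(sprintf("y ~ %s + (%s %s g1*g2)", rhs, rhs, bar))
    code <- brms::make_stancode(f, small_df)
    cat(bar, ":", count_calls(code), "calls to std_normal_lpdf\n")
  }
}
```

Increasing K_small should show how the number of calls grows with the number of predictors in each case.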
