-
-
Notifications
You must be signed in to change notification settings - Fork 199
Open
Labels
Milestone
Description
Hi~
I am currently doing GLMM estimation on 1000 simulated datasets by brms and evaluating the computational cost. However, I find the compute time becomes longer and longer as interation increases. The two figures below show the computational slowdown observed when I running the experiments locally and on ALICE HPC platform, respectively.
Here are the versions of R and brms I used:
- Local environment: R version 4.4.3, brms version 2.22.0
- ALICE HPC platform: R version 4.4.0, brms version 2.22.0
And I am pretty sure the simulated datasets are homogeneous since they are generated by Monte Carlo simulation. The super smooth curves make me guess there might be accumulation of internal variables or uncleaned caches. Could you please help me to figure out the issue?
I give my script here for reference.
library(brms)
path = '' # edit as appropriate
n_cores = 4
n_threads = 2
sim_brms = function(data_ls, num_sub, num_time, num_sim, resp_dist){
out_path = paste0(path, 'brms_', 'n', num_sub, '_t', num_time, '_sim', num_sim, '_', resp_dist, '/')
if (!dir.exists(out_path)) {
dir.create(out_path, recursive = TRUE, showWarnings = FALSE)
}
for (i in 1:num_sim){
# Monitor
cat(paste('brms_i: ', i, '\n'))
data = data_ls[[i]]
# Define helper functions
check_convergence = function(model, n_term) {
if (inherits(model, "try-error")) return("error")
Rhat = rhat(model)
if (any(abs(Rhat[1:n_term] - 1) >= 0.1)) return("official_fail")
"good"
}
# Specify response distribution
if (resp_dist == 'binomial'){
fami = bernoulli(link = 'logit')
}
if (resp_dist == 'Poisson'){
fami = poisson(link = 'log')
}
# Fit full model on full dataset
compute_time = system.time({
brms_model_full = try(brm(y_full ~ time + group + time:group + (1|id),
family = fami,
data = data,
threads = threading(n_threads),
cores = n_cores,
seed = 123
))})[['elapsed']]
conv_status = check_convergence(brms_model_full, 5)
out = c('compute_time' = compute_time, 'converge_rate' = conv_status)
save(out, file = paste0(out_path, i, '.RData'))
gc()
}
}
n_sim = 1000
n_sub = 20
n_time = 2
re_dist = 'binomial'
cat(paste0('n', n_sub, '_t', n_time, '_', re_dist, '\n'))
load(paste0(path, 'data/n', n_sub, '_t', n_time, '_sim', n_sim, '_', re_dist, '_rdi.RData'))
sim_brms(data_ls = simulate_data_ls, num_sub = n_sub, num_time = n_time, num_sim = n_sim, resp_dist = re_dist)
Thank you in advance and look forward for a reply.