Open
Description
Hi Florian,
there seems to be a bug with parallelization in the DEzs algorithm. I noticed it in my own model: when I used only the priors, the samples did not converge to the prior distribution. The last internal chain always converges to the true solution, but the other internal chains do not (even with 8 internal chains, only the last one converges). Here is a simplified example. The problem becomes more visible with more parameters, which is why I used ten. The samples should follow a normal distribution with mean 5 and sd 2 (I set the log prior density to a constant 0 and only used the "likelihood").
library(BayesianTools)
library(dplyr)
library(tidyr)
library(ggplot2)
lprior <- function(x) { 0.0 }
create_sample <- function() { rnorm(10) }
ll <- function(x) { sum(dnorm(x, mean = 5, sd = 2, log = TRUE)) }
ll_mat <- function(x) { apply(x, 1, ll) }
test_sample <- create_sample()
ll(test_sample)
ll_mat(matrix(c(test_sample, test_sample), nrow = 2, byrow = TRUE))
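Since the external-parallel setup relies on the matrix version of the likelihood, it may be worth ruling out a mismatch between `ll` and `ll_mat` first. The following is a minimal sketch of such a check in base R only; the names `proposals`, `row_wise` and `one_by_one` are made up for this illustration and are not part of the example above.

```r
# Log-likelihood for one parameter vector (same definition as in the example).
ll <- function(x) { sum(dnorm(x, mean = 5, sd = 2, log = TRUE)) }
# Row-wise version for a matrix of proposals, one proposal per row.
ll_mat <- function(x) { apply(x, 1, ll) }

set.seed(42)
# Hypothetical test input: 4 proposals with 10 parameters each.
proposals <- matrix(rnorm(4 * 10), nrow = 4)

# Evaluate all rows at once and one row at a time.
row_wise <- ll_mat(proposals)
one_by_one <- vapply(seq_len(nrow(proposals)),
                     function(i) ll(proposals[i, ]),
                     numeric(1))

# Both routes should give the same values, in the same order.
stopifnot(isTRUE(all.equal(row_wise, one_by_one)))

# A single proposal passed as a 1-row matrix should yield one value.
stopifnot(length(ll_mat(proposals[1, , drop = FALSE])) == 1)
```

If both checks pass, the two likelihood functions agree row by row, so any difference between the runs would have to come from somewhere else.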
bs_not_parallel <- createBayesianSetup(likelihood = ll,
  prior = createPrior(density = lprior, sampler = create_sample))
out_not_parallel <- runMCMC(bayesianSetup = bs_not_parallel,
sampler = "DEzs",
settings = list(iterations = 150000))
bs_parallel = createBayesianSetup(likelihood = ll_mat,
prior = createPrior(density = lprior, sampler = create_sample),
parallel = "external")
### same result with:
# bs_parallel = createBayesianSetup(likelihood = ll,
# prior = createPrior(density = lprior, sampler = create_sample),
# parallel = T)
out_parallel <- runMCMC(bayesianSetup = bs_parallel,
sampler = "DEzs",
settings = list(iterations = 150000))
df_true_solution <- expand_grid(
tibble(x = seq(-5, 15, length.out = 100), y = dnorm(x, mean = 5, sd = 2)),
name = c("par.1", "par.2", "par.3", "par.4", "par.5", "par.6", "par.7", "par.8", "par.9", "par.10"))
df_not_parallel <- tibble(data.frame(getSample(out_not_parallel, parametersOnly = T))) %>%
mutate(iteration = rep(1:(nrow(.) / 3), each = 3),
chain = factor(rep(1:3, nrow(.) / 3))) %>%
pivot_longer(c(par.1, par.2, par.3, par.4, par.5, par.6, par.7, par.8, par.9, par.10))
df_parallel <- tibble(data.frame(getSample(out_parallel, parametersOnly = T))) %>%
mutate(iteration = rep(1:(nrow(.) / 3), each = 3),
chain = factor(rep(1:3, nrow(.) / 3))) %>%
pivot_longer(c(par.1, par.2, par.3, par.4, par.5, par.6, par.7, par.8, par.9, par.10))
ggplot() +
geom_density(data = df_not_parallel, aes(value, color = chain)) +
geom_line(data = df_true_solution, aes(x,y)) +
facet_grid(name ~ .) +
labs(title = "DEzs without parallelization")
ggplot() +
geom_density(data = df_parallel, aes(value, color = chain)) +
geom_line(data = df_true_solution, aes(x,y)) +
facet_grid(name ~ .) +
labs(title = "DEzs with external parallelization")
Without parallel execution, all chains converge to the true solution (black line):
With external parallelization, all chains except the last one are always off:
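To put numbers next to the density plots, a per-chain mean and sd can be compared against the target values of 5 and 2. Below is a rough sketch, not sampler output: `chain_summary` is a hypothetical helper, and the three matrices are simulated stand-ins in which only the third is drawn from the target.

```r
# Hypothetical helper: mean and sd per chain, pooled over all parameters.
# `chains` is a list with one (iterations x parameters) matrix per chain.
chain_summary <- function(chains) {
  t(sapply(chains, function(ch) {
    ch <- as.matrix(ch)
    c(mean = mean(ch), sd = sd(as.vector(ch)))
  }))
}

set.seed(1)
# Simulated stand-in data (NOT DEzs output): three chains, 10 parameters each.
# Only chain3 is drawn from the target N(5, 2); the others are deliberately off.
chains <- list(
  chain1 = matrix(rnorm(5000 * 10, mean = 3, sd = 1), ncol = 10),
  chain2 = matrix(rnorm(5000 * 10, mean = 7, sd = 1), ncol = 10),
  chain3 = matrix(rnorm(5000 * 10, mean = 5, sd = 2), ncol = 10)
)

# One row per chain, columns "mean" and "sd".
summ <- chain_summary(chains)
print(round(summ, 2))
```

For the real runs, I believe the per-chain matrices could be obtained with `getSample(out_parallel, coda = TRUE)`, which as far as I know returns one element per internal chain. That is an assumption I have not verified against the package source.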