bug in DEzs with parallelization? #274

@FelixNoessler

Hi Florian,

there seems to be a bug with parallelization in the DEzs algorithm. I noticed it with my model: when I sample from the priors only, the samples do not converge to the prior distribution. The last internal chain always converges to the true solution, but the other internal chains do not (even with 8 internal chains, only the last one converges). Here is a simplified example. The problem becomes more visible with more parameters, which is why I used ten. The samples should follow a normal distribution with mean 5 and sd 2 (I set the log prior density to 0 and only used the "likelihood").

library(BayesianTools)
library(dplyr)
library(tidyr)
library(ggplot2)

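# Flat prior: the log prior density is 0 everywhere
lprior <- function(x) { 0.0 }
# Start values: 10 parameters drawn from a standard normal
create_sample <- function() { rnorm(10) }
# "Likelihood": independent normal(5, 2) log density for each parameter
ll <- function(x) { sum(dnorm(x, mean = 5, sd = 2, log = TRUE)) }
# Vectorized version for parallel = "external": one row per parameter vector
ll_mat <- function(x) { apply(x, 1, ll) }

# Sanity check: both versions return the same value for identical input
test_sample <- create_sample()
ll(test_sample)
ll_mat(matrix(c(test_sample, test_sample), nrow = 2, byrow = TRUE))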


bs_not_parallel = createBayesianSetup(likelihood = ll,
                                      prior = createPrior(density = lprior, sampler = create_sample))
out_not_parallel <- runMCMC(bayesianSetup = bs_not_parallel, 
                            sampler = "DEzs", 
                            settings = list(iterations = 150000))


bs_parallel = createBayesianSetup(likelihood = ll_mat,
                                  prior =  createPrior(density = lprior, sampler = create_sample),
                                  parallel = "external")

### same result with:
# bs_parallel = createBayesianSetup(likelihood = ll,
#                                  prior = createPrior(density = lprior, sampler = create_sample),
#                                  parallel = TRUE)

out_parallel <- runMCMC(bayesianSetup = bs_parallel, 
                        sampler = "DEzs", 
                        settings = list(iterations = 150000))



# Target density (normal with mean 5, sd 2), repeated for every parameter
df_true_solution <- expand_grid(
    tibble(x = seq(-5, 15, length.out = 100), y = dnorm(x, mean = 5, sd = 2)),
    name = paste0("par.", 1:10))

# The chain labels below assume that the rows returned by getSample()
# are interleaved by internal chain (3 internal chains)
df_not_parallel <- tibble(data.frame(getSample(out_not_parallel, parametersOnly = TRUE))) %>%
    mutate(iteration = rep(1:(nrow(.) / 3), each = 3),
           chain = factor(rep(1:3, nrow(.) / 3))) %>%
    pivot_longer(starts_with("par."))

df_parallel <- tibble(data.frame(getSample(out_parallel, parametersOnly = TRUE))) %>%
    mutate(iteration = rep(1:(nrow(.) / 3), each = 3),
           chain = factor(rep(1:3, nrow(.) / 3))) %>%
    pivot_longer(starts_with("par."))


ggplot() +
    geom_density(data = df_not_parallel, aes(value, color = chain)) +
    geom_line(data = df_true_solution, aes(x,y)) +
    facet_grid(name ~ .) +
    labs(title = "DEzs without parallelization")

ggplot() +
    geom_density(data = df_parallel, aes(value, color = chain)) +
    geom_line(data = df_true_solution, aes(x,y)) +
    facet_grid(name ~ .) +
    labs(title = "DEzs with external parallelization")

Without "parallel execution" all chains converge to the true solution (black line):
Image

With parallelization, all chains except the last one are always off:

Image
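
A numeric check can complement the plots. The following is only a minimal sketch: `chain_summary` is a hypothetical helper (not part of BayesianTools), and it assumes that the rows of the sample matrix are interleaved by internal chain, as the plotting code above does. Because all ten parameters share the same target, it pools them and reports one mean and one sd per chain. A chain that sampled the target correctly should show values near 5 and 2. The demo uses simulated draws, not the MCMC output from this issue.

```r
# Hypothetical helper (not part of BayesianTools): pooled mean and sd per chain.
# Assumption: rows of `samples` are interleaved by chain
# (row 1 = chain 1, row 2 = chain 2, row 3 = chain 3, row 4 = chain 1, ...).
chain_summary <- function(samples, n_chains = 3) {
  samples <- as.matrix(samples)
  stopifnot(nrow(samples) %% n_chains == 0)
  chain <- rep(seq_len(n_chains), times = nrow(samples) / n_chains)
  rows <- seq_len(nrow(samples))
  data.frame(
    chain = seq_len(n_chains),
    # Pooled over all parameters, since they share the same target here
    mean = as.numeric(tapply(rows, chain, function(idx) mean(samples[idx, ]))),
    sd   = as.numeric(tapply(rows, chain, function(idx) sd(as.vector(samples[idx, ]))))
  )
}

# Demo on simulated draws from the target (stand-in for real sampler output)
set.seed(1)
demo <- matrix(rnorm(3000 * 10, mean = 5, sd = 2), ncol = 10)
res <- chain_summary(demo, n_chains = 3)
print(res)

# For the runs in this issue (not executed here):
# chain_summary(getSample(out_not_parallel, parametersOnly = TRUE))
# chain_summary(getSample(out_parallel, parametersOnly = TRUE))
```

If the parallel run is affected as described, the rows for chains 1 and 2 should deviate clearly from mean 5 and sd 2, while the row for chain 3 should not.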
