Skip to content

[FR] Support for shared mixture model #1778

@trinhdhk

Description

@trinhdhk

Currently when specifying a multivariate mixture model, the model assumes 3 independent sets of mixtures (theta_y1, theta_y2, theta_y3). Is there any plan for supporting the shared mixture model, as in constraining theta_y1 = theta_y2 = theta_y3 = theta.

x <- rnorm(1000)
x2 <- rpois(1000,2)
u <- rnorm(1000)
lp <- -.5+x+0.2*x2
mu1 <- c(0.5, -0.2, 0) |> sapply('+', 0.5 * x)
mu2 <- c(-1.5, -1, -0.5)

y <- sapply(seq_along(z),
            function(i){
              if (z[i] == 1) {
                return(sapply(plogis(mu1[i,]), rbinom, n=1, size=1))
              }
              sapply(plogis(mu2), rbinom, n=1, size=1)
            }) |> t()
colnames(y) <- paste0('y', 1:3)

dataset <-
  data.frame(x=x, x2=x2,) |> cbind(y)

# demonstration, I don't think this is identifiable
model <-
  brm(
    bf(y1 ~ 1,
       theta1 ~ x+x2,
       mu1 ~ x,
       mu2 ~ 1) +
      bf(y2 ~ 1,
         theta1 ~ x+x2,
         mu1 ~ x,
         mu2 ~ 1) +
      bf(y3 ~ 1,
         theta1 ~ x+x2,
         mu1 ~ x,
         mu2 ~ 1) +
      set_rescor(FALSE),
    algorithm='fullrank',
    tol_rel_obj = 1e-4,
    iter=4000,
    cores=2, chains=2,
    family = mixture(bernoulli, bernoulli),
    data = dataset
  )

model

Family: MV(mixture(bernoulli, bernoulli), mixture(bernoulli, bernoulli), mixture(bernoulli, bernoulli)) 
  Links: mu1 = logit; mu2 = logit; theta1 = identity; theta2 = identity
         mu1 = logit; mu2 = logit; theta1 = identity; theta2 = identity
         mu1 = logit; mu2 = logit; theta1 = identity; theta2 = identity 
Formula: y1 ~ 1 
         theta1 ~ x + x2
         mu1 ~ x
         mu2 ~ 1
         y2 ~ 1 
         theta1 ~ x + x2
         mu1 ~ x
         mu2 ~ 1
         y3 ~ 1 
         theta1 ~ x + x2
         mu1 ~ x
         mu2 ~ 1
   Data: dataset (Number of observations: 1000) 
  Draws: 1 chains, each with iter = 1000; warmup = 0; thin = 1;
         total post-warmup draws = 1000

Regression Coefficients:
<...>

I find this type of model not very useful in many cases, as we can model them independently. Rather than MV(mixture(bernoulli, bernoulli), mixture(bernoulli, bernoulli), mixture(bernoulli, bernoulli), I believe there are more needs for mixture(MV(bernoulli, bernoulli, bernoulli), MV(bernoulli, bernoulli, bernoulli)).

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions