Skip to content

Validate get_variance() against remaining families #889

@strengejacke

Description

@strengejacke

As a follow-up of #877

Following families need validation or don't yet work:

  • betabinomial
  • hurdle models
  • zero-inflated models
  • glmmTMB::compois
  • glmmTMB::genpois
  • glmmTMB::lognormal - these models have low fixed effects variance, leading to R2 close to 0. Calculation needs revision here?

Tagging @bbolker FYI (also tagging @bwiernik )

  1. How could we possibly validate genpois and compois families? Are there any related families that would return a similar R2 so we have a reference for validating the code?

  2. How to calculate the distribution-specific variance for zero-inflated models? (or at least: what's more accurate, see Test get_variance for zero-inflation models #893 (comment)

  3. Is it expected that the R2 is lower for zero-inflated models, when the full model (zero-inflation and conditional part) is taken into account (as opposed to the conditional component of zero-inflated models only, see following example)?

Regarding Zero-Inflation models

Formerly the dispersion parameter for Poisson and ZI Poisson was set to 1. Now, the behaviour for ZI Poisson only has changed, returning a different dispersion / variance:

# For zero-inflated poisson models, the
# distributional variance is based on Zuur et al. 2012
# ----------------------------------------------
.variance_zip <- function(model, faminfo, family_var) {
  if (inherits(model, "glmmTMB")) {
    p <- stats::predict(model, type = "zprob")
    mu <- stats::predict(model, type = "conditional")
    pvar <- (1 - p) * (mu + p * mu^2)
  } else if (inherits(model, "MixMod")) {
    p <- stats::plogis(stats::predict(model, type_pred = "link", type = "zero_part"))
    mu <- suppressWarnings(stats::predict(model, type = "mean_subject"))
    pvar <- (1 - p) * (mu + p * mu^2)
  } else {
    pvar <- family_var
  }

  mean(pvar)
}

Taking following model, for this particular example, this comes closer to a Bayesian model than setting sigma/dispersion to 1 (for zero-inflation models!)

m <- glmmTMB::glmmTMB(count ~ mined + cover + (1 + cover | site),
  ziformula = ~mined,
  family = poisson(), data = Salamanders
)
performance::r2_nakagawa(m)

Formerly with sigma/dispersion = 1

# R2 for Mixed Models

  Conditional R2: 0.650
     Marginal R2: 0.525

Now

# R2 for Mixed Models

  Conditional R2: 0.414
     Marginal R2: 0.334

brms returns (marginal R2 only)

library(brms)
m2 <- brms::brm(bf(count ~ mined + (1 | site), zi ~mined),
  family = brms::zero_inflated_poisson(), data = Salamanders, backend = "rstan"
)
brms::bayes_R2(m2)
    Estimate  Est.Error      Q2.5     Q97.5
R2 0.1686378 0.01874732 0.1334217 0.2070608

Any ideas how to validate the results? @bbolker @bwiernik ?

Metadata

Metadata

Assignees

No one assigned

    Labels

    get_variancefunction specific labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions