feature request: Handling Inf in data #1694

Open
wpetry opened this issue Oct 16, 2024 · 2 comments

@wpetry

wpetry commented Oct 16, 2024

Description of current behavior

When Inf or -Inf is encountered in the data, brm passes these rows to Stan, which fails because it cannot evaluate the log probability (lp) at the initial values. The standard troubleshooting advice for this error is to specify init = ... and/or to use more informative priors, but Stan fails with the same error regardless of the initial values or priors specified. The failure looks like a fitting issue when the real source of the problem is the data.

reprex:

library(brms)

x <- 0:100
mu <- 10 + 0.3 * x
y <- rnorm(length(mu), mean = mu, sd = 2)  # first argument of rnorm() is n, so pass mu as the mean
dat <- data.frame(x, y)
dat$y[1] <- Inf

mod <- brm(y ~ 1 + x, data = dat)  # fails with Stan initialization error
mod2 <- lm(y ~ 1 + x, data = dat)  # base R regression gives a (somewhat) informative error in the same circumstance

Desired feature behavior

I think the best approach would be to stop the model fitting with an informative error instead of a warning. Infinite values are likely artifacts of errors during the calculation of variables and warrant re-examination before fitting any model (e.g., dividing by 0, log-transforming 0, etc.).
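
As a rough illustration of the error-first behavior described above, here is a minimal sketch of a pre-fit check in plain R. The helper name `check_no_infinite()` is hypothetical and is not part of brms; it only assumes that the data are a data frame and that infinite values in any numeric column should abort the fit.

```r
# Hypothetical helper (not a brms function): stop early if any numeric
# column of `data` contains Inf or -Inf.
check_no_infinite <- function(data) {
  # is.infinite() is TRUE only for Inf and -Inf, not for NA or NaN
  n_inf <- vapply(
    data,
    function(col) if (is.numeric(col)) sum(is.infinite(col)) else 0L,
    integer(1)
  )
  bad <- names(n_inf)[n_inf > 0]
  if (length(bad) > 0) {
    stop(
      "Infinite values found in column(s): ",
      paste(bad, collapse = ", "),
      ". Check how these variables were computed before fitting.",
      call. = FALSE
    )
  }
  invisible(data)
}

# Example: the check fails before any model is fit
dat <- data.frame(x = 0:4, y = c(Inf, 1, 2, 3, 4))
res <- tryCatch(check_no_infinite(dat), error = function(e) conditionMessage(e))
print(res)
```

Calling such a check before `brm()` would surface the data problem directly instead of as a Stan initialization failure.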

A softer approach would be to drop rows containing infinite values with a warning on the R side, then pass the cleaned data to Stan for fitting. This mirrors the handling of rows containing NA (absent user-specified imputation with mi()). I don't favor this approach because I'm not able to think of cases when it's still reasonable to fit a model after learning that some of the variable values are infinite.
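
For comparison, the softer approach could look roughly like the sketch below. Again, `drop_infinite_rows()` is a hypothetical name used only for illustration, and the warning text is an assumption rather than anything brms currently emits.

```r
# Hypothetical helper (not a brms function): drop rows that contain
# Inf or -Inf in any numeric column, and warn about how many were dropped.
drop_infinite_rows <- function(data) {
  is_num <- vapply(data, is.numeric, logical(1))
  has_inf <- rep(FALSE, nrow(data))
  for (col in names(data)[is_num]) {
    has_inf <- has_inf | is.infinite(data[[col]])
  }
  if (any(has_inf)) {
    warning(
      "Dropped ", sum(has_inf), " row(s) containing infinite values.",
      call. = FALSE
    )
  }
  data[!has_inf, , drop = FALSE]
}

# Example: one row is removed; the warning is suppressed here only to keep the output quiet
dat <- data.frame(x = 0:4, y = c(Inf, 1, 2, 3, 4))
clean <- suppressWarnings(drop_infinite_rows(dat))
```

The cleaned data frame could then be passed to `brm()` in place of the original data.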

@paul-buerkner
Owner

Thank you for opening this issue! I will address it in the next brms update.

@paul-buerkner paul-buerkner added this to the brms 2.23.0 milestone Oct 17, 2024
@wds15
Contributor

wds15 commented Oct 17, 2024

Isn't brms sometimes using Inf to flag special values? That can be useful, and I do it myself on occasion. Throwing a warning is certainly appropriate, as Inf values can easily make Stan go crazy.
