Hello wonderful infer authors! I am reaching out about something that has come up as I've been teaching simulation-based inference and comparing it to the CLT-based approach.
Currently, as described in #509, the permutation test does not allow us to test the intercept, and so the results of get_p_value() and shade_p_value() in this context do not entirely make sense:
library(infer)
obs_fit <- gss %>%
specify(hours ~ age + college) %>%
fit()
null_dist <- gss %>%
specify(hours ~ age + college) %>%
hypothesize(null = "independence") %>%
generate(reps = 1000, type = "permute") %>%
fit()
visualize(null_dist) +
shade_p_value(obs_stat = obs_fit, direction = "two-sided")
null_dist %>%
get_p_value(obs_stat = obs_fit, direction = "two-sided")
# A tibble: 3 × 2
term p_value
<chr> <dbl>
1 age 0.898
2 collegedegree 0.27
3 intercept 0.68
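As a quick aside on why the intercept's p-value comes out this way: permuting the response doesn't change its mean, so the permuted intercepts should cluster near the observed estimate rather than near 0. A small sketch to check that (it assumes dplyr is loaded for the verbs, as the snippets below do as well):
library(dplyr)
null_dist %>%
  ungroup() %>%
  filter(term == "intercept") %>%
  summarize(center = mean(estimate), spread = sd(estimate))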
Comparing the get_p_value() output above to the result of summary.lm(),
summary(lm(hours ~ age + college, gss))
Call:
lm(formula = hours ~ age + college, data = gss)
Residuals:
Min 1Q Median 3Q Max
-37.78 -4.93 -0.89 7.26 48.12
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 40.60859 2.15468 18.85 <2e-16 ***
age 0.00596 0.04988 0.12 0.90
collegedegree 1.53283 1.39325 1.10 0.27
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 15 on 497 degrees of freedom
Multiple R-squared: 0.00248, Adjusted R-squared: -0.00154
F-statistic: 0.617 on 2 and 497 DF, p-value: 0.54
we see that the CLT-based approach yields sensible results for the p-values for the slopes (testing that each slope is 0).
Is this the preferred behavior? I understand that the permutation test does not help us test the "standard" hypothesis test for the intercept (that the intercept is 0).
If it would be preferred to be able to carry out this test (which I can see the argument for, especially if infer wants to be able to do the tests that the CLT-based methods do by default), I wonder if an implementation analogous to bootstrapping for a single mean with stat = "mean" would be helpful. For example, code analogous to:
gss %>%
specify(response = hours) %>%
hypothesize(null = "point", mu = 0) %>%
generate(reps = 1000, type = "bootstrap") %>%
calculate(stat = "mean")
for regression would be
gss %>%
specify(hours ~ age + college) %>%
hypothesize(null = "point", beta0 = 0) %>%
generate(reps = 1000, type = "bootstrap") %>%
fit()
Another possible solution would be to shift the null distribution for the intercept to be centered at 0 after the permutation occurs (or to have an option for that, maybe it wouldn't be the default behavior?), e.g., with null_dist from above:
null_dist %>%
ungroup() %>%
filter(term == "intercept") %>%
mutate(estimate = estimate - mean(estimate))
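To make that concrete, here is a rough sketch of how a p-value could then be computed from the recentered distribution. It reuses obs_fit and null_dist from above and computes the two-sided p-value by hand (rather than through get_p_value()), just for illustration:
obs_intercept <- obs_fit %>%
  filter(term == "intercept") %>%
  pull(estimate)
null_dist %>%
  ungroup() %>%
  filter(term == "intercept") %>%
  mutate(estimate = estimate - mean(estimate)) %>%  # recenter at 0
  summarize(p_value = mean(abs(estimate) >= abs(obs_intercept)))
Presumably that would land much closer to the near-zero p-value that summary.lm reports for the intercept.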
Surely, neither of these is the most elegant solution, but I imagine folks working on this package would have some excellent ideas for an elegant integration of something similar.
This issue was a fun one to present to my students, and I actually think it got them thinking more deeply about what is happening in a permutation test. But I worry that opting to keep results for the intercept here may lead folks astray.