Releases: tidymodels/infer
First major release
infer 1.0.0
v1.0.0 is the first major release of the {infer} package! By and large, the core verbs specify(), hypothesize(), generate(), and calculate() will interface as they did before. This release makes several improvements to behavioral consistency of the package and introduces support for theory-based inference as well as randomization-based inference with multiple explanatory variables.
Behavioral consistency
A major change to the package in this release is a set of standards for behavioral consistency of calculate() (#356). Namely, the package will now
- supply a consistent error when the supplied stat argument isn't well-defined for the variables specify()d
gss %>%
specify(response = hours) %>%
calculate(stat = "diff in means")
#> Error: A difference in means is not well-defined for a
#> numeric response variable (hours) and no explanatory variable.
or
gss %>%
specify(college ~ partyid, success = "degree") %>%
calculate(stat = "diff in props")
#> Error: A difference in proportions is not well-defined for a dichotomous categorical
#> response variable (college) and a multinomial categorical explanatory variable (partyid).
- supply a consistent message when the user supplies unneeded information via hypothesize() to calculate() an observed statistic
# supply mu = 40 when it's not needed
gss %>%
specify(response = hours) %>%
hypothesize(null = "point", mu = 40) %>%
calculate(stat = "mean")
#> Message: The point null hypothesis `mu = 40` does not inform calculation of
#> the observed statistic (a mean) and will be ignored.
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
and
- supply a consistent warning and assume a reasonable null value when the user does not supply sufficient information to calculate an observed statistic
# don't hypothesize `p` when it's needed
gss %>%
specify(response = sex, success = "female") %>%
calculate(stat = "z")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 -1.16
#> Warning message:
#> A z statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null value: `p = .5`.
or
# don't hypothesize `p` when it's needed
gss %>%
specify(response = partyid) %>%
calculate(stat = "Chisq")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 334.
#> Warning message:
#> A chi-square statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null values: `p = c(dem = 0.2, ind = 0.2, rep = 0.2, other = 0.2, DK = 0.2)`.
To accommodate this behavior, a number of new calculate methods were added or improved. Namely:
- Implemented the standardized proportion $z$ statistic for one categorical variable
- Extended calculate() with stat = "t" by passing mu to the calculate() method for stat = "t" to allow for calculation of t statistics for one numeric variable with hypothesized mean
- Extended calculate() to allow lowercase aliases for stat arguments (#373).
- Fixed bugs in calculate() to allow for programmatic calculation of statistics
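As a rough cross-check of the two statistics named in this list, the sketch below recomputes the one-proportion z statistic by hand and the one-sample t statistic with stats::t.test(), using the gss data shipped with infer. The hand-written z formula is the textbook one and is an assumption about what calculate() computes, not a quote from the package source.

```r
library(infer)

# one-proportion z statistic, with the null value supplied explicitly
z_infer <- gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = .5) %>%
  calculate(stat = "z")

# the same statistic by hand: (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
# (textbook formula, assumed to match the package's definition)
p_hat <- mean(gss$sex == "female")
z_manual <- (p_hat - .5) / sqrt(.5 * (1 - .5) / nrow(gss))

# one-sample t statistic with a hypothesized mean of 40 hours
t_infer <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")

# base R equivalent
t_base <- stats::t.test(gss$hours, mu = 40)$statistic

# both comparisons are expected to return TRUE
all.equal(as.numeric(z_infer$stat), z_manual)
all.equal(as.numeric(t_infer$stat), as.numeric(t_base))
```

Supplying the null value through hypothesize() avoids the warning shown earlier for stat = "z".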
This behavioral consistency also allowed for the implementation of observe(), a wrapper function around specify(), hypothesize(), and calculate(), to calculate observed statistics. The function provides a shorthand alternative to calculating observed statistics from data:
# calculating the observed mean number of hours worked per week
gss %>%
observe(hours ~ NULL, stat = "mean")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
# equivalently, calculating the same statistic with the core verbs
gss %>%
specify(response = hours) %>%
calculate(stat = "mean")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
# calculating a t statistic for hypothesized mu = 40 hours worked/week
gss %>%
observe(hours ~ NULL, stat = "t", null = "point", mu = 40)
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
# equivalently, calculating the same statistic with the core verbs
gss %>%
specify(response = hours) %>%
hypothesize(null = "point", mu = 40) %>%
calculate(stat = "t")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
We don't anticipate that these changes are "breaking": code that previously worked should continue to work, though it may now message or warn where it previously did not, or error with a different (and hopefully more informative) message.
A framework for theoretical inference
This release also introduces a more complete and principled interface for theoretical inference. While the package previously supplied some methods for visualization of theory-based curves, the interface did not provide any object that was explicitly a "null distribution" that could be supplied to helper functions like get_p_value() and get_confidence_interval(). The new interface is based on a new verb, assume(), that returns a null distribution that can be used in the same way as a simulation-based null distribution.
As an example, we'll work through a full infer pipeline for inference on a mean using infer's gss dataset. Suppose that we believe the true mean number of hours worked by Americans in the past week is 40.
First, calculating the observed t-statistic:
obs_stat <- gss %>%
specify(response = hours) %>%
hypothesize(null = "point", mu = 40) %>%
calculate(stat = "t")
obs_stat
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
The code to define the null distribution is very similar to that required to calculate a theorized observed statistic, switching out calculate() for assume() and replacing arguments as needed.
null_dist <- gss %>%
specify(response = hours) %>%
assume(distribution = "t")
null_dist
#> A T distribution with 499 degrees of freedom.
This null distribution can now be interfaced with in the same way as a simulation-based null distribution elsewhere in the package. For example, calculating a p-value by juxtaposing the observed statistic and null distribution:
get_p_value(null_dist, obs_stat, direction = "both")
#> # A tibble: 1 x 1
#> p_value
#> <dbl>
#> 1 0.0376
…or visualizing the null distribution alone:
visualize(null_dist)
…or juxtaposing the two visually:
visualize(null_dist) +
shade_p_value(obs_stat, direction = "both")
Confidence intervals lie in data space rather than the standardized scale of the theoretical distributions. Calculating a mean rather than the standardized t-statistic:
obs_mean <- gss %>%
specify(response = hours) %>%
calculate(stat = "mean")
The null distribution here just defines the spread for the standard error calculation.
ci <-
get_confidence_interval(
null_dist,
level = .95,
point_estimate = obs_mean
)
ci
#> # A tibble: 1 x 2
#> lower_ci upper_ci
#> <dbl> <dbl>
#> 1 40.1 42.7
Visualizing the confidence interval results in the theoretical distribution being recentered and rescaled to align with the scale of the observed data:
visualize(null_dist) +
shade_confidence_interval(ci)
Previous methods for interfacing with theoretical distributions are superseded. They will continue to be supported, though documentation will forefront the assume() interface.
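The theory-based p-value and confidence interval in this walkthrough can be sanity-checked against base R. The sketch below is not the package's implementation. It assumes the usual one-sample t formulas (standard error sd / sqrt(n), n - 1 degrees of freedom) and compares the hand computation with stats::t.test() on the same gss column.

```r
library(infer)

hours <- gss$hours
n <- length(hours)
se <- sd(hours) / sqrt(n)

# t statistic for the point null mu = 40
t_obs <- (mean(hours) - 40) / se

# two-sided p-value from a t distribution with n - 1 degrees of freedom
p_two_sided <- 2 * pt(-abs(t_obs), df = n - 1)

# 95% confidence interval on the data scale: mean +/- t quantile * standard error
ci_manual <- mean(hours) + c(-1, 1) * qt(0.975, df = n - 1) * se

# base R reference values
ref <- stats::t.test(hours, mu = 40, conf.level = 0.95)

# both comparisons are expected to return TRUE
all.equal(p_two_sided, ref$p.value)
all.equal(ci_manual, as.numeric(ref$conf.int))
```

If the assumption about the formulas holds, these values should line up with the get_p_value() and get_confidence_interval() output shown above.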
Support for multiple regression
The 2016 "Guidelines for Assessment and Instruction in Statistics Education" [1] state that, in introductory statistics courses, "[s]tudents should gain experience with how statistical models, including multivariable models, are used." In line with this recommendation, we introduce support for randomization-based inference with multiple explanatory variables via a new fit.infer core verb.
If passed an infer object, the method will parse a formula out of the formula or response and explanatory arguments, and pass both it and data to a stats::glm call.
gss %>%
specify(hours ~ age + college) %>%
fit()
#> # A tibble: 3 x 2
#> term estimate
#> <chr> <dbl>
#> 1 intercept 40.6
#> 2 age 0.00596
#> 3 collegedegree 1.53
Note that the function returns the model coefficients as estimate rather than their associated t-statistics as stat.
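Since the text says the formula and data are passed to a stats::glm call, the estimates can be compared with a direct glm() fit. This is a minimal sketch that assumes the default (Gaussian) family is used in both cases and that terms come back in the same order.

```r
library(infer)

# coefficients via the infer pipeline
infer_fit <- gss %>%
  specify(hours ~ age + college) %>%
  fit()

# coefficients from calling glm() directly with the same formula
glm_fit <- stats::glm(hours ~ age + college, data = gss)

# expected to return TRUE if the two fits agree term by term
all.equal(as.numeric(infer_fit$estimate), as.numeric(coef(glm_fit)))
```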
If passed a generate()d object, the model will be fitted to each replicate.
gss %>%
specify(hours ~ age + college) %>%
hypothesize(null = "independence") %>%
generate(reps = 100, type = "permute") %>%
fit()
#> # A tibble: 300 x 3
#> # Groups: replicate [100]
#> replicate term estimate
#> <int> <chr> <dbl>
#> 1 1 intercept 44.4
#> 2 1 age -0.0767
#> 3 1 collegedegree 0.121
#> 4 2 intercept 41.8
#> 5 2 age 0.00344
#> 6 2 collegedegree -1.59
#> 7 3 intercept 38.3
#> 8 3 age 0.0761
#> 9 3 collegedegree 0.136
#> 10 4 intercept 43.1
#> # … with 290 more rows
If type = "permute", a set of unquoted column names in the data to permute (independently of each other) can be passed via the variables argument to generate. It defaults to only the response variable.
gss %>%
specify(hours ~ age + college) %>%
hypothesize(null = "independence") %>%
generate(reps = 100, type = "permute", variables = c(age, college)) %>%
fit()
#> # A tibble: 300 x 3
#> # Groups: replicate [100]
#> ...
Standardized proportion z statistic, improvements to various helpers
- rep_sample_n() no longer errors when supplied a prob argument (#279)
- Added rep_slice_sample(), a light wrapper around rep_sample_n(), that more closely resembles dplyr::slice_sample() (the function that supersedes dplyr::sample_n()) (#325)
- Added a success, correct, and z argument to prop_test() (#343, #347, #353)
- Implemented observed statistic calculation for the standardized proportion $z$ statistic (#351, #353)
- Various bug fixes and improvements to documentation and errors.
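A short sketch of the two resampling helpers named in this list, side by side. The seed and the sample sizes are arbitrary choices for illustration.

```r
library(infer)

set.seed(2021)

# three replicates of ten rows each, using the slice_sample()-style `n` argument
slices <- rep_slice_sample(gss, n = 10, reps = 3)

# the older helper takes `size` instead of `n`
samples <- rep_sample_n(gss, size = 10, reps = 3)

# each result carries a `replicate` column identifying the resample
table(slices$replicate)
table(samples$replicate)
```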
Bias-corrected confidence intervals
- get_confidence_interval() can now produce bias-corrected confidence intervals by setting type = "bias-corrected". Thanks to @davidbaniadam for the initial implementation (#237, #318)!
- get_confidence_interval() now uses column names ('lower_ci' and 'upper_ci') in output that are consistent with other infer functionality (#317).
- Fix CRAN check failures related to long double errors.
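A minimal sketch of requesting a bias-corrected interval from a bootstrap distribution. The seed, the number of replicates, and the choice of the mean of hours are illustrative assumptions. The observed statistic is passed as point_estimate, which this interval type needs.

```r
library(infer)

set.seed(1)

# observed mean, used as the point estimate for the bias correction
obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

# bootstrap distribution of the mean
boot_dist <- gss %>%
  specify(response = hours) %>%
  generate(reps = 500, type = "bootstrap") %>%
  calculate(stat = "mean")

ci_bc <- get_confidence_interval(
  boot_dist,
  level = 0.95,
  type = "bias-corrected",
  point_estimate = obs_mean
)

# the output columns are named lower_ci and upper_ci
names(ci_bc)
```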
New test statistics and vignettes, improved warnings/errors
- Warn the user when a p-value of 0 is reported (#257, #273)
- Added new vignettes: chi_squared and anova (#268)
- Updates to documentation and existing vignettes (#268)
- Add alias for hypothesize() (hypothesise()) (#271)
- Subtraction order no longer required for difference-based tests; a warning will be raised in the case that the user doesn't supply an order argument (#275, #281)
- Add new messages for common errors (#277)
- Increase coverage of theoretical methods in documentation (#278, #280)
- Drop missing values and reduce size of gss dataset used in examples (#282)
- Add stat = "ratio of props" and stat = "odds ratio" to calculate (#285)
- Add prop_test(), a tidy interface to prop.test() (#284, #287)
- Updates to visualize() for compatibility with ggplot2 v3.3.0 (#289)
- Fix error when bootstrapping with small samples and raise warnings/errors when appropriate (#239, #244, #291)
- Fix unit test failures resulting from breaking changes in dplyr v1.0.0
- Fix error in generate() when response variable is named x (#299)
- Add two-sided and two sided as aliases for two_sided for the direction argument in get_p_value() and shade_p_value() (#302)
- Fix t_test() and t_stat() ignoring the order argument (#310)
Documentation and other tweaks
0.5.1 Update NEWS.md
Update chi squared tests
infer 0.5.0
Breaking changes
- shade_confidence_interval() now plots vertical lines starting from zero (previously, from the bottom of a plot) (#234).
- shade_p_value() now uses "area under the curve" approach to shading (#229).
Other
- Updated chisq_test() to take arguments in a response/explanatory format, perform goodness of fit tests, and default to the approximation approach (#241).
- Updated chisq_stat() to do goodness of fit (#241).
- Make interface to hypothesize() clearer by adding the options for the point null parameters to the function signature (#242).
- Manage infer class more systematically (#219).
- Use vdiffr for plot testing (#221).
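As an illustration of the formula interface to chisq_test(), the sketch below runs a test of independence on two gss columns and compares the statistic with a direct stats::chisq.test() call. The choice of college and sex is an assumption made for the example, and the comparison assumes both calls use the same defaults.

```r
library(infer)

# test of independence through the tidy wrapper
chisq_infer <- chisq_test(gss, college ~ sex)

# the same test in base R on the corresponding contingency table
chisq_base <- stats::chisq.test(table(gss$college, gss$sex))

# expected to return TRUE when the defaults match
all.equal(as.numeric(chisq_infer$statistic), as.numeric(chisq_base$statistic))
```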
change p-val computation; add visualization layers
infer 0.4.0
Breaking changes
- Changed method of computing two-sided p-value to a more conventional one. It also makes get_pvalue() and visualize() more aligned (#205).
Deprecation changes
- Deprecated p_value() (use get_p_value() instead) (#180).
- Deprecated conf_int() (use get_confidence_interval() instead) (#180).
- Deprecated (via warnings) plotting p-value and confidence interval in visualize() (use new functions shade_p_value() and shade_confidence_interval() instead) (#178).
New functions
- shade_p_value() - {ggplot2}-like layer function to add information about p-value region to visualize() output. Has alias shade_pvalue().
- shade_confidence_interval() - {ggplot2}-like layer function to add information about confidence interval region to visualize() output. Has alias shade_ci().
Other
- Account for NULL value in left hand side of formula in specify() (#156) and type in generate() (#157).
- Update documentation code to follow tidyverse style guide (#159).
- Remove help page for internal set_params() (#165).
- Fully use {tibble} (#166).
- Fix calculate() to not depend on order of p for type = "simulate" (#122).
- Reduce code duplication (#173).
- Make transparency in visualize() not depend on method and data volume.
- Make visualize() work for "One sample t" theoretical type with method = "both".
- Add stat = "sum" and stat = "count" options to calculate() (#50).
Bug fixes and switch to {glue}
- Stop using package {assertive} in favor of custom type checks (#149)
- Fixed t_stat() to use ... so var.equal works
- With the help of @echasnovski, fixed var.equal = TRUE for specify() %>% calculate(stat = "t")
- Use custom functions for error, warning, message, and paste() handling (#155)
Add p_value and conf_int functions and observed stat shortcut with specify() %>% calculate()
- Added conf_int logical argument and conf_level argument to t_test()
- Switched shade_color argument in visualize() to be pvalue_fill instead since fill color for confidence intervals is also added now
- Shading for Confidence Intervals in visualize()
  - Green is default color for CI and red for p-values
  - direction = "between" to get the green shading
  - Currently working only for simulation-based methods
- Implemented conf_int() function for computing confidence interval provided a simulation-based method with a stat variable
  - get_ci() and get_confidence_interval() are aliases for conf_int()
  - Converted longer confidence interval calculation code in vignettes to use get_ci() instead
- Implemented p_value() function for computing p-value provided a simulation-based method with a stat variable
  - get_pvalue() is an alias for p_value()
  - Converted longer p-value calculation code in vignettes to use get_pvalue() instead
- Implemented Chi-square Goodness of Fit observed stat depending on params being set in hypothesize with specify() %>% calculate() shortcut
- Removed "standardized" slope $t$ since its formula is different than "standardized" correlation and there is no way currently to give one over the other
- Implemented correlation with bootstrap CI and permutation hypothesis test
- Filled the type argument automatically in generate() based on specify() and hypothesize()
  - Added message if type is given differently than expected
- Implemented specify() %>% calculate() for getting observed statistics.
  - visualize() works with either a 1x1 data frame or a vector for its obs_stat argument
  - Got stat = "t" working
- Refactored calculate() into smaller functions to reduce complexity
- Produced error if mu is given in hypothesize() but stat = "median" is provided in calculate() and other similar mis-specifications
- Tweaked chisq_stat() and t_stat() to match with specify() %>% calculate() framework
  - Both work in the one sample and two sample cases by providing formula
  - Added order argument to t_stat()
- Added implementation of one sample t_test() by passing in the mu argument to t.test from hypothesize()
- Tweaked pkgdown page to include ToDo's using {dplyr} example
Theoretical distributions to visualize and a few wrapper functions
- Switched to !! instead of UQ() since UQ() is deprecated in {rlang} 0.2.0
- Added many new files: CONDUCT.md, CONTRIBUTING.md, and TO-DO.md
- Updated README file with more development information
- Added wrapper functions t_test() and chisq_test() that use a formula interface and provide an intuitive wrapper to t.test() and chisq.test()
- Created stat = "z" and stat = "t" options
- Added many new arguments to visualize() to prescribe colors to shade and use for observed statistics and theoretical density curves
- Added check so that a bar graph is created with visualize() if the number of unique values for generated statistics is small
- Added shading for method = "theoretical"
- Implemented shading for simulation methods w/o a traditional distribution
  - Use percentiles to determine two-tailed shading
- Changed method = "randomization" to method = "simulation"
- Added warning when theoretical distribution is used that assumptions should be checked
- Added theoretical distributions to visualize() alone and as overlay with current implementations being
  - Two sample t
  - ANOVA F
  - One proportion z
  - Two proportion z
  - Chi-square test of independence
  - Chi-square Goodness of Fit test
  - Standardized slope (t)
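For the t_test() wrapper listed above, a small sketch of the formula interface next to the base R call it wraps. The variables and the order are illustrative, and the sketch assumes the wrapper keeps t.test()'s default of unequal variances.

```r
library(infer)

# two-sample t test through the formula interface;
# `order` sets the direction of subtraction (degree minus no degree)
tt_infer <- t_test(gss, hours ~ college, order = c("degree", "no degree"))

# the same comparison with base R, groups passed in the same order
deg <- gss$hours[gss$college == "degree"]
no_deg <- gss$hours[gss$college == "no degree"]
tt_base <- stats::t.test(deg, no_deg)

# expected to return TRUE when the defaults match
all.equal(as.numeric(tt_infer$statistic), as.numeric(tt_base$statistic))
```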


