Description
Note the following examples:
res <- correlation::correlation(mtcars, partial = TRUE)
res[res$Parameter1=="mpg",]
#> Parameter1 | Parameter2 | r | t | df | p | 95% CI | Method | n_Obs
#> ---------------------------------------------------------------------------------------
#> mpg | cyl | -0.02 | -0.13 | 30 | 1.000 | [-0.37, 0.33] | Pearson | 32
#> mpg | disp | 0.16 | 0.89 | 30 | 1.000 | [-0.20, 0.48] | Pearson | 32
#> mpg | hp | -0.21 | -1.18 | 30 | 1.000 | [-0.52, 0.15] | Pearson | 32
#> mpg | drat | 0.10 | 0.58 | 30 | 1.000 | [-0.25, 0.44] | Pearson | 32
#> mpg | wt | -0.39 | -2.34 | 30 | 1.000 | [-0.65, -0.05] | Pearson | 32
#> mpg | qsec | 0.24 | 1.34 | 30 | 1.000 | [-0.12, 0.54] | Pearson | 32
#> mpg | vs | 0.03 | 0.18 | 30 | 1.000 | [-0.32, 0.38] | Pearson | 32
#> mpg | am | 0.26 | 1.46 | 30 | 1.000 | [-0.10, 0.56] | Pearson | 32
#> mpg | gear | 0.10 | 0.52 | 30 | 1.000 | [-0.26, 0.43] | Pearson | 32
#> mpg | carb | -0.05 | -0.29 | 30 | 1.000 | [-0.39, 0.30] | Pearson | 32
res <- ppcor::pcor(mtcars)
data.frame(r = res$estimate[-1,1],
t = res$statistic[-1,1],
p = res$p.value[-1,1])
#> r t p
#> cyl -0.02326429 -0.1066392 0.91608738
#> disp 0.16083460 0.7467585 0.46348865
#> hp -0.21052027 -0.9868407 0.33495531
#> drat 0.10445452 0.4813036 0.63527790
#> wt -0.39344938 -1.9611887 0.06325215
#> qsec 0.23809863 1.1234133 0.27394127
#> vs 0.03293117 0.1509915 0.88142347
#> am 0.25832849 1.2254035 0.23398971
#> gear 0.09534261 0.4389142 0.66520643
#> carb -0.05243662 -0.2406258 0.81217871
Created on 2020-04-06 by the reprex package (v0.3.0)
The resulting partial correlations are identical, but the t values are not (and, by extension, neither are the CIs or the unadjusted p values). Why?
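Because correlation() computes partial correlations by residualizing the variables and then computing the correlations between the residuals. But the df of the residualizing process - that is, the degree of uncertainty in estimating the residuals - is not accounted for: the test is run on the residuals as if they were raw data, with df = n - 2 = 30, whereas the t values from ppcor::pcor() correspond to df = n - 2 - k = 21, where k = 9 is the number of variables partialled out. (Note that this should be true for Bayesian partial correlations as well - the priors and likelihood of the residualizing process are not accounted for.)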
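As a worked check of the df argument, the two t statistics can be written side by side. This is a sketch that assumes the only difference between the two packages is the degrees of freedom; the symbols n (observations), k (variables partialled out) and r (partial correlation) are introduced here for illustration.

```latex
t_{\text{unadjusted}} = r\,\sqrt{\frac{n-2}{1-r^{2}}},
\qquad
t_{\text{adjusted}} = r\,\sqrt{\frac{n-2-k}{1-r^{2}}}

% mpg ~ wt in mtcars: n = 32, k = 9, r = -0.3934
t_{\text{unadjusted}} = -0.3934\,\sqrt{\frac{30}{1-0.1548}} \approx -2.34,
\qquad
t_{\text{adjusted}} = -0.3934\,\sqrt{\frac{21}{1-0.1548}} \approx -1.96
```

These reproduce the two t values reported for wt above (-2.34 from correlation(), -1.96 from ppcor::pcor()), which is consistent with the df being the source of the discrepancy.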
Solutions:
- Account for these. [HARD]
- Update the docs to explicitly mention this - that inference and CIs are conditional on the estimated residuals, and do not account for the uncertainty in estimating them. [EASY]
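For the frequentist Pearson case, the first option might look something like the sketch below. It uses base R only and is not how correlation() is implemented internally; the variable names (covars, res_mpg, res_wt, df_adj, ...) are made up for illustration. It residualizes mpg and wt on the remaining columns by hand, then redoes the t test with the df reduced by the number of covariates.

```r
# Covariates: every mtcars column except the two variables of interest
covars <- setdiff(names(mtcars), c("mpg", "wt"))

# Residualize both variables on the covariates
res_mpg <- resid(lm(reformulate(covars, response = "mpg"), data = mtcars))
res_wt  <- resid(lm(reformulate(covars, response = "wt"),  data = mtcars))

# Unadjusted test: treats the residuals as raw data, so df = n - 2
naive <- cor.test(res_mpg, res_wt)
r <- unname(naive$estimate)

# Adjusted test: also subtract the k covariates from the df
n <- nrow(mtcars)
k <- length(covars)
df_adj <- n - 2 - k
t_adj  <- r * sqrt(df_adj / (1 - r^2))
p_adj  <- 2 * pt(abs(t_adj), df = df_adj, lower.tail = FALSE)

data.frame(r = r,
           t_unadjusted = unname(naive$statistic),
           t_adjusted = t_adj,
           df_adjusted = df_adj,
           p_adjusted = p_adj)
```

The adjusted t and p should line up with the wt row of the ppcor::pcor() output above, and the unadjusted t with the correlation() output. The CIs would need the same df correction, and this sketch says nothing about the Bayesian case.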