Description
I noticed that there might be an inconsistency in the Pareto k-values between different approaches for standard importance sampling (SIS):
```r
library(loo)

log_ratios <- -1 * example_loglik_array()
log_ratios <- log_ratios[1:3, , ]
r_eff <- relative_eff(exp(-log_ratios))

# Call psis():
psis_result <- psis(log_ratios, r_eff = r_eff)

# In fact, SIS was used (due to the small number of draws):
lw_sis <- apply(log_ratios, 3, as.vector)
lw_sis <- sweep(lw_sis, 2, apply(lw_sis, 2, matrixStats::logSumExp))
stopifnot(all.equal(weights(psis_result), lw_sis,
                    tolerance = .Machine$double.eps))

# Now request SIS explicitly:
sis_result <- sis(log_ratios, r_eff = r_eff)
# The (log) weights are as expected:
stopifnot(all.equal(weights(sis_result), lw_sis,
                    tolerance = .Machine$double.eps))

# However:
table(pareto_k_values(psis_result))
## Inf
##  32
table(pareto_k_values(sis_result))
##  0
## 32
```
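For clarity, the SIS normalization used in the reprex (the `sweep()`/`logSumExp` combination) is just a numerically stable, column-wise normalization of the log weights. A minimal base-R sketch of the same computation (no loo or matrixStats required; `log_sum_exp` is a local helper, not a package function):

```r
# Numerically stable log-sum-exp; matrixStats::logSumExp computes the same
log_sum_exp <- function(x) {
  m <- max(x)
  m + log(sum(exp(x - m)))
}

# Toy matrix of unnormalized log ratios: draws in rows, observations in columns
lw <- matrix(rnorm(20), nrow = 5, ncol = 4)

# Subtract each column's log-sum-exp so that exp(column) sums to 1
lw_norm <- sweep(lw, 2, apply(lw, 2, log_sum_exp))
stopifnot(all.equal(colSums(exp(lw_norm)), rep(1, ncol(lw))))
```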
The point is that calling `psis()` with a small number of draws will cause the Pareto smoothing not to take place. Instead, SIS is used, as demonstrated above. In that case, the Pareto k-values are `Inf`. When using `sis()` explicitly, the Pareto k-values are `0`.
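Since all Pareto k-values are `Inf` in the fallback case, the fallback can in principle be detected after the fact from the diagnostics alone. A hedged sketch (`smoothing_skipped` is a hypothetical helper, not part of loo):

```r
# Hypothetical helper: per the behavior demonstrated above, psis() returning
# only non-finite Pareto k-values indicates that no smoothing took place.
smoothing_skipped <- function(pareto_k) {
  all(!is.finite(pareto_k))
}

smoothing_skipped(c(Inf, Inf, Inf))  # TRUE: fallback to SIS
smoothing_skipped(c(0.3, 0.8, 0.5)) # FALSE: smoothing took place
```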
Background: In projpred, it is possible (although not encouraged, and in particular not the default behavior) to use PSIS-LOO CV with the search excluded from the CV (`validate_search = FALSE`) and a small number of thinned draws. In principle, projpred could call `sis()` explicitly in such a case (and then either continue with the Pareto k-values, which are all `0`, or skip the Pareto k checks entirely), but that requires catching the "small S" case manually. This is not a problem in itself, but if loo ever changes its "small S" decision rule, projpred's rule would have to be adapted analogously. Using `psis()` would be more straightforward, but then the Pareto k-values are `Inf`, which would trigger warnings in the Pareto k checks.
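One workaround on the projpred side could be to map the non-finite k-values from the `psis()` fallback to `0` before the checks, matching what `sis()` itself reports. A sketch only (`sanitize_pareto_k` is a hypothetical helper, not part of loo or projpred):

```r
# Hypothetical helper: treat non-finite Pareto k-values (the psis() SIS
# fallback demonstrated above) as 0, which is what sis() reports, so the
# usual Pareto k checks do not warn spuriously.
sanitize_pareto_k <- function(pareto_k) {
  pareto_k[!is.finite(pareto_k)] <- 0
  pareto_k
}

sanitize_pareto_k(c(0.4, Inf, Inf))  # -> 0.4 0.0 0.0
```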