Add function: `p_direction()` #90

DominiqueMakowski · 2023-07-29T13:54:56Z

Following the discussion here TuringLang/MCMCChains.jl#428, adding p_direction here defined for a ::AbstractArray{<:Union{Missing,Real}}

codecov · 2023-07-29T15:05:16Z

Codecov Report

Patch coverage has no change and project coverage change: -1.33% ⚠️

Comparison is base (3689384) 96.64% compared to head (acc75b7) 95.32%.

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #90      +/-   ##
==========================================
- Coverage   96.64%   95.32%   -1.33%     
==========================================
  Files          11       12       +1     
  Lines         865      877      +12     
==========================================
  Hits          836      836              
- Misses         29       41      +12

Files Changed	Coverage Δ
src/MCMCDiagnosticTools.jl	`100.00% <ø> (ø)`
src/p_direction.jl	`0.00% <0.00%> (ø)`

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

sethaxen · 2023-07-29T19:07:26Z

Thanks for the PR! However, I disagree with @devmotion that this functionality belongs in this package. Within the ArviZ project at least, we draw a distinction between a diagnostic, which tells us something about the quality of inference using the given inference method (here, some MCMC method) and a statistic, which tells us something about the posterior (or something derived from the posterior such as a posterior predictive distribution). So for example, it would not make sense to include HDI, LOO, WAIC, or LOO-PIT in this package, even though most of these are also implemented in this org for Chains inputs, since these are statistics. Thus I would say the same about the p-direction.

I'm splitting out ArviZ's statistics implementations into their own ArviZStats package (currently a submodule here), and this could potentially belong there. But in ArviZ we're careful to only include methods that have been thoroughly compared with other methods so that we can make informed recommendations about how and when to use them. I've personally never used p-directions before, so I would need to read more about them and in particular see how they are used in practice and how they relate to and perform compared to other notions of Bayesian p-values, e.g. those proposed in Section 6 of BDA3.

yebai · 2023-08-02T22:16:02Z

Maybe we can add an src/contrib to store these less general diagnostics?

devmotion · 2023-08-03T08:13:33Z

Maybe it would be reasonable to not only split out ArvizStats in a standalone package but also generalize it to generic chains as MCMCDiagnosticTools (maybe MCMCStatistics would be a reasonable name), if that's not the case yet?

sethaxen · 2023-08-03T08:49:49Z

Maybe we can add an src/contrib to store these less general diagnostics?

For the reasons in #90 (comment) I don't think this belongs here at all. It's not specific to MCMC, and it is a posterior analysis, not a diagnostic of either an inference method or of a model. This would fall in the category of Bayesian hypothesis testing, which would be a nice package on its own or in a larger package.

Maybe it would be reasonable to not only split out ArvizStats in a standalone package but also generalize it to generic chains as MCMCDiagnosticTools (maybe MCMCStatistics would be a reasonable name), if that's not the case yet?

ArviZStats is already generalized this way. In particular, every function has a method that operates on either arrays of Monte Carlo draws or an output of a previous analysis computed from MC draws. It does have convenience methods for InferenceObjects types, but these will be moved to an InferenceObjects-ArviZStats extension.

Nothing in ArviZStats is MCMC-specific (it depends on this package only to set reff for loo, which for non-MCMC should be estimated to be ~1). It's fine to use it with draws from a variational model or a GP posterior. So a name like "MCMCStats" would be too narrow. I'm certainly open to a more descriptive name though. For me the best reason to use "ArviZStats" as opposed to something more generic like "InferenceStats" or "MonteCarloStats" is that it reduces the expectation we include everything that might fall under that name and makes it easier to restrict our scope to methods widely useful in Bayesian workflows whose properties and pitfalls have been well-established. The corresponding Python package will also be called ArviZ_stats.

DominiqueMakowski · 2023-08-03T08:59:38Z

I must say that I do agree that this type of indices (also including other ones like region of practical equivalence - ROPE) are neither diagnostic indices nor specific to MCMC (and thus don't really fit here).

A package for Turing-model testing would make sense if the goal is to add more indices (including Bayes Factors), but at this stage if it's just for one function it might be overkill. Perhaps it could be taken step by step, i.e., first add that to say MCMCChains, and once there is need & effort to create that additional package we move the appropriate functions there?

sethaxen · 2023-08-03T09:40:58Z

Actually, @DominiqueMakowski, some of the ArviZ plots do use ROPE (e.g. plot_posterior), Bayes factors (plot_bf), and the Bayesian "p-values" in BDA3 (plot_bpv), so as we port these to Julia, we will need Julia implementations of these functions. In Python ArviZ, these are not included in the stats API, but we could include them in ArviZStats. In that case p_direction would make sense to add to ArviZStats (or whatever it is named).

devmotion · 2023-08-03T09:53:17Z

So a name like "MCMCStats" would be too narrow. I'm certainly open to a more descriptive name though. For me the best reason to use "ArviZStats" as opposed to something more generic like "InferenceStats" or "MonteCarloStats" is that it reduces the expectation we include everything that might fall under that name and makes it easier to restrict our scope to methods widely useful in Bayesian workflows whose properties and pitfalls have been well-established.

Maybe something like PosteriorStats would be an alternative that would also fit with e.g. GP posteriors. Generally, I think a descriptive name is fine even if it is a bit general - we could always make it clear in the docs and README what's the intended scope of the package.

sethaxen · 2023-08-03T13:12:47Z

Maybe something like PosteriorStats would be an alternative that would also fit with e.g. GP posteriors.

Done! https://github.com/arviz-devs/PosteriorStats.jl. ~~Will register and then work on moving all InferenceObjects code to an extension.~~ Will move the InferenceObjects code to an extension first, and then register.

yebai · 2023-08-30T09:15:33Z

Closed in favour of arviz-devs/PosteriorStats.jl#10

DominiqueMakowski added 5 commits July 29, 2023 14:54

add p_direction()

5a08b0e

minor format

7f60557

add test

7359de6

change example to jldoctest

df2b52b

add docs

acc75b7

DominiqueMakowski requested a review from devmotion July 29, 2023 15:15

sethaxen mentioned this pull request Aug 8, 2023

Adding functions related to Bayesian hypothesis testing arviz-devs/PosteriorStats.jl#10

Open

yebai closed this Aug 30, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add function: `p_direction()` #90

Add function: `p_direction()` #90

DominiqueMakowski commented Jul 29, 2023

codecov bot commented Jul 29, 2023 •

edited

Loading

sethaxen commented Jul 29, 2023

yebai commented Aug 2, 2023

devmotion commented Aug 3, 2023

sethaxen commented Aug 3, 2023

DominiqueMakowski commented Aug 3, 2023 •

edited

Loading

sethaxen commented Aug 3, 2023

devmotion commented Aug 3, 2023

sethaxen commented Aug 3, 2023 •

edited

Loading

yebai commented Aug 30, 2023

Add function: p_direction() #90

Add function: p_direction() #90

Conversation

DominiqueMakowski commented Jul 29, 2023

codecov bot commented Jul 29, 2023 • edited Loading

Codecov Report

sethaxen commented Jul 29, 2023

yebai commented Aug 2, 2023

devmotion commented Aug 3, 2023

sethaxen commented Aug 3, 2023

DominiqueMakowski commented Aug 3, 2023 • edited Loading

sethaxen commented Aug 3, 2023

devmotion commented Aug 3, 2023

sethaxen commented Aug 3, 2023 • edited Loading

yebai commented Aug 30, 2023

Add function: `p_direction()` #90

Add function: `p_direction()` #90

codecov bot commented Jul 29, 2023 •

edited

Loading

DominiqueMakowski commented Aug 3, 2023 •

edited

Loading

sethaxen commented Aug 3, 2023 •

edited

Loading