VariateForms and ValueSupports #3

Open
@sethaxen

Every distribution in Distributions requires specification of a VariateForm and a ValueSupport. For manifolds, we need to decide which concepts map into each of these types. This is just a rethinking of the types defined in https://github.com/JuliaManifolds/Manifolds.jl/tree/master/ext/ManifoldsDistributionsExt.

Here's a summary of each of these from the docs.

VariateForm

F <: VariateForm specifies the form or shape of the variate or a sample.

The VariateForm subtypes defined in Distributions.jl are:

| Type | A single sample | Multiple samples |
|---|---|---|
| Univariate == ArrayLikeVariate{0} | a scalar number | A numeric array of arbitrary shape, each element being a sample |
| Multivariate == ArrayLikeVariate{1} | a numeric vector | A matrix, each column being a sample |
| Matrixvariate == ArrayLikeVariate{2} | a numeric matrix | An array of matrices, each element being a sample matrix |

The docs don't mention it here, but there's also NamedTupleVariate, for points that are represented with NamedTuples, and CholeskyVariate, for points represented with Cholesky factorizations.

The ArrayLikeVariate forms in Distributions are abstract types, not structs.

ValueSupport

Abstract type that specifies the support of elements of samples.

It is either Discrete or Continuous.

| Type | Default element type | Description | Examples |
|---|---|---|---|
| Discrete | Int | Samples take countably many values | $\{0,1,2,3\}$, $\mathbb{N}$ |
| Continuous | Float64 | Samples take uncountably many values | $[0, 1]$, $\mathbb{R}$ |

Multiple samples are often organized into an array, depending on the variate form.

Discrete and Continuous are structs, not abstract types.
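The abstract-versus-struct distinction matters for the design discussed later in this issue: an abstract type is only a dispatch tag, while a struct can be instantiated and can carry fields (for example, a manifold). A minimal sketch of the difference, using stand-in declarations that mirror the descriptions above and are assumptions for illustration rather than imports from Distributions.jl:

```julia
# Stand-in declarations mirroring the descriptions above; these are
# assumptions for illustration, not imported from Distributions.jl.
abstract type VariateForm end
abstract type ArrayLikeVariate{N} <: VariateForm end
const Univariate = ArrayLikeVariate{0}

abstract type ValueSupport end
struct Continuous <: ValueSupport end

# A variate form declared this way has no instances; it is used only in
# type parameters and method signatures.
isabstracttype(Univariate)     # true

# A struct support can be instantiated, so a new support subtype could
# also store data in fields.
isconcretetype(Continuous)     # true
Continuous() isa ValueSupport  # true
```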

Other notes

The following unofficial interpretations were proposed in JuliaStats/Distributions.jl#224 (comment):

> Actually, having thought more about this, these should be new "support" types (instead of Continuous). My rationale is that Continuous support really means "absolutely continuous with respect to the Lebesgue measure", and Discrete means "absolutely continuous with respect to the counting measure on the integers". These new types would then mean that the distributions are absolutely continuous wrt to the uniform measure on the relevant spaces.

ManifoldsDistributionsExt

The extension in Manifolds defines these types:

"""
    MPointvariate

Structure that subtypes `VariateForm`, indicating that a single sample
is a point on a manifold.
"""
struct MPointvariate <: VariateForm end

"""
    MPointSupport(M::AbstractManifold)

Value support for manifold-valued distributions (values from given
[`AbstractManifold`](@extref `ManifoldsBase.AbstractManifold`)  `M`).
"""
struct MPointSupport{TM<:AbstractManifold} <: ValueSupport
    manifold::TM
end

"""
    MPointDistribution{TM<:AbstractManifold}

An abstract distribution for points on manifold of type `TM`.
"""
abstract type MPointDistribution{TM<:AbstractManifold} <:
              Distribution{MPointvariate,MPointSupport{TM}} end

Rethinking the implementations

VariateForm seems to define expectations about the object representing the input point, i.e. corresponds somewhat to our AbstractManifoldPoint, though our implementations often just use plain arrays.

It would then make sense to either keep the current MPointvariate and interpret it as meaning that the input is any type representing a point on a manifold, with the actual representation determined by constraints on the point arguments to methods, or redefine it as MPointvariate{P<:AbstractManifoldPoint} <: VariateForm and have users fall back to ArrayLikeVariate when the points are plain arrays. I don't like the second option because a given method would require constraints on both the variate form and the point arguments, which increases code complexity.
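To make the trade-off concrete, here is a minimal sketch of both options side by side. All names other than MPointvariate and AbstractManifoldPoint are hypothetical (PointDist, TypedMPointvariate, UnitVector, Demo1, Demo2, accepts), and the abstract types are local stand-ins rather than the ones from Distributions.jl or ManifoldsBase.jl.

```julia
# Local stand-ins (assumptions), not the real package types.
abstract type VariateForm end
abstract type AbstractManifoldPoint end
abstract type PointDist{F<:VariateForm} end   # hypothetical simplified Distribution

# Option 1: a single untyped variate form.
struct MPointvariate <: VariateForm end
# Option 2: the variate form carries the point type (hypothetical name).
struct TypedMPointvariate{P<:AbstractManifoldPoint} <: VariateForm end

# Hypothetical point wrapper and two toy distributions.
struct UnitVector <: AbstractManifoldPoint
    x::Vector{Float64}
end
struct Demo1 <: PointDist{MPointvariate} end
struct Demo2{P<:AbstractManifoldPoint} <: PointDist{TypedMPointvariate{P}} end

# Option 1: the representation is constrained only on the point argument.
accepts(::PointDist{MPointvariate}, ::AbstractVector{<:Real}) = true
accepts(::PointDist{MPointvariate}, ::AbstractManifoldPoint) = true

# Option 2: P must be threaded through both the variate form and the point.
accepts(::PointDist{TypedMPointvariate{P}}, ::P) where {P<:AbstractManifoldPoint} = true

accepts(Demo1(), [1.0, 0.0])
accepts(Demo2{UnitVector}(), UnitVector([1.0, 0.0]))
# Under option 2, plain arrays have no matching method here:
applicable(accepts, Demo2{UnitVector}(), [1.0, 0.0])
```

In this sketch, option 2 has no method for plain arrays unless a separate array-based variate form is also supported, which is the extra complexity described above.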

ValueSupport subtypes should specify both the manifold and base measure. Whether this is implicit or explicit will depend on the feature set we want to implement, e.g. an explicit implementation might look like

abstract type AbstractMeasure{M<:AbstractManifold} <: ValueSupport end

struct HausdorffMeasure{M<:AbstractManifold} <: AbstractMeasure{M}
    manifold::M
end

Then a simple version of VonMisesFisher on just the real n-sphere could be implemented as

struct VonMisesFisher{P,ME<:AbstractMeasure{<:Sphere}} <: Distribution{MPointvariate,ME}
    params::P
    measure::ME
end
function VonMisesFisher(μ::AbstractVector{<:Number}, κ::Number)
    params = (; μ, κ)
    measure = HausdorffMeasure(Sphere(length(μ)-1))
    return VonMisesFisher(params, measure)
end

logpdf(d::VonMisesFisher{<:NamedTuple{(:μ,:κ)},<:HausdorffMeasure}, x::AbstractVector{<:Number}) = ...

Note that this implementation allows for different parameterizations and measures. In principle one could also compute the density wrt the Lebesgue measure on e.g. the spherical coordinates, which would give a different value, but figuring out how one should specify this would be its own issue.
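For experimenting with the idea, here is a self-contained variant of the sketch above that runs without Distributions.jl or Manifolds.jl. The abstract types and Sphere are local stand-ins (assumptions), the element types are narrowed to Real, and logpdf is only filled in for the 2-sphere, where the density with respect to the surface-area measure has the closed-form normalizer κ / (4π sinh κ). It does not check that μ and x are unit vectors.

```julia
using LinearAlgebra: dot

# Local stand-ins (assumptions) for the Distributions.jl / Manifolds.jl types.
abstract type VariateForm end
abstract type ValueSupport end
abstract type Distribution{F<:VariateForm,S<:ValueSupport} end
abstract type AbstractManifold end
struct Sphere <: AbstractManifold
    n::Int   # manifold dimension; the sphere is embedded in R^(n+1)
end
struct MPointvariate <: VariateForm end

abstract type AbstractMeasure{M<:AbstractManifold} <: ValueSupport end
struct HausdorffMeasure{M<:AbstractManifold} <: AbstractMeasure{M}
    manifold::M
end

struct VonMisesFisher{P,ME<:AbstractMeasure{<:Sphere}} <: Distribution{MPointvariate,ME}
    params::P
    measure::ME
end
function VonMisesFisher(μ::AbstractVector{<:Real}, κ::Real)
    return VonMisesFisher((; μ, κ), HausdorffMeasure(Sphere(length(μ) - 1)))
end

# Log-density for the (μ, κ) parameterization w.r.t. the Hausdorff measure.
# Only the 2-sphere is covered; sinh(κ) overflows for very large κ.
function logpdf(
    d::VonMisesFisher{<:NamedTuple{(:μ, :κ)},<:HausdorffMeasure},
    x::AbstractVector{<:Real},
)
    d.measure.manifold.n == 2 ||
        throw(ArgumentError("this sketch only covers the 2-sphere"))
    μ, κ = d.params.μ, d.params.κ
    return log(κ) - log(4π) - log(sinh(κ)) + κ * dot(μ, x)
end

d = VonMisesFisher([0.0, 0.0, 1.0], 2.0)
logpdf(d, [0.0, 0.0, 1.0])
```

A different parameterization or base measure would get its own logpdf method, selected by the type parameters P and ME.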

Concerns

It's probably very common for packages that use manifold distributions to dispatch on the Discrete or Continuous support types and/or ArrayLikeVariate forms, and to constrain inputs to be Real or AbstractArray{<:Real}. I suspect that the above approach would limit how many packages can work with our distributions without us requesting that those packages expand their supported types (see #2).
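The concern can be illustrated with a small sketch. The downstream function describe and the two toy distributions are hypothetical, and the abstract types are local stand-ins rather than the real Distributions.jl and Manifolds.jl types.

```julia
# Local stand-ins (assumptions), not the real package types.
abstract type VariateForm end
abstract type ValueSupport end
abstract type Distribution{F<:VariateForm,S<:ValueSupport} end
abstract type ArrayLikeVariate{N} <: VariateForm end
struct Continuous <: ValueSupport end

abstract type AbstractManifold end
struct Sphere <: AbstractManifold
    n::Int
end
struct MPointvariate <: VariateForm end
struct MPointSupport{TM<:AbstractManifold} <: ValueSupport
    manifold::TM
end

# Two toy distributions (hypothetical).
struct EuclideanDemo <: Distribution{ArrayLikeVariate{1},Continuous} end
struct ManifoldDemo <: Distribution{MPointvariate,MPointSupport{Sphere}} end

# A hypothetical downstream method that only handles array-like, continuous
# distributions, as many packages are likely to do.
describe(::Distribution{<:ArrayLikeVariate,Continuous}) = "supported"

applicable(describe, EuclideanDemo())  # true
applicable(describe, ManifoldDemo())   # false: no matching method
```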
