Description
Every distribution in Distributions requires specification of a `VariateForm` and a `ValueSupport`. For manifolds, we need to decide which concepts map into each of these types. This is just a rethinking of the types defined in https://github.com/JuliaManifolds/Manifolds.jl/tree/master/ext/ManifoldsDistributionsExt.
Here's a summary of each of these from the docs.
VariateForm
`F <: VariateForm` specifies the form or shape of the variate or a sample.
The `VariateForm` subtypes defined in Distributions.jl are:

| Type | A single sample | Multiple samples |
| --- | --- | --- |
| `Univariate == ArrayLikeVariate{0}` | a scalar number | A numeric array of arbitrary shape, each element being a sample |
| `Multivariate == ArrayLikeVariate{1}` | a numeric vector | A matrix, each column being a sample |
| `Matrixvariate == ArrayLikeVariate{2}` | a numeric matrix | An array of matrices, each element being a sample matrix |
The docs don't mention it here, but there's also `NamedTupleVariate`, for points that are represented with `NamedTuple`s, and `CholeskyVariate`, for points represented with `Cholesky` factorizations.
Every `VariateForm` in Distributions is an abstract type, not a struct.
ValueSupport
Abstract type that specifies the support of elements of samples.
It is either `Discrete` or `Continuous`.

| Type | Default element type | Description | Examples |
| --- | --- | --- | --- |
| `Discrete` | `Int` | Samples take countably many values | $\{0,1,2,3\}$, $\mathbb{N}$ |
| `Continuous` | `Float64` | Samples take uncountably many values | $[0, 1]$, $\mathbb{R}$ |

Multiple samples are often organized into an array, depending on the variate form.
`Discrete` and `Continuous` are structs, not abstract types.
Other notes
The following unofficial interpretations were proposed in JuliaStats/Distributions.jl#224 (comment):
Actually, having thought more about this, these should be new "support" types (instead of `Continuous`). My rationale is that `Continuous` support really means "absolutely continuous with respect to the Lebesgue measure", and `Discrete` means "absolutely continuous with respect to the counting measure on the integers". These new types would then mean that the distributions are absolutely continuous wrt to the uniform measure on the relevant spaces.
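
One way to read the quoted rationale is that each support type fixes the reference measure against which the density $f$ is integrated. The following is a sketch in notation introduced here ($P$, $X$, $A$, $f$, and $\mu_M$ do not appear in the linked comment), with $\mu_M$ standing for whatever "uniform" reference measure is chosen on the manifold $M$:

$$
\begin{aligned}
\text{Continuous:} \quad & P(X \in A) = \int_A f(x)\,\mathrm{d}x, \\
\text{Discrete:} \quad & P(X \in A) = \sum_{k \in A} f(k), \\
\text{Manifold } M\text{:} \quad & P(X \in A) = \int_A f(x)\,\mathrm{d}\mu_M(x).
\end{aligned}
$$

In all three cases $f$ is the density of the distribution with respect to the stated measure, so the value of $f$ at a point only has meaning once that measure is fixed.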
ManifoldsDistributionsExt
The extension in Manifolds defines these types:
"""
MPointvariate
Structure that subtypes `VariateForm`, indicating that a single sample
is a point on a manifold.
"""
struct MPointvariate <: VariateForm end
"""
MPointSupport(M::AbstractManifold)
Value support for manifold-valued distributions (values from given
[`AbstractManifold`](@extref `ManifoldsBase.AbstractManifold`) `M`).
"""
struct MPointSupport{TM<:AbstractManifold} <: ValueSupport
    manifold::TM
end
"""
MPointDistribution{TM<:AbstractManifold}
An abstract distribution for points on manifold of type `TM`.
"""
abstract type MPointDistribution{TM<:AbstractManifold} <:
              Distribution{MPointvariate,MPointSupport{TM}} end
Rethinking the implementations
`VariateForm` seems to define expectations about the object representing the input point, i.e. it corresponds somewhat to our `AbstractManifoldPoint`, though our implementations often just use plain arrays.
It would then make sense to do one of two things. We could keep the same `MPointvariate` and interpret it as allowing any type that represents a point on a manifold, with the actual representation being defined by constraints on the point arguments to methods. Or we could redefine an `MPointvariate{P<:AbstractManifoldPoint} <: VariateForm` and have users fall back to `ArrayLikeVariate` if the points will be plain arrays. I don't like the second option because a given method would require constraints on both the variate form and the point arguments, which increases code complexity.
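
To make the trade-off concrete, here is a minimal sketch of how method signatures might look under each option. It is not taken from Manifolds.jl: `VariateForm`, `ArrayLikeVariate`, and `AbstractManifoldPoint` are redefined as stand-ins so the block runs without any packages, and `TypedMPointvariate`, `ToyDistribution`, `PolarPoint`, and `describe` are hypothetical names used only for illustration.

```julia
# Stand-ins for types that really live in Distributions.jl and ManifoldsBase.jl.
abstract type VariateForm end
abstract type ArrayLikeVariate{N} <: VariateForm end
abstract type AbstractManifoldPoint end

# Option 1: one variate form for every point representation.
struct MPointvariate <: VariateForm end

# Option 2 (hypothetical name): the variate form carries the point type.
struct TypedMPointvariate{P<:AbstractManifoldPoint} <: VariateForm end

# Hypothetical distribution and point types, for illustration only.
struct ToyDistribution{F<:VariateForm} end
struct PolarPoint <: AbstractManifoldPoint
    θ::Float64
end

# Option 1: the representation is constrained only by the point argument.
describe(::ToyDistribution{MPointvariate}, x::AbstractVector{<:Real}) = "array point"
describe(::ToyDistribution{MPointvariate}, x::PolarPoint) = "polar point"

# Option 2: the variate form and the point argument must be constrained together,
# and plain arrays need a separate method keyed on ArrayLikeVariate.
describe(::ToyDistribution{TypedMPointvariate{P}}, x::P) where {P<:AbstractManifoldPoint} =
    "typed point"
describe(::ToyDistribution{<:ArrayLikeVariate{1}}, x::AbstractVector{<:Real}) =
    "array point via ArrayLikeVariate"
```

Under option 1 a single distribution type accepts both representations, while under option 2 the same distribution has to be instantiated with a different variate form for each representation.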
`ValueSupport` subtypes should specify both the manifold and base measure. Whether this is implicit or explicit will depend on the feature set we want to implement, e.g. an explicit implementation might look like
abstract type AbstractMeasure{M<:AbstractManifold} <: ValueSupport end
struct HausdorffMeasure{M<:AbstractManifold} <: AbstractMeasure{M}
    manifold::M
end
Then a simple version of `VonMisesFisher` on just the real n-sphere could be implemented as
struct VonMisesFisher{P,ME<:AbstractMeasure{<:Sphere}} <: Distribution{MPointvariate,ME}
    params::P
    measure::ME
end
function VonMisesFisher(μ::AbstractVector{<:Number}, κ::Number)
    params = (; μ, κ)
    measure = HausdorffMeasure(Sphere(length(μ)-1))
    return VonMisesFisher(params, measure)
end
logpdf(d::VonMisesFisher{<:NamedTuple{(:μ,:κ)},<:HausdorffMeasure}, x::AbstractVector{<:Number}) = ...
Note that this implementation allows for different parameterizations and measures. In principle one could also compute the density wrt the Lebesgue measure on e.g. the spherical coordinates, which would give a different value, but figuring out how one should specify this would be its own issue.
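
As an illustration of how the base measure changes the density value, here is a sketch for the 2-sphere only. It uses the standard closed form of the von Mises-Fisher normalizing constant in three dimensions, $C_3(\kappa) = \kappa / (4\pi \sinh\kappa)$, and assumes `μ` and `x` are unit vectors and `κ > 0`. The function names `vmf_logpdf_s2`, `sph2cart`, and `vmf_logpdf_s2_spherical` are hypothetical and are written as plain functions rather than as `logpdf` methods on the types above.

```julia
using LinearAlgebra: dot

# Log-density on S² with respect to the Hausdorff (surface-area) measure.
# Assumes norm(μ) == norm(x) == 1 and κ > 0; neither is checked here.
function vmf_logpdf_s2(μ, κ, x)
    # log(κ / (4π sinh κ)), rearranged so that large κ does not overflow:
    # 4π sinh κ = 2π * exp(κ) * (1 - exp(-2κ))
    lognorm = log(κ) - log(2π) - κ - log1p(-exp(-2κ))
    return lognorm + κ * dot(μ, x)
end

# Colatitude θ in (0, π) and longitude φ in [0, 2π) to a unit vector.
sph2cart(θ, φ) = [sin(θ) * cos(φ), sin(θ) * sin(φ), cos(θ)]

# Log-density of the same distribution with respect to the Lebesgue measure
# on the coordinates (θ, φ): the surface element contributes a factor sin(θ).
vmf_logpdf_s2_spherical(μ, κ, θ, φ) =
    vmf_logpdf_s2(μ, κ, sph2cart(θ, φ)) + log(sin(θ))
```

The two functions describe the same distribution but return different numbers at the same point, which is why the measure needs to be part of the distribution's type or otherwise made explicit.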
Concerns
It's probably very common for packages that use manifold distributions to dispatch on `Discrete` or `Continuous` support types and/or `ArrayLikeVariate` forms and constrain inputs to be `Real` or `AbstractArray{<:Real}`. I suspect that the above approach would limit how many packages can work with our distributions without us requesting that those packages expand their supported types (see #2).
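
The dispatch problem can be reproduced in a few lines. This is a sketch with stand-in types so that it runs without Distributions.jl or Manifolds.jl: the type hierarchy mimics the real one, `MPointSupport` is simplified to have no manifold field, and `ToyNormal`, `ToyManifoldDist`, and `downstream_summary` are hypothetical names.

```julia
# Stand-ins mimicking the Distributions.jl type hierarchy.
abstract type VariateForm end
abstract type ArrayLikeVariate{N} <: VariateForm end
const Univariate = ArrayLikeVariate{0}
abstract type ValueSupport end
struct Continuous <: ValueSupport end
abstract type Distribution{F<:VariateForm,S<:ValueSupport} end

# Stand-ins for the manifold-specific types (manifold field omitted).
struct MPointvariate <: VariateForm end
struct MPointSupport <: ValueSupport end

# Hypothetical distributions.
struct ToyNormal <: Distribution{Univariate,Continuous} end
struct ToyManifoldDist <: Distribution{MPointvariate,MPointSupport} end

# A hypothetical downstream function that only accepts array-like, continuous distributions.
downstream_summary(d::Distribution{<:ArrayLikeVariate,Continuous}) = "supported"

println(downstream_summary(ToyNormal()))                     # prints "supported"
println(applicable(downstream_summary, ToyManifoldDist()))   # prints false
```

Calling `downstream_summary(ToyManifoldDist())` would throw a `MethodError`, so any package written this way would need a new method or a looser signature before it could accept manifold-valued distributions.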