Allow user to choose between safe/unsafe/fastcholesky inverse in Gaussian nodes #76
I am not sure whether this should be a flag for Gaussian nodes; I think it is better to always constrain the covariance/precision matrices. This issue might be better formulated as whether we wish to constrain covariance/precision matrices to always be Hermitian. I am not sure whether we have cases where this does not apply.
Agree with @bartvanerp. However, we can easily encounter situations where the positive-definiteness property is violated.
Another 'easy to spot' example comes from the rule itself:

```julia
@rule typeof(*)(:out, Marginalisation) (m_A::PointMass{<:AbstractVector}, m_in::UnivariateNormalDistributionsFamily) = begin
    a = mean(m_A)
    μ_in, v_in = mean_var(m_in)
    # TODO: check, do we need correction! here? (ForneyLab does not have any correction in this case)
    # TODO: Σ in this rule is guaranteed to be ill-defined: it has rank one and determinant zero
    μ = μ_in * a
    Σ = mul_inplace!(v_in, a * a')
    return MvNormalMeanCovariance(μ, Σ)
end
```

You are almost always guaranteed to end up with a non-Hermitian "covariance" matrix in this case.
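To make the TODO above concrete, here is a minimal standalone sketch (the values of `a` and `v_in` are made up for illustration, not taken from the rule) showing that the outer-product construction yields a singular covariance that a plain Cholesky factorization rejects:

```julia
using LinearAlgebra

# Illustrative values only
a    = [1.0, 2.0]
v_in = 0.5

Σ = v_in .* (a * a')   # rank-one matrix, det(Σ) == 0

isposdef(Σ)             # false: Σ is only positive semi-definite
# cholesky(Σ)           # would throw PosDefException
pinv(Σ)                 # the Moore-Penrose pseudo-inverse still exists
```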
I agree that sometimes we end up with improper distributions, for which the covariance/precision matrix is not positive definite. However, I believe this does not mean that the matrix is not Hermitian. Maybe I am wrong here, but the cholesky function just assumes that the covariance/precision matrix can be represented by its lower/upper triangular part alone. I believe this is what happens in #69.
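As a side note (my own illustration, not from the thread): in Julia's LinearAlgebra, wrapping a matrix in `Symmetric`/`Hermitian` makes `cholesky` read only one triangle and ignore the other:

```julia
using LinearAlgebra

A = [4.0  2.0;
     99.0 3.0]                     # lower triangle deliberately inconsistent

F = cholesky(Symmetric(A, :U))      # only the upper triangle is referenced
F.L * F.L' ≈ [4.0 2.0; 2.0 3.0]     # true: the factorized matrix is the symmetric view
# cholesky(A)                       # a plain dense matrix is checked with ishermitian and would error
```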
@bartvanerp the problem here is the following (this issue came up from our conversation with Albert): there are some matrices that do have a proper inverse, but they are neither symmetric nor Hermitian. In some situations (IMO rare, but anyway) a user might prefer to use a different inverse method if they know the structure of their improper covariance matrix.
I do think, though, that having such an inverse for, let's say, a covariance matrix might make little sense from a mathematical point of view. As a workaround, a user can always define such methods manually.
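A small hypothetical illustration of such a matrix (this example is mine, not from the discussion): a shear matrix is invertible, yet neither symmetric nor Hermitian, so a Cholesky-based inverse does not apply while `inv`/`pinv` still work:

```julia
using LinearAlgebra

M = [1.0 2.0;
     0.0 1.0]       # invertible (det == 1) but not symmetric

issymmetric(M)       # false
inv(M)               # [1.0 -2.0; 0.0 1.0]
# A Cholesky-based inverse is not applicable here, since M has no Cholesky factorization
```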
I am not sure whether I am following entirely here, so it is better to discuss this in person. So then we just use … ?
What do you mean by improper here? If we are dealing with an arbitrary non-symmetric covariance/precision matrix, then we can make it symmetric by just redistributing its elements, without any loss of generality. Then we can call, for example:

```julia
julia> B = [1.0 0.5; 0.5 1.0]
2×2 Matrix{Float64}:
 1.0  0.5
 0.5  1.0

julia> ReactiveMP.cholinv(B) * B
2×2 Matrix{Float64}:
 1.0  -1.11022e-16
 0.0   1.0
```
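For concreteness, the "redistributing its elements" step could look like the sketch below (my own illustration; ReactiveMP may handle this differently internally):

```julia
using LinearAlgebra

A    = [1.0 0.4;
        0.6 1.0]        # non-symmetric input, illustrative values
Asym = (A + A') / 2      # [1.0 0.5; 0.5 1.0], symmetric by construction

issymmetric(Asym)        # true
cholesky(Asym)           # succeeds, so a Cholesky-based inverse is available again
```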
Yes and yes. But neither of these options can be controlled by a user in message update rules.
We recently discussed an associated issue:

```julia
julia> fastcholesky([ 0.0 0.0; 0.0 0.0 ])
LinearAlgebra.Cholesky{Float64, Matrix{Float64}}
L factor:
2×2 LinearAlgebra.LowerTriangular{Float64, Matrix{Float64}}:
 1.0   ⋅
 0.0  1.0
```

The remedy would be to assume that any matrix that is passed to the Gaussian is proper. For non-proper, non-invertible matrices we must implement a special type of matrix.
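Purely as a hypothetical sketch of what such a special matrix type could look like (the name `ImproperCovariance` and the design are illustrative, not part of ReactiveMP):

```julia
# Hypothetical sketch only: a thin wrapper marking a covariance as possibly singular,
# so that downstream code can dispatch on it and pick a pseudo-inverse instead of a
# Cholesky-based inverse.
struct ImproperCovariance{T, M <: AbstractMatrix{T}} <: AbstractMatrix{T}
    data :: M
end

Base.size(S::ImproperCovariance)                        = size(S.data)
Base.getindex(S::ImproperCovariance, i::Int, j::Int)    = S.data[i, j]
```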
This might be a meta flag for Gaussian nodes. For example, by default we might assume a Hermitian structure and use `meta = AssumeHermitian()`, which falls back to `fastcholesky`. In other cases we might prefer `pinv()`, e.g. `meta = AssumeNonHermitian()`.
@bartvanerp, @albertpod
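A rough usage sketch of this proposal (the meta types `AssumeHermitian`/`AssumeNonHermitian` and the `matrix_inverse` helper are hypothetical and do not exist in ReactiveMP; only `ReactiveMP.cholinv` and `LinearAlgebra.pinv` are existing functions):

```julia
using LinearAlgebra
using ReactiveMP

# Hypothetical meta objects, named here only for illustration
struct AssumeHermitian end
struct AssumeNonHermitian end

# A rule could dispatch on the meta object to select the inverse strategy
matrix_inverse(::AssumeHermitian,    A) = ReactiveMP.cholinv(A)  # Cholesky-based inverse
matrix_inverse(::AssumeNonHermitian, A) = pinv(A)                # Moore-Penrose pseudo-inverse
```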