Gradient wrt a sparse matrix is mathematically wrong #1507

Open
@mohamed82008

Description

Currently, Zygote.gradient projects the cotangent returned from the pullback onto the sparsity structure of the input. This is mathematically incorrect when the input is a sparse matrix: a dense matrix and a sparse matrix holding the same values are the same mathematical input, yet according to Zygote the following function has different gradients with respect to them.

using Zygote, SparseArrays

Zygote.gradient(sum, zeros(1,1))[1] != Zygote.gradient(sum, spzeros(1,1))[1] # true
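
To spell out why the two results should agree, here is a short worked derivation. The symbols f, X, S and P_S are notation introduced only for this sketch, and the description of the projection is my reading of the behaviour reported above, not a statement about Zygote's internals.

```latex
% f sums all entries of X, so every partial derivative is 1,
% whether or not the entry is stored explicitly.
f(X) = \sum_{i,j} X_{ij}
\qquad\Longrightarrow\qquad
\frac{\partial f}{\partial X_{ij}} = 1 \quad \text{for all } (i,j).

% For the 1x1 zero matrix, dense or sparse, the gradient is therefore
X = \begin{bmatrix} 0 \end{bmatrix}
\qquad\Longrightarrow\qquad
\nabla f(X) = \begin{bmatrix} 1 \end{bmatrix}.

% Projecting onto the stored-entry pattern S of the sparse input
% (assumed empty for spzeros(1,1)) discards that entry:
P_S\bigl(\nabla f(X)\bigr) = \begin{bmatrix} 0 \end{bmatrix}
\neq \begin{bmatrix} 1 \end{bmatrix},
\qquad S = \varnothing.
```

Under these assumptions the dense call returns the true gradient, while the sparse call returns its projection onto an empty pattern, which is why the comparison above evaluates to true.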
