Open
Description
Currently Zygote.gradient projects the cotangent returned from the pullback so that it has the same sparsity structure as the input. This is mathematically incorrect when the input is a sparse matrix: structural zeros are still ordinary zero values, and the function can have a nonzero partial derivative with respect to them. As a result, Zygote reports different gradients for the same function at inputs that are mathematically equal:
using Zygote, SparseArrays
Zygote.gradient(sum, zeros(1,1))[1] != Zygote.gradient(sum, spzeros(1,1))[1] # true
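To spell out why the two results should agree, here is a short derivation. The symbols $f$, $X$, $S$, and $P_S$ are my own notation for illustration, and the form of the projection is an assumption about what happens (entries outside the stored pattern are simply zeroed), not something taken from the Zygote source.

```latex
% f is `sum`; X is the input matrix, dense or sparse.
f(X) = \sum_{i,j} X_{ij}
\quad\Longrightarrow\quad
\frac{\partial f}{\partial X_{ij}} = 1 \quad \text{for every } (i,j).

% For the 1x1 input X = [0] the gradient is therefore
\nabla f(X) = \begin{bmatrix} 1 \end{bmatrix}.

% Assumed projection onto the stored pattern S of X:
(P_S G)_{ij} =
\begin{cases}
G_{ij} & (i,j) \in S, \\
0      & \text{otherwise}.
\end{cases}

% spzeros(1,1) has no stored entries, so S is empty and
P_{\emptyset}\, \nabla f(X) = \begin{bmatrix} 0 \end{bmatrix}
\neq \begin{bmatrix} 1 \end{bmatrix}.
```

Under that assumption the dense call returns the true gradient, while the sparse call returns its projection, which is why the comparison above evaluates to true.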