Gradient wrt a sparse matrix is mathematically wrong #1507

Currently, `Zygote.gradient` projects the cotangent returned from `pullback` so that it has the same sparsity structure as the input. This is mathematically incorrect when the input is a sparse matrix: a structural zero is still an entry of the matrix, and the function can depend on it. As a result, Zygote reports different gradients for the same function at two inputs that are mathematically identical.

```julia
using Zygote, SparseArrays

Zygote.gradient(sum, zeros(1,1))[1] != Zygote.gradient(sum, spzeros(1,1))[1] # true
```
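For reference, here is a minimal sketch of what I would expect mathematically, checked with central finite differences instead of Zygote. `fd_gradient` is a hypothetical helper written just for this illustration (it is not part of Zygote or SparseArrays), and it always returns a dense matrix so that no projection can happen.

```julia
using SparseArrays

# Hypothetical helper: central finite-difference gradient of a scalar
# function `f` at the matrix `A`. Every entry is perturbed, including
# structural zeros of a sparse input, and the result is stored densely.
function fd_gradient(f, A::AbstractMatrix; h = 1e-6)
    G = zeros(size(A))
    for I in CartesianIndices(A)
        Ap = copy(A); Ap[I] += h   # on a sparse matrix this inserts a stored entry
        Am = copy(A); Am[I] -= h
        G[I] = (f(Ap) - f(Am)) / (2h)
    end
    return G
end

g_dense  = fd_gradient(sum, zeros(2, 2))
g_sparse = fd_gradient(sum, spzeros(2, 2))

# Both inputs represent the same matrix, so the gradients should agree,
# and for `sum` every partial derivative should be one.
@assert g_dense ≈ ones(2, 2)
@assert g_sparse ≈ g_dense
```

Under this reading, the gradient of `sum` is a matrix of ones regardless of how the input is stored, which is what the dense call to `Zygote.gradient` returns.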
