Hi there,
I found inconsistent behaviour when differentiating a function that takes an AbstractCuSparseArray, compared with differentiating the same function on an AbstractSparseArray:
using CUDA
using Zygote
using SparseArrays
logpos(a) = a > 0 ? log(a) : zero(a)
l(A) = sum(logpos.(A))
A = sprandn(Float32, 10, 10, 0.4)
dA = gradient(l, A)[1] # returns a sparse array, as expected
Acu = sparse(CuArray(A))
dAcu = gradient(l, Acu)[1] # returns a dense CuArray
Maybe someone here has an idea of where this could come from?
vboussange changed the title from "CuArray instead of SparseCuArray returned during differentiation" to "Dense instead of sparse matrix returned during differentiation" on Nov 3, 2024
I suspect the issue is with missing rule(s) in ChainRules since Zygote has almost nothing in the way of machinery for diffing sparse arrays. What happens if you set CUDA.allowscalar(false) before running the MWE?
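For reference, a minimal sketch of that check, reusing the definitions from the MWE. The `CUDA.functional()` guard and the `try`/`catch` wrapper are additions for illustration, so the script also runs on a machine without a GPU and reports an error rather than aborting. What it prints will depend on the installed CUDA.jl, Zygote and ChainRules versions.

```julia
using CUDA
using Zygote
using SparseArrays

logpos(a) = a > 0 ? log(a) : zero(a)
l(A) = sum(logpos.(A))

A = sprandn(Float32, 10, 10, 0.4)

if CUDA.functional()
    # Turn scalar indexing of GPU arrays into an error, so that any generic
    # fallback indexing element by element shows up immediately.
    CUDA.allowscalar(false)

    Acu = sparse(CuArray(A))
    try
        dAcu = gradient(l, Acu)[1]
        @info "gradient succeeded" typeof(dAcu)
    catch err
        # An error here would suggest that a generic fallback path is being hit.
        @info "gradient failed with scalar indexing disabled" err
    end
else
    @info "No functional GPU found; skipping the CUDA part of the check"
end
```

If the gradient call throws once scalar indexing is disabled, the stack trace should show which method is falling back to a generic code path.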