Bug description
I've experienced the following inconsistency between GPU and CPU gradient computation for `sum(abs, _)`. The last one is particularly problematic, as it leads to `NaN` values in the gradient that may be hard to understand in a more complex model.
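The code snippets from the original report are not preserved in this extract. The following is a minimal sketch of the kind of comparison described, assuming the problematic "last one" is the GPU gradient of a complex array with zero entries; array sizes, element types, and values are my assumptions.

```julia
using CUDA, Zygote

xr = zeros(Float32, 3)      # real input
xc = zeros(ComplexF32, 3)   # complex input

# CPU and GPU gradients of sum(abs, _); per the report, the GPU/complex case
# is expected to contain NaN at the zero entries, unlike its CPU counterpart.
Zygote.gradient(x -> sum(abs, x), xr)
Zygote.gradient(x -> sum(abs, x), cu(xr))
Zygote.gradient(x -> sum(abs, x), xc)
Zygote.gradient(x -> sum(abs, x), cu(xc))
```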
Slack discussion

On Slack, @mcabbott explained to me the most likely cause for this: `sum(abs, x)` is converted to `sum(abs.(x))`, and the broadcasting part is differentiated via ForwardDiff. Even though DiffRules has a rule for `abs` (used for real inputs), for complex inputs the computation passes through `hypot`, and the DiffRules method for the derivative of `hypot` at `(0, 0)` gives `NaN`.
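To illustrate that last step (this example is mine, not from the thread): the DiffRules partials of `hypot(x, y)` are `x / hypot(x, y)` and `y / hypot(x, y)`, which evaluate to `0/0` at the origin, and ForwardDiff picks these rules up for its dual numbers.

```julia
using ForwardDiff

# abs(x + im*y) == hypot(x, y); differentiating hypot at the origin hits the
# 0/0 in the DiffRules-derived partial, so on the versions reported below this
# is expected to return NaN.
ForwardDiff.derivative(x -> hypot(x, 0.0), 0.0)
```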
Not sure what the best fix is here. If DiffRules is open to it, maybe the easiest is to fix their `hypot` derivative rule?
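For concreteness, here is a hedged sketch of what such a change could look like, written with DiffRules' `@define_diffrule` macro; the exact form, and whether DiffRules wants this at all, is the open question above. Note that ForwardDiff generates its dual-number methods from these rules when the package itself is built, so redefining a rule at runtime would not by itself change the GPU gradient; the change would have to land in DiffRules.

```julia
using DiffRules

# Sketch only: guard the origin so both partials are 0 instead of 0/0 = NaN.
# Promotion of the two argument types is glossed over here.
DiffRules.@define_diffrule Base.hypot(x, y) =
    :(iszero($x) && iszero($y) ? zero($x) : $x / hypot($x, $y)),
    :(iszero($x) && iszero($y) ? zero($y) : $y / hypot($x, $y))
```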
Version info

I'm on Julia 1.10.5, on a fresh environment with

(jl_FHvUua) pkg> st
Status `/tmp/jl_FHvUua/Project.toml`
  [052768ef] CUDA v5.5.2
  [e88e6eb3] Zygote v0.6.71

Somehow it doesn't... Unless I messed something up, I checked out JuliaDiff/ForwardDiff.jl#669 (manually changing the ForwardDiff version number to 0.10) and still get