gradient buffer not declared in mnist test #153
The gradient buffer is commented out and then never used in training. This throws a compilation error when testing.
What version are you using?
julia> @time SimpleChains.train_batched!(
p,
lenetloss,
xtrain4,
SimpleChains.ADAM(3e-4),
10
)
5.798421 seconds (2.23 M allocations: 142.881 MiB, 1.38% gc time, 36.31% compilation time)
44426-element StrideArray{Float32, 1, (1,), Tuple{Int64}, Tuple{Nothing}, Tuple{Static.StaticInt{1}}, Vector{Float32}}:
-0.08823633
0.10486636
0.2803934
-0.18992418
-0.28120536
-0.15369925
0.09791076
-0.024629174
0.36833355
-0.1904188
-0.077544756
0.24522759
-0.22895455
-0.18702294
⋮
-0.3233441
-0.4580964
-0.12954155
0.11518178
0.008558591
0.00065564807
-0.015164471
-0.005364319
-0.02529709
0.038416866
-0.020364538
0.017151155
-0.024268592
0.019850327
julia> if VERSION >= v"1.10"
@test_opt SimpleChains.train_batched!(
p,
lenetloss,
xtrain4,
SimpleChains.ADAM(3e-4),
10
)
end
Test Passed
(@simplechainsm) pkg> st -m SimpleChains
Status `~/.julia/environments/simplechainsm/Manifest.toml`
[de6bee2f] SimpleChains v0.4.6
You can check that tests pass on the main branch.
So maybe more complicated than I thought ;) I am using an AMD CPU, could that be the cause? Output for the patched version of SimpleChains:
(@v1.9) pkg> dev SimpleChains
julia> import SimpleChains
(@v1.9) pkg> test SimpleChains
Platform Info:
This is the output after I remove my patched version of SimpleChains and add the official package back:
(@v1.9) pkg> rm SimpleChains
(@v1.9) pkg> add SimpleChains
(@v1.9) pkg> test SimpleChains
Platform Info:
[333055] signal (11.1): Speicherzugriffsfehler (German for "segmentation fault")
The earliest version where it works is 4.2:
(@v1.9) pkg> add [email protected]
julia> exit()
(@v1.9) pkg> test SimpleChains
Platform Info:
I just tested on the official build of Julia 1.9.3 with SimpleChains 0.4.6 and noticed that the tests fail if I only use one thread. When I start Julia with -t auto -p auto, the tests pass.
I can confirm this is true for me, too. I'll look into it.
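Since the failure reported above depends on whether Julia runs with one thread or several, it may help to record the session's configuration next to any test output. The following is only a minimal sketch using the Julia standard library; the helper name `runtime_config` is hypothetical and not part of SimpleChains.

```julia
using Distributed  # standard library; provides nprocs()

# Hypothetical helper: collect the thread and process counts of this session.
function runtime_config()
    return (threads = Threads.nthreads(), procs = nprocs())
end

cfg = runtime_config()
println("threads = ", cfg.threads, ", processes = ", cfg.procs)

# A plain `julia` start usually gives one thread, which is the configuration
# where the test failure was reported; `-t auto` raises the thread count.
if cfg.threads == 1
    println("single-threaded session")
end
```

Printing this before `pkg> test SimpleChains` makes it easier to compare reports from different machines.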
This might be a LoopVectorization issue. It works without crashing or error when I start Julia with
# using LoopVectorization: matmul_params, @turbo
using LoopVectorization: matmul_params
macro turbo(ex)
esc(ex)
end
macro turbo(ex0, ex1)
esc(ex1)
end
macro turbo(ex0, ex1, ex2)
esc(ex2)
end
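To illustrate what the shim above does, here is a small standalone sketch in plain Julia (LoopVectorization is not loaded). It assumes that a no-op `@turbo` only needs to return the last argument it receives, so annotated loops run as ordinary Julia loops with the usual bounds checking. The function names `dot_plain` and `scale_plain!` are hypothetical and exist only for this example.

```julia
# No-op stand-ins for @turbo: each method returns the loop expression unchanged.
macro turbo(ex)
    esc(ex)
end
macro turbo(opt, ex)  # covers forms with one option, e.g. `@turbo thread=true for ...`
    esc(ex)
end

# With the stand-in macro, this is an ordinary bounds-checked loop.
function dot_plain(a, b)
    s = zero(eltype(a))
    @turbo for i in eachindex(a, b)
        s += a[i] * b[i]
    end
    return s
end

# The option argument is accepted and silently ignored.
function scale_plain!(y, x, c)
    @turbo thread=true for i in eachindex(y, x)
        y[i] = c * x[i]
    end
    return y
end

println(dot_plain([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
println(scale_plain!(zeros(3), [1.0, 2.0, 3.0], 2.0))
```

If a crash disappears under such a stand-in, that points toward the vectorized code path or the memory it is handed, but it does not by itself show which of the two is at fault.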
That works, but is also highly undesirable speed-wise. They explicitly write that they don't do any memory or bounds checking in LoopVectorization.jl. Since it also works when I manually preallocate the gradient buffer, maybe there is an error in the memory allocation in SimpleChains.jl before @turbo is used?
with @turbo (2 threads)
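For reference, manually preallocating the gradient buffer could look roughly like the sketch below. It assumes that `SimpleChains.alloc_threaded_grad` and the `train_batched!` method taking the buffer as its first argument behave as in the SimpleChains MNIST documentation example. The wrapper name `train_with_preallocated_grad!` is hypothetical, and `p`, `loss`, and `x` stand for the `p`, `lenetloss`, and `xtrain4` used earlier in this thread.

```julia
using SimpleChains

# Hypothetical wrapper: allocate the gradient buffer once, then train with it.
# Assumes `loss` is a SimpleChain with a loss layer attached (e.g. `lenetloss`).
function train_with_preallocated_grad!(p, loss, x; epochs = 10)
    G = SimpleChains.alloc_threaded_grad(loss)  # explicit gradient buffer
    SimpleChains.train_batched!(G, p, loss, x, SimpleChains.ADAM(3e-4), epochs)
    return p
end
```

A call such as `train_with_preallocated_grad!(p, lenetloss, xtrain4)` would then replace the buffer-free `train_batched!` call shown at the top of the thread.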
This comment was written before your fix. Bisecting leaves me at this commit of SimpleChains.jl:
I suspected StrideArraysCore, but changing to version 0.4.17 didn't help. All of these seem to work fine with more than one thread.