@@ -62,8 +62,6 @@ tree formed by the model and update the parameters using the gradients.
There is also [`Optimisers.update!`](@ref) which similarly returns a new model and new state,
but is free to mutate arrays within the old one for efficiency.
- The method of `apply!` for each rule is likewise free to mutate arrays within its state;
- they are defensively copied when this rule is used with `update`.
(The method of `apply!` above is likewise free to mutate arrays within its state;
they are defensively copied when this rule is used with `update`.)
For `Adam()`, there are two momenta per parameter, thus `state` is about twice the size of `model`:
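To make these points concrete, here is a minimal sketch (the `Chain` layers and random `image` batch are invented stand-ins for the Flux model used elsewhere on this page): `Optimisers.setup` allocates Adam's two momentum arrays per parameter, `update!` applies one gradient step and may reuse the old arrays, and `Base.summarysize` lets you compare the sizes.

```julia
using Flux, Optimisers

# A stand-in model; these layer sizes are arbitrary.
model = Chain(Dense(128 => 256, relu), Dense(256 => 10))
image = rand(Float32, 128, 32)   # a made-up batch of 32 inputs

# Adam keeps two momentum arrays for every parameter array in `state`.
state = Optimisers.setup(Optimisers.Adam(), model)

# One gradient step; update! is free to mutate arrays in the old model and state.
grad = gradient(m -> sum(m(image)), model)[1]
state, model = Optimisers.update!(state, model, grad)

# The extra momenta make the state tree roughly twice the size of the model.
Base.summarysize(model), Base.summarysize(state)
```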
@@ -87,17 +85,18 @@ Yota is another modern automatic differentiation package, an alternative to Zygote.
Its main function is `Yota.grad`, which returns the loss as well as the gradient (like `Zygote.withgradient`)
but also returns a gradient component for the loss function.
- To extract what Optimisers.jl needs, you can write `_, (_, ∇model) = Yota.grad(f, model, data)`
- or, for the Flux model above:
+ To extract what Optimisers.jl needs, you can write (for the Flux model above):
```julia
using Yota
loss, (∇function, ∇model, ∇image) = Yota.grad(model, image) do m, x
-    sum(m(x))
+    sum(m(x))
end;
- ```
+ # Or else, this may save computing ∇image:
+ loss, (_, ∇model) = grad(m -> sum(m(image)), model);
+ ```
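As a follow-up sketch, the Yota gradient above slots straight into Optimisers.jl; this assumes `model` and `image` are defined as in the Flux example earlier on the page, and `Adam()` is an arbitrary choice of rule:

```julia
using Optimisers, Yota

# Assumes `model` and `image` already exist, as in the Flux example above.
state = Optimisers.setup(Optimisers.Adam(), model)

# Yota.grad returns the loss plus one gradient component per argument (function first).
loss, (_, ∇model) = Yota.grad(m -> sum(m(image)), model)

# The gradient tree from Yota is used like any other with update/update!.
state, model = Optimisers.update!(state, model, ∇model)
```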
## Usage with [Lux.jl](https://github.com/avik-pal/Lux.jl)