
New Gradients ruin everything #2580

Closed
@mposysoev

Description


Dear Flux developers,

I'm not someone who usually leaves feedback or writes comments. However, this is a case where I cannot remain silent.

I want to discuss the changes that occurred after version 0.14.25. Users are now required to supply gradients ONLY in the form of a NamedTuple structure. Previously, it was possible to write a weight-update function roughly like this:

# Old-style update: iterate over Flux.params(model) in parallel with a
# vector of gradient arrays (one array per parameter, matching shapes).
function update_model!(model::Chain, optimizer, loss_gradients::Vector{<:AbstractArray{T}}) where {T <: AbstractFloat}
    for (gradient, parameter) in zip(loss_gradients, Flux.params(model))
        Flux.Optimise.update!(optimizer, parameter, gradient)
    end
    return model
end

In this function, the weights are updated in a loop that iterates directly over the model's parameters. The gradient was represented as a vector of arrays whose shapes matched the corresponding parameter arrays. This is no longer possible, since Flux.Optimise.update! now expects a NamedTuple mirroring the model's structure.
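For comparison, here is a minimal sketch of the NamedTuple-based workflow the new API expects, assuming Flux ≥ 0.14; the model, data, and learning rate are hypothetical placeholders chosen only to show the shapes involved:

```julia
using Flux

# Toy model and data (hypothetical; chosen only to illustrate shapes).
model = Chain(Dense(4 => 8, relu), Dense(8 => 1))
x = rand(Float32, 4, 16)
y = rand(Float32, 1, 16)

loss(m, x, y) = Flux.mse(m(x), y)

# New-style training step: the gradient comes back as a NamedTuple whose
# nesting mirrors the model (e.g. grads.layers[1].weight), and the
# optimiser state is set up per-model rather than per-parameter.
opt_state = Flux.setup(Adam(0.01), model)
grads = Flux.gradient(m -> loss(m, x, y), model)[1]
Flux.update!(opt_state, model, grads)
```

The key difference from the loop above is that there is no flat collection of parameters to zip against: the gradient container has the same tree shape as the model itself.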

I believe this is a major strategic mistake. Gradients are now supposed to be calculated like this: gradient(m -> loss(m, x, y), model). However, this isn't always possible. Sometimes gradients for neural network updates are obtained in completely different ways — for example, in statistical modeling, by accumulating gradients of various other functions at points of interest and then combining them. Performing these operations on NamedTuples is unbearably inconvenient. It also requires rewriting large sections of code and storing gradients in inefficient structures. (I'm willing to bet that a simple array is much faster than a NamedTuple.) I'm aware of the Flux.destructure function, but it doesn't solve the problem.
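For readers hitting the same wall: although the author finds it insufficient, Flux.destructure does enable a flat-vector workflow. A minimal sketch, assuming Flux ≥ 0.13 and a gradient that arrives as a plain vector from some external accumulation (the model, learning rate, and external_grad here are illustrative placeholders):

```julia
using Flux

model = Chain(Dense(4 => 8, relu), Dense(8 => 1))

# destructure flattens all trainable parameters into one vector and
# returns a reconstructor: `re(v)` rebuilds a model from any vector of
# the same length.
flat, re = Flux.destructure(model)

# A gradient accumulated elsewhere as a plain flat vector can be applied
# with ordinary vector arithmetic, bypassing NamedTuples entirely:
external_grad = randn(Float32, length(flat))
η = 0.01f0
flat .-= η .* external_grad   # plain SGD step on the flat vector
model = re(flat)              # rebuild the model with updated weights
```

The cost is a copy of all parameters on every destructure/re round trip, which may or may not matter depending on model size and update frequency.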

When you decided to switch to a more complex structure, you lost a great deal of generality. The simpler the structure, the easier it is to use in different settings. Unfortunately, the recent changes mean that I can no longer use Flux for my research.

P.S.: Perhaps, given enough time, I could make everything work with the new version. But this is not something I want to spend time on, and it is not what one expects from a framework one depends on heavily.


TL;DR: Please keep the old way of working with gradients. Do not deprecate Flux.params(model).
