
Conversation

@MarcoCerino23
Contributor

This PR implements the EIESN Model 2 (Additive Input) variant as discussed in Issue #353.

In this model, the input is added linearly to the reservoir state after the non-linear activation, rather than being passed through the activation function along with the recurrent part.
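For concreteness, a schematic contrast between the two variants (illustrative Julia only, not the library code or the paper's exact equations; g here stands for the scalar input_scale listed below):

```julia
# Schematic only. A: reservoir matrix, Win: input matrix, x: current state, u: input.
A, Win = 0.1 .* randn(50, 50), randn(50, 3)
x, u, g = randn(50), randn(3), 0.5

x_model1 = tanh.(A * x .+ Win * u)         # Model 1: input passed through the activation
x_model2 = tanh.(A * x) .+ g .* (Win * u)  # Model 2: input added after the activation
```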

Changes:

  • Added EIESNAdditiveCell struct in src/layers/eiesn_additive_cell.jl.
  • Added EIESNAdditive model wrapper in src/models/eiesn_additive.jl.
  • Implemented the specific math: $x(t) = \dots + g W_{in} u(t)$.
  • Added input_scale parameter (representing $g$).
  • Added comprehensive tests for both the cell and the model.
  • Updated documentation in layers.md and models.md.
  • Exported new types in ReservoirComputing.jl.

Checklist

  • Appropriate tests were added
  • Any code changes were done in a way that does not break public API
  • All documentation related to code changes was updated
  • The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
  • Any new documentation only uses public API

Additional context

Ref: #353.
This complements the previously merged Model 1 implementation.

@MartinuzziFrancesco
Collaborator

Nice, feels like it's almost ready! Maybe change the names to AdditiveEIESN and AdditiveEIESNCell to maintain the ModificationModel naming style we have going in the library. Also, I don't see a bias. Do they really not use one in the paper?

@MarcoCerino23
Contributor Author

Done.
Naming: I've renamed all structs and files to AdditiveEIESN and AdditiveEIESNCell to match the library style.
Bias: I double-checked the reference paper (Panahi et al., 2025). Specifically, Equation (5) in the preprint explicitly defines the update rule as a linear addition of the input term without any bias vector.

@MartinuzziFrancesco
Collaborator

MartinuzziFrancesco commented Jan 19, 2026

I would add the bias, one for each activation function. Also note that it seems I transcribed the equations wrong in the issue: Eq. (5) in the paper indicates that g is a nonlinear function, not a constant. Additionally, I would make the activation also accept an array/tuple as input, so the user can either use the same function for both of the ones that are now tanh or use different ones. You can add a check similar to the one here.
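Something along these lines for the activation handling (just a sketch; the helper name and exact check are up to you, this is not existing library code):

```julia
# Accept a single activation or a pair and normalize to a 2-tuple,
# erroring out when the wrong number of functions is passed.
function expand_activation(activation)
    acts = activation isa Union{Tuple, AbstractVector} ? Tuple(activation) :
           (activation, activation)
    length(acts) == 2 ||
        throw(ArgumentError("expected one activation or a pair, got $(length(acts))"))
    return acts
end

expand_activation(tanh)              # (tanh, tanh)
expand_activation((tanh, identity))  # (tanh, identity)
```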

@MartinuzziFrancesco
Collaborator

You can just add one use_bias flag and create the three biases in initialparameters, something like the sketch below.
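Sketch of what I mean (the bias names, init_bias, and the use_bias field are placeholders here, not the final implementation; the existing fields are taken from the current initialparameters shown further down):

```julia
# Hypothetical: one use_bias flag on the cell, three bias vectors created here.
function initialparameters(rng::AbstractRNG, cell::EIESNAdditiveCell)
    ps = (
        input_matrix = cell.init_input(rng, cell.out_dims, cell.in_dims),
        reservoir_matrix = cell.init_reservoir(rng, cell.out_dims, cell.out_dims),
    )
    if cell.use_bias
        ps = merge(ps, (
            bias_ex = cell.init_bias(rng, cell.out_dims),
            bias_inh = cell.init_bias(rng, cell.out_dims),
            bias_input = cell.init_bias(rng, cell.out_dims),
        ))
    end
    return ps
end
```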

@concrete struct EIESNAdditiveCell <: AbstractReservoirRecurrentCell

Suggested change:

- @concrete struct EIESNAdditiveCell <: AbstractReservoirRecurrentCell
+ @concrete struct EIESNAdditiveCell <: AbstractEchoStateNetworkCell

Comment on lines 91 to 107

function initialparameters(rng::AbstractRNG, cell::EIESNAdditiveCell)
    ps = (
        input_matrix = cell.init_input(rng, cell.out_dims, cell.in_dims),
        reservoir_matrix = cell.init_reservoir(rng, cell.out_dims, cell.out_dims),
    )
    return ps
end

function initialstates(rng::AbstractRNG, cell::EIESNAdditiveCell)
    return (rng = sample_replicate(rng),)
end

function (cell::EIESNAdditiveCell)(inp::AbstractArray, ps, st::NamedTuple)
    rng = replicate(st.rng)
    hidden_state = init_hidden_state(rng, cell, inp)
    return cell((inp, (hidden_state,)), ps, merge(st, (; rng)))
end

If you subtype the struct as EIESNAdditiveCell <: AbstractEchoStateNetworkCell, these can be removed.

- `rng`: Replicated RNG used to initialize the hidden state when an external
state is not provided.
"""


Delete the space to make the docs build.

- `states_modifiers` — a `Tuple` with states for each modifier layer.
- `readout` — states for [`LinearReadout`](@ref).
"""


Delete the space to make the docs build.

Comment on lines 112 to 113
win = ps.input_matrix
A = ps.reservoir_matrix

Use the ps. fields directly; the aliases don't save much space.

@MartinuzziFrancesco
Collaborator

I would say that we're there; just a couple of finishing touches and I can merge this. Feel free to also up the version in the toml so we can register it as soon as we merge.

@MarcoCerino23
Contributor Author

Thanks for the feedback. Regarding the recurrent part, I noticed that the EIESN formulation requires scaling the matrix product before adding the bias: $\alpha (\mathbf{A}\mathbf{x}) + \mathbf{b}$.
ESNCell uses dense_bias because it computes $\mathbf{W}\mathbf{x} + \mathbf{b}$. To achieve the equivalent result here using dense_bias, I would need to scale the matrix itself beforehand (e.g., passing α .* A as the first argument).
I suspect this might be less efficient (especially with sparse matrices) compared to the current implementation where I scale the resulting vector. Do you agree with keeping the explicit calculation for efficiency, or would you prefer using dense_bias with matrix scaling to clean up the code?
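To make the comparison concrete, a small sketch (dense_bias is stubbed here with the W*x .+ b semantics described above; the library's actual helper may differ):

```julia
using SparseArrays

# Stand-in for the helper discussed above: W*x plus an optional bias.
dense_bias(W, x, ::Nothing) = W * x
dense_bias(W, x, b) = W * x .+ b

A = sprand(300, 300, 0.02)              # sparse reservoir matrix
x, b, α = randn(300), randn(300), 0.7

rec_current = α .* (A * x) .+ b         # current implementation: scale the result vector
rec_scaledA = dense_bias(α .* A, x, b)  # alternative: materializes α .* A on every call

rec_current ≈ rec_scaledA               # true: equivalent math, different cost
```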

@MartinuzziFrancesco
Collaborator

That's a good point. Since the scaling applies only to the reservoir matrix, and we have just a single bias, you could circumvent that by summing the bias into the input projection instead, so something like:

win_inp = dense_bias(ps.input_matrix, inp, bias)
w_state = dense_bias(ps.reservoir_matrix, hidden_state, nothing)
# scale w_state here

@MarcoCerino23
Contributor Author

That's a great observation regarding associativity, and it would work ideally for a standard ESN.
However, there is a structural constraint in the EIESN architecture that prevents this optimization:

  1. Dual biases: Unlike a standard ESN, we have two distinct biases (bias_ex and bias_inh) for the excitatory and inhibitory branches.
  2. Shared input: We calculate the input projection W_in * u once and share it across both branches for efficiency. If I were to use dense_bias to merge bias_ex into the input, the inhibitory branch would receive the wrong bias (or bias_ex + bias_inh). I would effectively have to compute the input projection twice (once with bias_ex, once with bias_inh), which doubles the computational cost of the input layer; see the sketch below.

For AdditiveEIESNCell (Model 2), the input is added outside the non-linearity altogether, so merging the recurrent bias into the input matrix is not even mathematically possible there.
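A rough sketch of the structure I mean (the branch combination, activations, and names here are placeholders, not the paper's exact EIESN equations):

```julia
# Illustrative only; f_ex, f_inh, bias_ex, bias_inh, and α are placeholder names.
function branch_states(Win, A, u, x, bias_ex, bias_inh, α; f_ex = tanh, f_inh = tanh)
    win_u = Win * u                           # input projection computed once, shared
    rec   = α .* (A * x)                      # scaled recurrent part
    ex    = f_ex.(rec .+ bias_ex .+ win_u)    # excitatory branch with its own bias
    inh   = f_inh.(rec .+ bias_inh .+ win_u)  # inhibitory branch with its own bias
    return ex, inh
end
# Folding bias_ex into win_u via dense_bias would leak bias_ex into the inhibitory
# branch as well, so the input projection would have to be computed twice instead.
```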

@MartinuzziFrancesco
Collaborator

I see, that blocks a lot of potential solutions. Then I think your implementation is the right approach; there's probably less overhead from the ifs than there would be from computing the same matmul twice. Just up the version, make sure the docstrings also do not go over 92 cols, and we can merge this!

@MarcoCerino23
Contributor Author

Thanks for the feedback!

@MartinuzziFrancesco linked an issue (Neuromorphic reservoir computing) on Jan 21, 2026 that may be closed by this pull request.
@MartinuzziFrancesco merged commit 2b986a1 into SciML:master on Jan 21, 2026.
18 checks passed.