Why does LSTMCell keep rngs in its state? #4509

JoaoAparicio · 2025-01-28T11:40:35Z

It seems that LSTMCell keeps rngs in its state:

Line 137 in a8a192f

self.rngs = rngs

Is this intentional? Why?

I stumbled upon this because my recipe for checkpointing breaks when my model contains an LSTM:

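from flax import nnx
import orbax.checkpoint as ocp

def savemodel(model, path):
    # Split the module into its static graph definition and its state pytree.
    _, state = nnx.split(model)
    checkpointer = ocp.StandardCheckpointer()
    # This call raises the error below when the state holds typed PRNG keys.
    checkpointer.save(path, state)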

Calling savemodel(model, path) throws:

TypeError: JAX array with PRNGKey dtype cannot be converted to a NumPy array. Use jax.random.key_data(arr) if you wish to extract the underlying integer array.

This was surprising because I have used this recipe before without any problems on models built from other, non-LSTM modules.
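The error message itself points at `jax.random.key_data`. A minimal sketch of that idea, assuming the state can be treated as an ordinary pytree: the helper names `is_typed_key` and `keys_to_data` and the toy `state` dict are hypothetical stand-ins, not part of Flax or Orbax, and this is not necessarily how the maintainers would recommend checkpointing an NNX model.

```python
import jax
import jax.numpy as jnp
import numpy as np

def is_typed_key(x):
    # True for new-style typed PRNG key arrays (the dtype NumPy cannot convert).
    return isinstance(x, jax.Array) and jax.dtypes.issubdtype(x.dtype, jax.dtypes.prng_key)

def keys_to_data(tree):
    # Replace every typed key leaf with its underlying integer array.
    return jax.tree_util.tree_map(
        lambda x: jax.random.key_data(x) if is_typed_key(x) else x, tree
    )

# Toy stand-in for a model state that contains an rng key next to a parameter.
state = {"kernel": jnp.ones((2, 3)), "rngs": {"key": jax.random.key(0)}}

plain = keys_to_data(state)
# The converted leaf is a plain integer array, so NumPy conversion now works.
as_numpy = np.asarray(plain["rngs"]["key"])

# After loading, the raw data can be wrapped back into a typed key.
restored_key = jax.random.wrap_key_data(plain["rngs"]["key"])
```

Whether the same conversion can be applied directly to the state returned by `nnx.split` depends on how that state is structured, so treat this only as an illustration of what the error message suggests.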


cgarciae · 2025-02-03T17:01:16Z

Hi @JoaoAparicio, great question. It's because, in general, the carry initializer might use the random state. In practice it's almost always zeros, but currently we support the general case.
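To illustrate the point about initializers, here is a small sketch in plain JAX rather than the actual LSTMCell code. Every JAX initializer takes a key, but a zeros initializer ignores it while a random one depends on it. The shape `(1, 4)` and the `stddev` value are arbitrary choices for the example.

```python
import jax
import jax.numpy as jnp

key_a, key_b = jax.random.split(jax.random.key(0))
carry_shape = (1, 4)  # arbitrary (batch, hidden) shape for illustration

# Both initializers share the signature init(key, shape, dtype).
zeros_init = jax.nn.initializers.zeros
normal_init = jax.nn.initializers.normal(stddev=0.1)

# The zeros initializer never reads the key, so any key gives the same carry.
zero_a = zeros_init(key_a, carry_shape)
zero_b = zeros_init(key_b, carry_shape)

# A random initializer does read the key, so different keys give different carries.
rand_a = normal_init(key_a, carry_shape)
rand_b = normal_init(key_b, carry_shape)
```

Supporting the second kind of initializer is what requires a source of keys to be available when the carry is created.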
