optim_adam() results differ across platforms even with torch_manual_seed() #1311

Open
@yiqi-liu

Description

Hi there,

I'd like to use optim_adam() for a use case that requires reproducible results across platforms. I set both set.seed() and torch_manual_seed(), but still got different results on macOS (ARM) and Linux (x86_64). Here’s a simple example taken from the optimizer tutorial:

library(torch)
set.seed(0)
torch_manual_seed(0)

### generate training data -----------------------------------------------------

# input dimensionality (number of input features)
d_in <- 3
# output dimensionality (number of predicted features)
d_out <- 1
# number of observations in training set
n <- 100


# create random data
x <- torch_randn(n, d_in)
y <- x[, 1, NULL] * 0.2 - x[, 2, NULL] * 1.3 - x[, 3, NULL] * 0.5 + torch_randn(n, 1)



### define the network ---------------------------------------------------------

# dimensionality of hidden layer
d_hidden <- 32

model <- nn_sequential(
  nn_linear(d_in, d_hidden),
  nn_relu(),
  nn_linear(d_hidden, d_out)
)

### network parameters ---------------------------------------------------------

# for adam, need to choose a much higher learning rate in this problem
learning_rate <- 0.08

optimizer <- optim_adam(model$parameters, lr = learning_rate)

### training loop --------------------------------------------------------------

for (t in 1:200) {
  
  ### -------- Forward pass -------- 
  
  y_pred <- model(x)
  
  ### -------- compute loss -------- 
  loss <- nnf_mse_loss(y_pred, y, reduction = "sum")
  if (t %% 10 == 0)
    cat("Epoch: ", t, "   Loss: ", loss$item(), "\n")
  
  ### -------- Backpropagation -------- 
  
  # Still need to zero out the gradients before the backward pass, only this time,
  # on the optimizer object
  optimizer$zero_grad()
  
  # gradients are still computed on the loss tensor (no change here)
  loss$backward()
  
  ### -------- Update weights -------- 
  
  # use the optimizer to update model parameters
  optimizer$step()
}

# CHECK REPRODUCIBILITY: Print final bias of last layer
sprintf("%.20f", model$parameters$`2.bias`)

Results:

  • macOS (Apple Silicon)

    > sessionInfo()
    R version 4.4.1 (2024-06-14)
    Platform: aarch64-apple-darwin20
    Running under: macOS Sonoma 14.6.1

    > sprintf("%.20f", model$parameters$`2.bias`)
    [1] "0.12490314245223999023"

  • Linux (x86_64)

    > sessionInfo()
    R version 4.4.2 (2024-10-31)
    Platform: x86_64-pc-linux-gnu
    Running under: Ubuntu 24.04.1 LTS

    > sprintf("%.20f", model$parameters$`2.bias`)
    [1] "0.13555328547954559326"

Even though the seed is fixed, results still differ across platforms. Is full cross-platform reproducibility currently feasible with torch in R? Or is there something additional I should configure (e.g., thread settings, environment variables)?
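For reference, these are the kinds of settings I was considering forcing before each run, on the guess that the divergence comes from threading or backend parallelism rather than from the RNG itself (I'm not sure these are the right knobs, or that they are sufficient):

```r
library(torch)

# Guesses at sources of nondeterminism (not confirmed to fix the issue):
torch_set_num_threads(1)            # single-threaded CPU ops
Sys.setenv(OMP_NUM_THREADS = "1")   # restrict OpenMP parallelism in the backend
Sys.setenv(MKL_NUM_THREADS = "1")   # same for MKL, where applicable (x86_64)

# then seed as before
set.seed(0)
torch_manual_seed(0)
```

Even with these, I would expect that differing SIMD code paths between aarch64 and x86_64 could still produce slightly different floating-point results, so I'm mainly asking whether bitwise cross-platform reproducibility is an achievable goal at all.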

Thanks so much for making this amazing package available!
