Description
❓ Question
Hello, and thanks for the awesome automatic hyperparameter search feature. Recently I recorded the hyperparameters reported by the optimization and trained the same algorithm (PPO) on my custom environment (non-deterministic, but seeded with the same np.random.seed) with the same seed passed to train.py. However, the evaluation performance after actual training for the same number of timesteps (57600) is quite different: a mean reward of -2.2e6, versus the -1.7e6 reported during the hyperparameter optimization phase. I would be very grateful if you could point out any steps I might be missing to reproduce the performance reported in the hyperparameter optimization phase.
FYI, the custom environment can be found at: https://github.com/whxru/rl-baselines3-zoo/blob/master/rl_zoo3/aoi_cbu/env_hybrid.py
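
For reference, below is a minimal standalone sketch of how I seed everything and train, outside the zoo's train.py pipeline. The environment class name (`EnvHybrid`), seed value, and hyperparameter values are placeholders, not the exact ones from the optimization report:

```python
import numpy as np

from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy
from stable_baselines3.common.utils import set_random_seed

# Hypothetical class name; the module path matches the linked file.
from rl_zoo3.aoi_cbu.env_hybrid import EnvHybrid

SEED = 0  # placeholder: the same seed that was passed to train.py

# Seed the global RNGs (random, numpy, torch); the custom env draws from np.random.
set_random_seed(SEED)
np.random.seed(SEED)

env = EnvHybrid()  # placeholder: constructor arguments omitted

# Placeholder hyperparameters: in the real run these are the values
# reported by the hyperparameter optimization, not the ones shown here.
model = PPO(
    "MlpPolicy",
    env,
    n_steps=512,
    learning_rate=3e-4,
    seed=SEED,
    verbose=1,
)
model.learn(total_timesteps=57600)  # same budget as during optimization

mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10)
print(f"mean reward: {mean_reward:.3e} +/- {std_reward:.3e}")
```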
Checklist
- I have checked that there is no similar issue in the repo
- I have read the SB3 documentation
- I have read the RL Zoo README
- If code there is, it is minimal and working
- If code there is, it is formatted using the markdown code blocks for both code and stack traces.