Description
🐛 Bug
Given a wrapped env, options passed the recommended way (wrapped_env.set_options) are ignored when the reset is triggered by episode termination / truncation in step_wait of the env wrapper.
A (myopic) fix could be to pass self._options[env_idx] and self._seeds[env_idx] to the linked code above, or to refactor out a single-env resetting function that uses both and is called from DummyVecEnv.step_wait and DummyVecEnv.reset.
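To make the second suggestion more concrete, here is a minimal sketch of the refactor. It is written against a toy stand-in rather than the real DummyVecEnv, so it runs without Stable-Baselines3. ToyVecEnv, EchoEnv and _reset_single are hypothetical names of mine; only the _options / _seeds attributes and the set_options / reset / step_wait method names mirror the ones mentioned above. Whether stored options should be consumed after one use is an assumption on my part, not something I have checked against the library's intended semantics.

```python
from typing import Any, Dict, List, Optional


class EchoEnv:
    """Toy single env: every step ends the episode; reset records its options."""

    def __init__(self) -> None:
        self.seen_options: List[Optional[Dict[str, Any]]] = []

    def reset(self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None):
        self.seen_options.append(options)
        return 0, {}

    def step(self, action: int):
        # Returns obs, reward, terminated, truncated, info.
        return 0, 0.0, True, False, {}


class ToyVecEnv:
    """Hypothetical stand-in for a vectorized wrapper (NOT the SB3 class)."""

    def __init__(self, envs: List[EchoEnv]) -> None:
        self.envs = envs
        self._seeds: List[Optional[int]] = [None] * len(envs)
        self._options: List[Dict[str, Any]] = [{} for _ in envs]

    def set_options(self, options: Dict[str, Any]) -> None:
        self._options = [dict(options) for _ in self.envs]

    def _reset_single(self, env_idx: int):
        # The one place that forwards the stored seed and options to an env.
        options = self._options[env_idx] or None
        obs, info = self.envs[env_idx].reset(seed=self._seeds[env_idx], options=options)
        # Assumption: seeds and options are one-shot and cleared after use.
        self._seeds[env_idx] = None
        self._options[env_idx] = {}
        return obs, info

    def reset(self):
        return [self._reset_single(i)[0] for i in range(len(self.envs))]

    def step_wait(self, actions):
        observations = []
        for i, action in enumerate(actions):
            obs, _reward, terminated, truncated, _info = self.envs[i].step(action)
            if terminated or truncated:
                # Same code path as an explicit reset(), so options are not dropped.
                obs, _ = self._reset_single(i)
            observations.append(obs)
        return observations


inner = EchoEnv()
vec_env = ToyVecEnv([inner])
vec_env.set_options({"opt": 1})
vec_env.step_wait([0])
print(inner.seen_options)  # [{'opt': 1}]
```

With both entry points routed through _reset_single, the automatic reset inside step_wait receives the same options that an explicit reset() would.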
Apologies if this is expected behavior. If so, what is the recommended way to pass reset options that affect the above resetting scenario?
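In the meantime, one workaround I can imagine is to keep the options inside the environment itself, so that every reset sees them regardless of who triggers it. The sketch below is plain Python with hypothetical names (PersistentResetOptions, RecordingEnv); with Gymnasium it would presumably subclass gym.Wrapper, but I have not verified that this is the recommended approach.

```python
from typing import Any, Dict, List, Optional


class RecordingEnv:
    """Toy env that records the options passed to each reset."""

    def __init__(self) -> None:
        self.seen_options: List[Optional[Dict[str, Any]]] = []

    def reset(self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None):
        self.seen_options.append(options)
        return 0, {}

    def step(self, action: int):
        return 0, 0.0, True, False, {}


class PersistentResetOptions:
    """Hypothetical wrapper: merges stored default options into every reset call."""

    def __init__(self, env: RecordingEnv, default_options: Dict[str, Any]) -> None:
        self.env = env
        self.default_options = dict(default_options)

    def reset(self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None):
        # Explicitly passed options take precedence over the stored defaults.
        merged = {**self.default_options, **(options or {})}
        return self.env.reset(seed=seed, options=merged)

    def step(self, action: int):
        return self.env.step(action)


base = RecordingEnv()
wrapped = PersistentResetOptions(base, {"opt": 1})
wrapped.reset()                      # e.g. an automatic reset that passes no options
wrapped.reset(options={"opt": 2})    # explicit options override the default
print(base.seen_options)             # [{'opt': 1}, {'opt': 2}]
```

This sidesteps the vectorized wrapper entirely, at the cost of the options no longer being settable through set_options.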
To Reproduce
import gymnasium as gym
import numpy as np
from stable_baselines3.common.vec_env import DummyVecEnv


# dummy env
class CustomEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.action_space = gym.spaces.Discrete(3)
        self.observation_space = gym.spaces.Box(low=0, high=5, shape=(5,), dtype=np.uint8)

    # step always terminates the episode
    def step(self, action):
        terminated = True
        truncated = False
        return np.zeros(5, dtype=np.uint8), 0, terminated, truncated, {}

    # reset prints the options, if provided
    def reset(self, seed=None, options=None):
        if options is not None:
            print(" -- Options supplied:", options)
        return np.zeros(5, dtype=np.uint8), {}


# make and wrap the env
env = CustomEnv()
env = DummyVecEnv([lambda: env])

# resetting by invoking the wrapper function -- options are passed
print("Resetting by invoking DummyVecEnv.reset() :")
env.set_options({'opt': 1})
env.reset()

# reset by a terminating environment step:
# the wrapper step function calls self.envs[env_idx].reset(), which ignores self._options
print("Resetting by an episode-terminating invokation to DummyVecEnv.step() :")
env.set_options({'opt': 1})
env.step([0])
print("Done.")
Relevant log output / Error message
Resetting by invoking DummyVecEnv.reset() :
-- Options supplied: {'opt': 1}
Resetting by an episode-terminating invokation to DummyVecEnv.step() :
Done.
System Info
- OS: Linux-6.6.7-arch1-1-x86_64-with-glibc2.34 #1 SMP PREEMPT_DYNAMIC Thu, 14 Dec 2023 03:45:42 +0000
- Python: 3.8.15
- Stable-Baselines3: 2.2.1
- PyTorch: 1.13.1+cu117
- GPU Enabled: True
- Numpy: 1.24.1
- Cloudpickle: 2.2.1
- Gymnasium: 0.28.1
Checklist
- My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
- I have checked that there is no similar issue in the repo
- I have read the documentation
- I have provided a minimal and working example to reproduce the bug
- I've used the markdown code blocks for both code and stack traces.