[Bug]: Reset options ignored when resetting due to termination / truncation from within wrapper's step #1790

Open
@npit

🐛 Bug

Given a wrapped env, options set the recommended way (wrapped_env.set_options) are ignored when the reset is triggered by episode termination / truncation inside the wrapper's step_wait.

A (myopic) fix could be to pass self._options[env_idx] and self._seeds[env_idx] to the reset call in the linked code above, or to factor out a single-env reset function that uses both and is shared by DummyVecEnv.step_wait and DummyVecEnv.reset.
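To make the second suggestion concrete, here is a minimal sketch of the shared single-env reset pattern. It deliberately does not use Stable-Baselines3: ToyEnv, ToyVecEnv, _reset_env and last_reset_options are hypothetical stand-ins invented for illustration, and consuming the options after one use on the auto-reset path is an assumption about the desired behavior, not something the library defines.

```python
from typing import Any, Dict, List, Optional


class ToyEnv:
    """Stand-in for a Gymnasium env: every step ends the episode."""

    def __init__(self) -> None:
        self.last_reset_options: Optional[Dict[str, Any]] = None

    def reset(self, seed: Optional[int] = None, options: Optional[Dict[str, Any]] = None):
        self.last_reset_options = options  # record what reset received
        return 0, {}

    def step(self, action: int):
        # obs, reward, terminated, truncated, info
        return 0, 0.0, True, False, {}


class ToyVecEnv:
    """Hypothetical, simplified vectorized wrapper (NOT SB3's DummyVecEnv)."""

    def __init__(self, envs: List[ToyEnv]) -> None:
        self.envs = envs
        self._seeds: List[Optional[int]] = [None] * len(envs)
        self._options: List[Dict[str, Any]] = [{} for _ in envs]

    def set_options(self, options: Dict[str, Any]) -> None:
        self._options = [dict(options) for _ in self.envs]

    def _reset_env(self, env_idx: int):
        """Single-env reset shared by reset() and step()."""
        kwargs = {"options": self._options[env_idx]} if self._options[env_idx] else {}
        obs, info = self.envs[env_idx].reset(seed=self._seeds[env_idx], **kwargs)
        # Assumption: seeds and options are consumed once per env.
        self._seeds[env_idx] = None
        self._options[env_idx] = {}
        return obs, info

    def reset(self):
        return [self._reset_env(i)[0] for i in range(len(self.envs))]

    def step(self, actions):
        observations, dones = [], []
        for i, action in enumerate(actions):
            obs, _reward, terminated, truncated, info = self.envs[i].step(action)
            done = terminated or truncated
            if done:
                info["terminal_observation"] = obs
                obs, _ = self._reset_env(i)  # auto-reset now sees pending options
            observations.append(obs)
            dones.append(done)
        return observations, dones


vec_env = ToyVecEnv([ToyEnv()])
vec_env.set_options({"opt": 1})
vec_env.step([0])
print(vec_env.envs[0].last_reset_options)  # {'opt': 1}
```

Because both code paths go through _reset_env, options set before a terminating step reach the env's reset. Whether the auto-reset should consume the pending options or leave them for the next explicit reset is a design choice the maintainers would need to settle.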

Apologies if this is expected behavior. If so, what is the recommended way to pass reset options so that they also apply in the resetting scenario above?

To Reproduce

from stable_baselines3.common.vec_env import DummyVecEnv
import gymnasium as gym
import numpy as np


# dummy env
class CustomEnv(gym.Env):
    def __init__(self):
        super().__init__()
        self.action_space = gym.spaces.Discrete(3)
        self.observation_space = gym.spaces.Box(low=0, high=5, shape=(5,), dtype=np.uint8)

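    # step always terminates the episode
    def step(self, action):
        obs = np.zeros(5, dtype=np.uint8)  # match the observation space dtype
        terminated, truncated = True, False
        return obs, 0.0, terminated, truncated, {}

    # reset prints the options, if provided
    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        if options is not None:
            print(" -- Options supplied:", options)
        return np.zeros(5, dtype=np.uint8), {}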

# make and wrap the env
env = CustomEnv()
env = DummyVecEnv([lambda: env])

# resetting by invoking the wrapper function -- options are passed
print("Resetting by invoking DummyVecEnv.reset() :")
env.set_options({'opt': 1})
env.reset()

# reset by a terminating environment step:
# the wrapper step function calls self.envs[env_idx].reset(), which ignores self._options
print("Resetting by an episode-terminating invocation of DummyVecEnv.step() :")
env.set_options({'opt': 1})
env.step([0])

print("Done.")

Relevant log output / Error message

Resetting by invoking DummyVecEnv.reset() :
 -- Options supplied: {'opt': 1}
Resetting by an episode-terminating invocation of DummyVecEnv.step() :
Done.

System Info

  • OS: Linux-6.6.7-arch1-1-x86_64-with-glibc2.34 #1 SMP PREEMPT_DYNAMIC Thu, 14 Dec 2023 03:45:42 +0000
  • Python: 3.8.15
  • Stable-Baselines3: 2.2.1
  • PyTorch: 1.13.1+cu117
  • GPU Enabled: True
  • Numpy: 1.24.1
  • Cloudpickle: 2.2.1
  • Gymnasium: 0.28.1

Checklist

  • My issue does not relate to a custom gym environment. (Use the custom gym env template instead)
  • I have checked that there is no similar issue in the repo
  • I have read the documentation
  • I have provided a minimal and working example to reproduce the bug
  • I've used the markdown code blocks for both code and stack traces.

Labels

  • bug
  • documentation
  • help wanted