Description
I am developing a model-based agent that interacts with the environment. A key part of my strategy is to use the environment's built-in baseline controller as a "safe action" whenever my agent's predictive model has high uncertainty. I'm having trouble figuring out how to command the environment to do this for a single step.
What I've Tried
- I see that in the baseline examples, the environment is initialized with actions=[]. In this mode, calling env.step([]) correctly runs the internal baseline controller for the entire episode.
- However, my agent needs to initialize the environment to control a specific variable, for example, actions=['oveHeaPumY_u']. In this mode, when my agent becomes uncertain and I try to pass an empty list [] to env.step(), the simulation crashes with an IndexError inside boptestGymEnv.py.
- I also attempted to pass None to env.step(), which resulted in a TypeError because the code tries to index a NoneType object.
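Both failures are consistent with step() reading one value from the action per configured input. The snippet below is not the code in boptestGymEnv.py; it is a toy stand-in (index_actions is a hypothetical name of my own) that reproduces the same two exception types in plain Python, assuming that indexing pattern:

```python
def index_actions(configured_inputs, action):
    """Hypothetical stand-in: read one value per configured input."""
    return {name: action[i] for i, name in enumerate(configured_inputs)}


configured = ['oveHeaPumY_u']

# An empty list and None both fail once an input is configured.
for bad_action in ([], None):
    try:
        index_actions(configured, bad_action)
    except (IndexError, TypeError) as err:
        print(type(err).__name__)
```

With one configured input, the empty list raises IndexError and None raises TypeError, matching what I see.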
My Question
What is the intended way for an agent to defer control back to the environment's internal baseline controller for just a single timestep?
Advanced controllers (MPC, RL, etc.) commonly need a reliable, state-dependent fallback, so this seems like a general use case. If it isn't currently possible, could you consider supporting it? A simple check such as "if action is not None:" before the action is processed in the step() method would enable this and prevent the crash.
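To make the request concrete, here is a rough sketch of the guard I have in mind. It assumes BOPTEST's convention that each overwrite input <name>_u has a matching <name>_activate flag, and that inputs left out of the advance request remain under the baseline controller. build_overwrite_payload is a hypothetical helper written for illustration, not an existing function in boptestGymEnv.py:

```python
def build_overwrite_payload(configured_inputs, action):
    """Return the inputs to send for one step (hypothetical helper).

    action=None means "defer to the baseline controller for this step",
    so nothing is overwritten and an empty payload is returned.
    """
    if action is None:
        return {}
    if len(action) != len(configured_inputs):
        raise ValueError('expected one value per configured input')

    payload = {}
    for name, value in zip(configured_inputs, action):
        # Assumed naming convention: 'xxx_u' pairs with 'xxx_activate'.
        base = name[:-2] if name.endswith('_u') else name
        payload[name] = float(value)
        payload[base + '_activate'] = 1.0
    return payload


print(build_overwrite_payload(['oveHeaPumY_u'], None))
print(build_overwrite_payload(['oveHeaPumY_u'], [0.5]))
```

The first call prints an empty dict (baseline control for that step) and the second prints the overwrite value together with its activation flag. Raising on a length mismatch keeps an accidental [] from being silently treated as a deferral.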
Thank you for your work on this valuable project and for any help you can provide!