
Commit f9b7f57

Merge pull request #31 from aivarsoo/simmer_saute_refactor
Simmer saute refactor
2 parents: cce61a5 + ba04ece


56 files changed: +252 -4864 lines

README.md

Lines changed: 6 additions & 25 deletions
@@ -10,8 +10,7 @@ Huawei, Noah's Ark Lab.
 - [Bayesian Optimisation with Compositional Optimisers](./CompBO)
 - [AntBO: Antibody Design with Combinatorial Bayesian Optimisation](./AntBO)
 - Reinforcement Learning Research
-  - [Sauté RL: Almost Surely Safe RL Using State Augmentation](./SAUTE)
-  - [SIMMER - Enhancing Safe Exploration Using Safety State Augmentation](./SIMMER)
+  - [Sauté RL and Simmer RL: Safe Reinforcement Learning Using Safety State Augmentation](./SIMMER)
 - [Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief](./PMDB)
 
 Further instructions are provided in the README files associated to each project.
@@ -119,27 +118,11 @@ in vitro experimentation.
 
 # Reinforcement Learning Research
 
-## [Sauté RL: Almost Surely Safe RL Using State Augmentation](./SAUTE/)
+## [Sauté RL and Simmer RL: Safe Reinforcement Learning Using Safety State Augmentation](./SIMMER)
 
-### Sautéing a safe environment
+Codebase associated to: [Sauté RL: Almost Surely Safe RL Using State Augmentation](https://arxiv.org/pdf/2202.06558.pdf) and [Enhancing Safe Exploration Using Safety State Augmentation](https://arxiv.org/pdf/2206.02675.pdf).
 
-Safety state augmentation (sautéing) is done in a straightforward manner. Assume a safe environment is defined in
-a class `MySafeEnv`. The sautéed environment is defined using a decorator `saute_env`, which contains all the
-required definitions. Custom and overloaded functions can be defined in the class body.
-
-```python
-from envs.common.saute_env import saute_env
-
-@saute_env
-class MySautedEnv(MySafeEnv):
-    """New sauteed class."""
-```
-
-Codebase associated to: [Sauté RL: Almost Surely Safe RL Using State Augmentation](https://arxiv.org/pdf/2202.06558.pdf).
-
-##### Abstract
+##### Abstract for Sauté RL: Almost Surely Safe RL Using State Augmentation (ICML 2022)
 
 Satisfying safety constraints almost surely (or with probability one) can be critical for deployment of Reinforcement
 Learning (RL) in real-life applications. For example, plane landing and take-off should ideally occur with probability
@@ -151,12 +134,9 @@ approach has a plug-and-play nature, i.e., any RL algorithm can be "sauteed". Ad
 for policy generalization across safety constraints. We finally show that Saute RL algorithms can outperform their
 state-of-the-art counterparts when constraint satisfaction is of high importance.
 
-## [SIMMER](./SIMMER)
-
-Codebase associated to: [Enhancing Safe Exploration Using Safety State Augmentation](https://arxiv.org/pdf/2206.02675.pdf).
-
-##### Abstract
+##### Abstract for Effects of Safety State Augmentation on Safe Exploration (NeurIPS 2022)
 Safe exploration is a challenging and important problem in model-free reinforcement learning (RL). Often the safety cost
 is sparse and unknown, which unavoidably leads to constraint violations -- a phenomenon ideally to be avoided in
 safety-critical applications. We tackle this problem by augmenting the state-space with a safety state, which is
@@ -168,6 +148,7 @@ Safe exploration is a challenging and important problem in model-free reinforcem
 that simmering a safe algorithm can improve safety during training for both settings. We further show that Simmer can
 stabilize training and improve the performance of safe RL with average constraints.
 
+
 ## [Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief](./PMDB)
 
 Code associated to: [Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief](https://nips.cc/Conferences/2022/Schedule?showEvent=54842) accepted
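The README snippet deleted in the diff above shows the repository's `saute_env` decorator but not what sautéing does at runtime. As an illustration only, here is a minimal self-contained sketch of the mechanism the abstracts describe: the observation is augmented with the remaining safety budget, and the reward is replaced by a penalty once that budget is exhausted. `ToySafeEnv`, `saute`, and every parameter name below are hypothetical, not the repository's actual API.

```python
class ToySafeEnv:
    """Stand-in safe environment: step returns (obs, reward, cost, done)."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return [0.0]

    def step(self, action):
        self.t += 1
        obs = [float(self.t)]
        reward = 1.0
        cost = abs(action)       # safety cost incurred by this action
        done = self.t >= 10
        return obs, reward, cost, done


def saute(safety_budget=1.0, penalty=-1.0):
    """Class decorator sketching safety state augmentation ("sautéing"):
    track a normalised remaining budget, expose it in the observation,
    and reshape the reward when the budget runs out."""
    def wrap(env_cls):
        class Sauted(env_cls):
            def reset(self):
                self.z = 1.0                      # normalised remaining budget
                return super().reset() + [self.z]

            def step(self, action):
                obs, reward, cost, done = super().step(action)
                self.z -= cost / safety_budget    # spend budget
                if self.z <= 0.0:                 # budget exhausted:
                    reward = penalty              # replace reward by a penalty
                return obs + [self.z], reward, cost, done
        return Sauted
    return wrap


@saute(safety_budget=5.0)
class SautedToyEnv(ToySafeEnv):
    """Sautéed version of the toy environment."""
```

Any off-the-shelf RL algorithm can then be trained on `SautedToyEnv` without modification, which is the plug-and-play property the ICML 2022 abstract refers to.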

SAUTE/.gitignore

Lines changed: 0 additions & 140 deletions
This file was deleted.

SAUTE/README.md

Lines changed: 0 additions & 114 deletions
This file was deleted.

SAUTE/common/__init__.py

Whitespace-only changes.

SAUTE/common/argument_parser.py

Lines changed: 0 additions & 18 deletions
This file was deleted.
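The NeurIPS 2022 abstract in the README diff above describes "simmering" as controlling the safety budget during training rather than fixing it. A minimal sketch of that idea, assuming a simple linear ramp from an initial to the target budget; the paper's actual controller is adaptive, so this function is illustrative only.

```python
def simmer_schedule(step, total_steps, start_budget, target_budget):
    """Linearly anneal the safety budget fed to the sautéed environment
    from start_budget to target_budget over total_steps training steps."""
    frac = min(step / total_steps, 1.0)  # clamp once training exceeds the ramp
    return start_budget + frac * (target_budget - start_budget)
```

At each training iteration the sautéed environment would be re-parameterised with `simmer_schedule(step, ...)` instead of the final budget, so constraint pressure is introduced gradually.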
