Caution Parameters for Reinforcement Learning in Safety-Critical Settings (CARL).


This project was created as part of the course “Advanced Topics in Reinforcement Learning” at TU Berlin in the winter semester 2022/23. The project is outlined in my blog post. This repository is a slightly adapted version of the original code for the work of Zhang et al. (2020), Cautious Adaptation for RL in Safety-Critical Settings (CARL); in addition, CPU multiprocessing is enabled.

Installation

Clone this repository with git clone https://github.com/Safe-RL-Team/CARL-params.git. In order to experiment on MuJoCo environments, you must have MuJoCo 200 installed and an appropriate MuJoCo license linked. See the MuJoCo website to download and set up MuJoCo 200. On Ubuntu, some extra packages may have to be installed first: sudo apt install -y libosmesa6-dev libgl1-mesa-glx libglfw3 patchelf.
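A typical setup (this is a sketch assuming the mujoco-py default paths; adjust the archive and key names to your download) unpacks the binaries to ~/.mujoco/mujoco200, places the license key at ~/.mujoco/mjkey.txt, and exports the library path:

mkdir -p ~/.mujoco && unzip mujoco200_linux.zip -d ~/.mujoco
mv ~/.mujoco/mujoco200_linux ~/.mujoco/mujoco200
cp mjkey.txt ~/.mujoco/mjkey.txt
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco200/bin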

To install the required base packages, simply run pip install -r requirements.txt (tested with Python 3.7.12).
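For isolation, you may want to install into a Python 3.7 virtual environment first (the environment name below is illustrative):

python3.7 -m venv carl-env
source carl-env/bin/activate
pip install -r requirements.txt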

Running Experiments with Different Caution Parameters

Experiments for CARL State or CARL Reward over a range of caution parameters and target domains can be run using the following flags:

python exp_caution_params.py --CARL <State|Reward> --min_caution <min_caution> --max_caution <max_caution> --ncaution_params <ncaution_params>

Here is an example that runs CARL State, looping over the caution parameter lambda_2 in {0.5, 1, 1.5, 2} and target domains with pole lengths {1, 2}, using a pretrained model from log/ex_dir:

python exp_caution_params.py --CARL State --min_caution 0.5 --max_caution 2 --ncaution_params 4 --min_td 1 --max_td 2 --ntds 2 --pretrain_dir log/ex_dir
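The flags above suggest the swept caution values are evenly spaced between --min_caution and --max_caution. A minimal illustration of that assumption, reproducing the set from the example:

```python
import numpy as np

# With --min_caution 0.5 --max_caution 2 --ncaution_params 4,
# an even grid yields the caution parameters {0.5, 1.0, 1.5, 2.0}:
caution_params = np.linspace(0.5, 2.0, 4)
print(caution_params)  # [0.5 1.  1.5 2. ]
```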

Results will be saved in log/<date+time of experiment start>_<caution_param>_td_<test_domain>/. Trial data will be contained in log/<date+time of experiment start>_<caution_param>_td_<test_domain>/-tboard. You can run tensorboard --logdir <logdir> to visualize the results.

Directory Structure

- config/ contains the configuration files; modify these Python files to change environment/model/training parameters. The ensemble model class is also located here.
- env/ contains the gym environment files.
- MPC.py contains the training and acting code for the MPC controller.
- optimizers.py contains the optimizers (CEM) used for optimizing actions with MPC (see the sketch after this list).
- MBExperiment.py contains the training, adaptation, and testing loop code.
- Agent.py contains the logic for interacting with the environment via model-based planning and for collecting samples.
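To make the optimizer's role concrete, here is a minimal sketch of the cross-entropy method (CEM) for action-sequence optimization in MPC. It is not the repository's actual implementation; the function name, signature, and cost interface are illustrative assumptions.

```python
import numpy as np

def cem_optimize(cost_fn, act_dim, horizon, pop_size=400, num_elites=40,
                 max_iters=5):
    """Optimize a flattened action sequence of length act_dim * horizon."""
    sol_dim = act_dim * horizon
    mean, var = np.zeros(sol_dim), np.ones(sol_dim)

    for _ in range(max_iters):
        # Sample candidate action sequences from the current Gaussian.
        samples = np.random.randn(pop_size, sol_dim) * np.sqrt(var) + mean
        samples = np.clip(samples, -1.0, 1.0)  # assume normalized actions

        # Rank candidates by cost and keep the lowest-cost elites.
        costs = np.array([cost_fn(s) for s in samples])
        elites = samples[np.argsort(costs)[:num_elites]]

        # Refit the sampling distribution to the elites.
        mean, var = elites.mean(axis=0), elites.var(axis=0)

    return mean  # the first act_dim entries are the action to execute

# Toy usage: minimize squared distance to a fixed target sequence.
if __name__ == "__main__":
    target = np.full(2 * 5, 0.3)
    best = cem_optimize(lambda s: np.sum((s - target) ** 2),
                        act_dim=2, horizon=5)
    print(best[:2])  # first action of the optimized sequence
```

In an MPC loop, only the first action of the optimized sequence is executed, and the optimization is rerun at the next step from the new state.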
