Comparing different methods (CPO, Lyapunov, own sPPO) for safe continuous-state RL as part of EE-618 course at EPFL
Sergei Volodin. Swiss Federal Institute of Technology in Lausanne (EPFL)
Course project for Theory and Methods of Reinforcement Learning, EE-618 at EPFL
We consider CPO, sDQN, PPPO, and a random agent. See our report for more details.
Tested on Ubuntu 16.04.5 LTS with 12 CPUs, 60 GB of RAM, and 2x NVIDIA GeForce 1080 GPUs.
git clone https://github.com/sergeivolodin/SafeContinuousStateRL.git; cd SafeContinuousStateRL
conda env create -f environment.yml
run_all.sh generates output/*.output files and output/figures/*.pdf files, and writes run information to run_*.txt
Use the analyze_run.ipynb notebook to produce figures
experiment.py is the main file containing one experiment (loading the agent, training, computing metrics)
saferl.py defines the ConstrainedEnvironment and ConstrainedAgent abstract classes, as well as helpers and make_safe_env, the function that creates a safe environment
sppo.py implements Projected Proximal Policy Optimization
baselines.py implements CPO and a random agent
cartpole_safety_sdqn.ipynb is the (non-working) implementation of sDQN
config.py contains the parameters of the experiment
helpers.py contains the functions for run analysis
tf_helpers.py contains some helper functions using TensorFlow
costs.py implements costs for environments
cartpole_safety_a2c.ipynb implements an (unsafe) A2C
tfshow.py embeds a TF graph into a Jupyter notebook, from StackOverflow
create_run.py creates the .sh script from config.py
analyze_run.ipynb analyzes the output produced by training (the .sh script) and writes results to run_*.txt and figures to output/figures
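To illustrate the kind of interface that abstract classes such as ConstrainedEnvironment and ConstrainedAgent could describe, here is a minimal sketch. It is not the code from saferl.py: the method names (reset, step, act), the four-element return value of step, and the ToyLineEnv, RandomAgent and run_episode helpers are all assumptions made for this example. The only idea it takes from the description above is that a constrained environment reports a safety cost alongside the reward.

```python
from abc import ABC, abstractmethod
import random


class ConstrainedEnvironment(ABC):
    """Hypothetical interface: an environment that also reports a safety cost."""

    @abstractmethod
    def reset(self):
        """Start a new episode and return the initial state."""

    @abstractmethod
    def step(self, action):
        """Apply the action and return (state, reward, cost, done)."""


class ConstrainedAgent(ABC):
    """Hypothetical interface: a policy acting in a constrained environment."""

    @abstractmethod
    def act(self, state):
        """Return an action for the given state."""


class ToyLineEnv(ConstrainedEnvironment):
    """Toy 1-D walk (assumption for illustration): cost 1 whenever |x| > limit."""

    def __init__(self, limit=2.0, horizon=10):
        self.limit = limit
        self.horizon = horizon
        self.x = 0.0
        self.t = 0

    def reset(self):
        self.x = 0.0
        self.t = 0
        return self.x

    def step(self, action):
        self.x += action
        self.t += 1
        reward = action  # moving right is rewarded
        cost = 1.0 if abs(self.x) > self.limit else 0.0  # leaving the safe region is penalized
        done = self.t >= self.horizon
        return self.x, reward, cost, done


class RandomAgent(ConstrainedAgent):
    """Picks a step of -1 or +1 uniformly at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1.0, 1.0])


def run_episode(env, agent):
    """Roll out one episode and return (total reward, total cost)."""
    state = env.reset()
    total_reward, total_cost, done = 0.0, 0.0, False
    while not done:
        state, reward, cost, done = env.step(agent.act(state))
        total_reward += reward
        total_cost += cost
    return total_reward, total_cost
```

Keeping reward and cost as separate return values lets a safe-RL method constrain the accumulated cost while maximizing the accumulated reward, rather than folding both into a single scalar.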
output/*.sh files consist of many lines of the form python ../experiment.py --param1 v1 --param2 v2 ..., running at most 16 processes in total (8 per GPU)
output/*.output files contain the outputs of experiment.py (one run corresponds to one file)
output/figures contains generated figures
run_setting.sh runs a particular setting (create + .sh + analyze) and writes data to a file
run_all.sh runs all settings
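The command lines of the form python ../experiment.py --param1 v1 --param2 v2 ... can be produced by expanding a grid of parameter values. The sketch below shows one way to do that; it is not the code from create_run.py, and the make_commands helper, the dictionary format of the grid, and the parameter names agent and seed are assumptions made for this example.

```python
import itertools
import shlex


def make_commands(grid, script="../experiment.py"):
    """Return one command line per combination of parameter values in the grid."""
    names = sorted(grid)  # fixed ordering so the output is reproducible
    commands = []
    for values in itertools.product(*(grid[name] for name in names)):
        args = ["python", script]
        for name, value in zip(names, values):
            args += ["--" + name, str(value)]
        # shlex.quote guards against values containing spaces or shell metacharacters
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


# Hypothetical parameter names and values, for illustration only
grid = {"agent": ["cpo", "random"], "seed": [0, 1, 2]}
commands = make_commands(grid)

for line in commands:
    print(line)
```

Each printed line could then be written to a .sh file, one run per line.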