SafeLife Benchmark Experiments

First runs and observations on the SafeLife benchmark. Made by Stacey Svetlichnaya using Weights & Biases.

Goal: Measure & improve safety in reinforcement learning

How can we train a reinforcement learning agent to be safe, that is, to minimize side effects in the most general sense and without explicitly enumerating them, while still accomplishing its goals?

Training videos (scroll for more)

The videos above capture a variety of agents after a short training period across the three task types (append, prune, and navigate).
You can hover over and scroll inside the panel to show more videos. While some of these early attempts are impressive (especially the "prune" agents), many agents get stuck, oscillate between two states, or make many unnecessary moves. This is very much an unsolved problem with plenty of headroom for improvement!

Launching runs

Follow the setup instructions on the SafeLife benchmark page, then run:
python3 start-training.py --wandb --steps 1000 test_run
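
If you want to queue up several short runs in a row, here is a minimal sketch that reuses only the flags shown above; the run names and step counts are arbitrary, and any other flags should be checked against the SafeLife README before use.

# Launch a few short SafeLife training runs back to back, varying only the
# step budget and the run name. Uses only the flags from the command above.
import subprocess

for steps in (1_000, 10_000, 100_000):
    run_name = f"test_run_{steps}"
    subprocess.run(
        ["python3", "start-training.py", "--wandb", "--steps", str(steps), run_name],
        check=True,  # stop early if a run fails rather than silently continuing
    )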

Game play: train/validate, then benchmark

Each agent plays on three types of levels: training levels, validation levels, and benchmark levels.

Key metrics to track

The line plots below show these metrics over the course of training (note that the x-axis for these needs to be training/steps rather than the default Steps, which counts wandb.log calls). The bar charts report the final averages from the benchmark levels. Below the charts, you can click on the individual run set tabs to show or hide each group of agents by task type (append, prune, or navigate) independently. Note that scores are not directly comparable across task types. After some quick tests, I tried DQN instead of PPO (all worse), then tried modifying some of the PPO hyperparameters.
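
To get that x-axis by default, you can declare training/steps as the step metric when initializing the run. A minimal sketch, assuming a recent wandb client with define_metric support; the project name and the episode_reward key are only illustrative:

import wandb

run = wandb.init(project="safelife-benchmark")  # project name is illustrative
wandb.define_metric("training/steps")
wandb.define_metric("*", step_metric="training/steps")  # plot everything against env steps

# Log metrics together with the current step so all charts share the same x-axis
wandb.log({"training/steps": 1000, "episode_reward": 3.5})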

Compare Reinforcement Learning Tasks

3 types: append (blue), prune (green), navigate (orange)

Group by task type to see average metrics

Append (blue) is hardest but as safe as prune (green)
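
If you would rather pull the final averages programmatically than read them off the bar charts, here is a rough sketch using the public W&B API; the entity/project path and the summary key are placeholders, so check your own run pages for the exact names.

import wandb

api = wandb.Api()
for run in api.runs("my-entity/safelife-benchmark"):  # placeholder entity/project path
    score = run.summary.get("benchmark/score")  # placeholder summary key
    print(f"{run.name:30s}  {score}")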

Evaluating on 6M steps

Below are three baseline runs, each trained from the starter code for 6 million time steps. Note that the score is not directly comparable across tasks. Some initial observations:

Next steps

SafeLife is a general environment for benchmarking safety in reinforcement learning, offering many possible directions for further research. Some ideas you could try next:
We hope you find this benchmark fun and useful. Please comment below if you have any questions or feedback.