
Complete Mountain Car Online Experiments

Created on April 23 | Last edited on September 10

Experiment 1: Regular DQN, Large Catastrophe Zone, Vary Penalty Values

Experiment 1A: Penalty Value = -4


[Line chart: average_score (run set: 0)]


Experiment 1B: Penalty Value = -6

Experiment 1C: Penalty Value = -8

Experiment 2: Duplicating Interventions, Large Catastrophe Zone

Experiment 3: Experiment 1 with Longer Period of Training

Experiment 4: Experiment 2 Sweep with Penalty = -4

Experiment 5: Parameter that Controls Proportion of Human Actions

Experiment 6: Appending Interventions

Experiment 7: Learned Blocker

Experiment 8: Replacing Robot Interventions with Human Actions

Experiment 8A: Learned Blocker; 50% Probability of Replacement


Run sets: Regular (0), Control (0)


Experiment 8B: Perfect (Algo) Blocker


Run sets: Regular (0), Control (0)


Experiment 9: Learned Blocker (with only State)