Shortlisted Intervention RL Experiments

Created on November 27 | Last edited on February 5

Environment: Pong

Human Oversight Phase: 120,000 Time-steps

Environment: Mountain Car

Human Oversight Phase: 200,000 Time-steps; Steps 64; Batch Size 256

Human Oversight Phase: 500,000 Time-steps; Steps 64; Batch Size 256

Human Oversight Phase: 500,000 Time-steps; Steps 16; Batch Size 32

Human Oversight Phase: 300,000 Time-steps; Steps 64; Batch Size 256

Human Oversight Phase: 300,000 Time-steps; Steps 16; Batch Size 32

Human Oversight Phase: (New)

Environment: Lunar Lander

Human Oversight Phase: 20,000 Time-steps

Human Oversight Phase: 500,000 Time-steps

Human Oversight Phase: 500,000 Time-steps (New)

Human Oversight Phase: 800,000 Time-steps (New)

Human Oversight Phase: 1,000,000 Time-steps (New)

Environment: Breakout

Human Oversight Phase

Environment: Space Invaders

Human Oversight Phase: 120,000 Time-steps

Human Oversight Phase: 80,000 Time-steps

Human Oversight Phase: 60,000 Time-steps

Environment: Safety

Hyper-parameter Sweep

Hyper-parameter Sweep

Hyper-parameter Sweep

Hyper-parameter Sweep

Hyper-parameter Sweep 9

Hyper-parameter Sweep 10

Hyper-parameter Sweep 11

Hyper-parameter Sweep 12

Hyper-parameter Sweep 13

Environment: Half Cheetah

Environment: Pick Place

Template