Shortlisted Intervention RL Experiments
Created on November 27 | Last edited on February 5
Environment: Pong
Human Oversight Phase: 120,000 Time-steps
Environment: Mountain Car
Human Oversight Phase: 200,000 Time-steps; Steps 64; Batch Size 256
Human Oversight Phase: 500,000 Time-steps; Steps 64; Batch Size 256
Human Oversight Phase: 500,000 Time-steps; Steps 16; Batch Size 32
Human Oversight Phase: 300,000 Time-steps; Steps 64; Batch Size 256
Human Oversight Phase: 300,000 Time-steps; Steps 16; Batch Size 32
Human Oversight Phase: (New)
Environment: Lunar Lander
Human Oversight Phase: 20,000 Time-steps
Human Oversight Phase: 500,000 Time-steps
Human Oversight Phase: 500,000 Time-steps (New)
Human Oversight Phase: 800,000 Time-steps (New)
Human Oversight Phase: 1,000,000 Time-steps (New)
Environment: Breakout
Human Oversight Phase
Environment: Space Invaders
Human Oversight Phase: 120,000 Time-steps
Human Oversight Phase: 80,000 Time-steps
Human Oversight Phase: 60,000 Time-steps
Environment: Safety
Hyper-parameter Sweep
Hyper-parameter Sweep
Hyper-parameter Sweep
Hyper-parameter Sweep
Hyper-parameter Sweep 9
Hyper-parameter Sweep 10
Hyper-parameter Sweep 11
Hyper-parameter Sweep 12
Hyper-parameter Sweep 13
Environment: Half Cheetah
Environment: Pick Place
Template