
Train policy

Created on June 23 | Last edited on July 12

[Figure: line charts over Time (minutes) and Step; panels include objective/kl_coef, objective/scores, and objective/kl.]
torch adam 1e-5
1
openai
40
openai2
torch adam 5e-4
1
torch adam 5e-4 batch_size=64
1



Ours
3
openai
1
openai2