descriptiveness
Costa
Created on July 16 | Last edited on August 10
study batch size
Figure: line charts, x-axis Step (ticks from 500 to 2k), one panel per metric group:
objective/kl, ppo/objective/kl
objective/kl_coef, ppo/objective/kl_coef
objective/scores, ppo/objective/score, objective/score
ppo/objective/score_total, objective/score_total (empty panel: no selected run logged ppo/objective/score_total)
ppo/ppo/policy/approxkl, ppo/policy/approxkl_avg
ppo/ppo/policy/clipfrac, ppo/policy/clipfrac_avg
Run sets (name: run count, where shown):
batch_size=64: 1
batch_size=128: 1
batch_size=256: 1
batch_size=512: 1
batch_size=512, adam 1e-5: 1
oai: 10
batch_size=512, adam 3e-5: 1
oai 2
batch_size=512, adam 8e-5: 1
batch_size=512 sgd: 1
batch_size=512, adam 1e-4: 1
batch_size=512, 2e-4: 1
batch_size=512 init_kl_coef=0.2: 1
batch_size=512 init_kl_coef=0.25: 2
batch_size=512, adam 8e-5, init_kl_coef=0.25: 1
batch_size=512, adam 4e-5, init_kl_coef=0.3: 1
Run set 17: 1
bigger KL penalty: 1
Run set 19: 4
Run set 20: 10
Run set 21: 18
new adam: 8
tensorflow-style adam: 10
tensorflow-style adam 2: 10
tensorflow-style adam gpt2-medium: 10
gpt2-large: 1
gpt2: 10
gpt2-xl: 10
gpt2-xl PT: 10
gpt2-medium: 20
gpt2-large: 20
Run set 31: 0
gpt2 PT: 10
Run set 34: 31
gpt2 fix clip val: 10
alt clip val fix: 10
Run set 37: 1
gpt2 fix clip val: 10
Run set 39: 10
Run set 40: 10