Lvwerra's workspace
Runs: 1

Column                        Value
Name
State                         Finished
Notes
User                          lvwerra
Tags
Created
Runtime                       3h 12m 1s
Sweep                         -

batch_size                    256
cliprange                     0.2
cliprange_value               0.2
cls_model_name                lvwerra/bert-imdb
forward_batch_size            16
gamma                         1
horizon                       10000
init_kl_coef                  0.2
lam                           0.95
lm_name                       lvwerra/gpt2-imdb
lr                            0.0000141
ppo_epochs                    4
ref_lm_name                   lvwerra/gpt2-imdb
steps                         25600
target                        6
tk_name                       gpt2
txt_in_len                    5
txt_out_len                   15
vf_coef                       0.1

env/reward_mean               3.22827
env/reward_std                1.57719
objective/entropy             40.0892
objective/kl                  10.72071
objective/kl_coef             0.31874
ppo/loss/policy               -0.089755
ppo/loss/total                0.0079188
ppo/loss/value                0.97674
ppo/mean_non_score_reward     -3.41715
ppo/policy/advantages_mean    1.3333e-8
ppo/policy/approxkl           0.72372
ppo/policy/clipfrac           0.50111
ppo/policy/entropy            2.92883
ppo/policy/policykl           -0.13081
ppo/returns/mean              1.46849
ppo/returns/var               1.36592
ppo/val/clipfrac              0.26647
ppo/val/error                 1.69509
ppo/val/mean                  1.70844
ppo/val/var                   1.21722
ppo/val/var_explained         -0.24099
ppo/val/vpred                 1.65932
time/build_input_sentiment    0.10658
time/epoch                    113.9246
time/get_response             9.59736
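
The logged values can be cross-checked against each other. The sketch below is a minimal sanity check, not the training code: it copies a few numbers from the run and tests three relations that are assumptions about how the metrics were computed (the export itself does not state them). The assumptions are that the total loss is the policy loss plus `vf_coef` times the value loss, that the mean non-score reward is the KL penalty `-kl_coef * kl`, and that `steps` counts samples, so `steps / batch_size` gives the number of optimization batches. The variable names `config`, `summary`, `total_loss`, `kl_penalty`, `n_batches` and `est_runtime_s` are illustrative.

```python
import math

# Hyperparameters copied from the run's config columns.
config = {
    "batch_size": 256,
    "steps": 25600,
    "vf_coef": 0.1,
}

# Final logged values copied from the run's metric columns.
summary = {
    "objective/kl": 10.72071,
    "objective/kl_coef": 0.31874,
    "ppo/loss/policy": -0.089755,
    "ppo/loss/value": 0.97674,
    "ppo/loss/total": 0.0079188,
    "ppo/mean_non_score_reward": -3.41715,
    "time/epoch": 113.9246,
}

# Assumption: total loss = policy loss + vf_coef * value loss.
total_loss = summary["ppo/loss/policy"] + config["vf_coef"] * summary["ppo/loss/value"]

# Assumption: the non-score reward is the KL penalty, -kl_coef * kl.
kl_penalty = -summary["objective/kl_coef"] * summary["objective/kl"]

# Assumption: `steps` counts samples, so this is the number of batches.
n_batches = config["steps"] // config["batch_size"]

# Rough runtime estimate in seconds. time/epoch is only the last logged
# value, not an average, so this is an order-of-magnitude check.
est_runtime_s = n_batches * summary["time/epoch"]

print("total loss (recomputed vs logged):", total_loss, summary["ppo/loss/total"])
print("KL penalty (recomputed vs logged):", kl_penalty, summary["ppo/mean_non_score_reward"])
print("batches:", n_batches, "estimated runtime [s]:", est_runtime_s)
```

Under these assumptions the recomputed total loss and KL penalty agree with the logged values to within rounding, and the estimated runtime is close to the reported 3h 12m 1s (11521 s).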