Fix heads dtype
ZeRO stage 3 (zero3) now works for ILQL; no changes for PPO.
Created on January 23 | Last edited on January 23
Panels (charts not captured): metrics/sentiments, reward/mean
Run set (3 visualized; rows 1-10 of 2,224 shown)

| # | State | User | Tags | Runtime | distributed.gradient_accumulation_steps | distributed.gradient_clipping | distributed.mixed_precision | distributed.num_gpus | distributed.offload_optimizer_device | distributed.offload_param_device | distributed.zero_stage | method.alpha |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Finished | sorry | main/84a0711/2023-01-18 | 1m 53s | 1 | 1 | bf16 | 2 | none | none | 2 | - |
| 2 | Finished | sorry | | 8m 55s | 1 | 1 | bf16 | 2 | none | none | 3 | 0.001 |
| 3 | Finished | sorry | | 1m 54s | 1 | 1 | bf16 | 2 | none | none | 2 | - |
| 4 | Finished | sorry | | 42s | 1 | 1 | bf16 | 2 | none | none | 3 | 0.1 |
| 5 | Finished | sorry | | 35s | 1 | 1 | bf16 | 2 | none | none | 3 | 0.1 |
| 6 | Failed | sorry | | 12s | 1 | 1 | bf16 | 2 | none | none | 3 | - |
| 7 | Failed | sorry | | 13s | 1 | 1 | fp16 | 2 | none | none | 3 | - |
| 8 | Finished | sorry | | 39s | 1 | 1 | fp16 | 2 | none | none | 3 | 0.1 |
| 9 | Failed | sorry | | 12s | 1 | 1 | fp16 | 2 | none | none | 3 | 0.1 |
| 10 | Failed | sorry | | 7s | 1 | 1 | fp16 | 2 | none | none | 3 | 0.1 |

Columns with no value in any of these ten rows are omitted: Name, Notes, Created, Sweep, accelerate, accelerate_config_path, alpha, awac_scale, batch_size, beta, betas, checkpoint_dir, checkpoint_interval, chunk_size, cliprange, cliprange_value, cql_scale, device, epochs, eval_interval, gamma, gen_kwargs.do_sample, gen_kwargs.max_length, gen_kwargs.min_length, gen_kwargs.top_k, gen_kwargs.top_p, gen_size, grad_clip, horizon, init_kl_coef, initial_learning_rate, input_size, lam, learning_rate_init, learning_rate_target, log_interval, lr_decay_steps, lr_init, lr_ramp_steps, lr_target.
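The ten runs listed above can be tallied by precision and ZeRO stage to see at a glance which combinations finished. This is a minimal sketch: the tuples are transcribed from the State, distributed.mixed_precision, and distributed.zero_stage columns, and the names `RUNS` and `tally` are illustrative, not part of trlx or Weights & Biases. The table does not record which method (ILQL or PPO) each run used, so the tally says nothing about that.

```python
from collections import Counter

# (state, mixed_precision, zero_stage) for the ten listed runs, in table order.
RUNS = [
    ("Finished", "bf16", 2),
    ("Finished", "bf16", 3),
    ("Finished", "bf16", 2),
    ("Finished", "bf16", 3),
    ("Finished", "bf16", 3),
    ("Failed", "bf16", 3),
    ("Failed", "fp16", 3),
    ("Finished", "fp16", 3),
    ("Failed", "fp16", 3),
    ("Failed", "fp16", 3),
]


def tally(runs):
    """Count runs per (mixed_precision, zero_stage, state) combination."""
    counts = Counter()
    for state, precision, stage in runs:
        counts[(precision, stage, state)] += 1
    return counts


counts = tally(RUNS)
for (precision, stage, state), n in sorted(counts.items()):
    print(f"{precision} zero_stage={stage} {state}: {n}")
```

In this sample, both ZeRO stage 2 runs finished, while stage 3 runs are split between finished and failed under both bf16 and fp16. Ten rows out of 2,224 are too few to draw conclusions from on their own.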
Created with ❤️ on Weights & Biases.
https://wandb.ai/sorry/trlx/reports/Fix-heads-dtype--VmlldzozMzk3OTM3