different base models

Created on September 13 | Last edited on September 18

[Figure: four line-plot panels. Two panels use global_step on the x-axis (0 to 1.5k); two use Time (minutes) on the x-axis (0 to 500).]

Legend:
exp_name: train_policy_accelerate, ppo.gradient_accumulation_steps: 1, base_model: EleutherAI/pythia-160m (Run set)
exp_name: train_policy_accelerate, ppo.gradient_accumulation_steps: 1, base_model: cerebras/Cerebras-GPT-111M (Run set)
exp_name: train_policy_accelerate, ppo.gradient_accumulation_steps: 64, base_model: gpt2 (Run set)
exp_name: train_policy_accelerate, ppo.gradient_accumulation_steps: 1, base_model: gpt2 (Run set)
exp_name: train_policy_accelerate (Run set 2)
Run set: 19
Run set 2: 5 (5 visualized)

Column                            Row 1                     Row 2
Name
State                             Finished                  Finished
Notes
User                              costa-huang               costa-huang
Tags
Created
Runtime                           9h 19m 57s                18m 51s
Sweep                             -                         -
base_model                        tiiuae/falcon-rw-1b       tiiuae/falcon-rw-1b
batch_size                        -                         32
cuda                              true                      true
debug_normalize                   -                         0
deepspeed                         false                     -
distributed                       -                         -
eps                               -                         0.00001
exp_name                          train_policy_accelerate   train_reward_accelerate
global_learner_decices            -                         -
gradient_accumulation_steps       -                         1
label_dataset                     -                         descriptiveness/offline_5k.json
labels.num_labels                 -                         4
labels.num_train                  -                         4992
learner_device_ids                -                         -
learner_devices                   -                         -
local_batch_size                  -                         4
local_micro_batch_size            -                         4
local_normalize_samples           -                         256
local_rank                        -                         -
local_rollout_batch_size          -                         512
lr                                -                         0.00005
normalize_after                   -                         true
normalize_before                  -                         true
normalize_samples                 -                         -
ppo.batch_size                    512                       -
ppo.cliprange                     0.2                       -
ppo.cliprange_value               0.2                       -
ppo.eps                           0.00001                   -
ppo.gamma                         1                         -
ppo.gradient_accumulation_steps   1                         -
ppo.lam                           0.95                      -
ppo.local_batch_size              64                        -
ppo.local_micro_batch_size        64                        -
ppo.local_mini_batch_size         64                        -
ppo.lr                            0.00001                   -
ppo.minibatch_size                512                       -
ppo.nminibatches                  1                         -
ppo.noptepochs                    4                         -
ppo.num_updates                   1953                      -
ppo.total_episodes                1000000                   -

Rows 1-2 of 2
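
The ppo.* sizes logged for the train_policy_accelerate run appear to be related by simple integer arithmetic. The sketch below reproduces the logged values under two assumptions that the report does not state: a world size of 8 processes (inferred from ppo.batch_size / ppo.local_batch_size = 512 / 64), and the usual division-based relations between batch, minibatch, and micro-batch sizes. The function name `derive_ppo_sizes` and the helper `exact_div` are hypothetical names chosen for illustration, not taken from the training script.

```python
def exact_div(a: int, b: int) -> int:
    """Integer division that fails loudly if b does not divide a."""
    q, r = divmod(a, b)
    if r != 0:
        raise ValueError(f"{a} is not divisible by {b}")
    return q


def derive_ppo_sizes(total_episodes: int,
                     local_batch_size: int,
                     world_size: int,  # ASSUMPTION: not logged in the report
                     nminibatches: int,
                     gradient_accumulation_steps: int) -> dict:
    """Derive the dependent ppo.* sizes from the independent ones."""
    batch_size = local_batch_size * world_size
    local_mini_batch_size = exact_div(local_batch_size, nminibatches)
    return {
        "ppo.batch_size": batch_size,
        "ppo.minibatch_size": exact_div(batch_size, nminibatches),
        "ppo.local_mini_batch_size": local_mini_batch_size,
        "ppo.local_micro_batch_size": exact_div(
            local_mini_batch_size, gradient_accumulation_steps),
        # Floor division: leftover episodes do not fill a whole update.
        "ppo.num_updates": total_episodes // batch_size,
    }


# Independent values copied from the train_policy_accelerate row.
sizes = derive_ppo_sizes(
    total_episodes=1_000_000,
    local_batch_size=64,
    world_size=8,  # assumed, see lead-in
    nminibatches=1,
    gradient_accumulation_steps=1,
)
for key, value in sizes.items():
    print(f"{key}: {value}")
```

Under these assumptions the derived values agree with the logged row: 1,000,000 episodes at 512 episodes per update give 1953 full updates, with 64 episodes left over. The same assumed world size of 8 is also consistent with the train_reward_accelerate row, where batch_size is 32 and local_batch_size is 4.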