Chandanms's workspace
Runs (5)

Values that differ between runs:

| model_id | Runtime | train_loss | val_loss |
|---|---|---|---|
| llama-35M | 1h 29m 29s | 1.50963 | 1.5456 |
| llama-30M | 1h 15m 36s | 1.54267 | 1.57164 |
| llama-11M | 45m 54s | 1.71437 | 1.73661 |
| llama-5M | 50m 3s | 1.86298 | 1.88238 |
| llama-1.25M | 27m 58s | 2.29854 | 2.32478 |

Values identical across all 5 runs:

| Column | Value |
|---|---|
| State | Finished |
| Notes | - |
| User | chandanms |
| Tags | (blank) |
| Created | (blank) |
| Sweep | - |
| batch_size | 64 |
| compile | true |
| dtype | bfloat16 |
| flash_attention | true |
| grad_clip | 1 |
| inference_only | false |
| intermediate_checkpoints | true |
| learning_rate | 0.0001 |
| learning_rate_decay_frac | 0.1 |
| num_iterations | 60000 |
| output_dir | /home/chandan/simple_stories_train/out |
| sample_every | 1000 |
| tensorcores | true |
| total_batch_size | 32768 |
| train_dataset_config.column_name | story |
| train_dataset_config.is_tokenized | false |
| train_dataset_config.n_ctx | 512 |
| train_dataset_config.name | SimpleStories/SimpleStories |
| train_dataset_config.seed | 0 |
| train_dataset_config.split | train |
| train_dataset_config.streaming | false |
| train_dataset_config.tokenizer_file_path | tokenizer/simplestories-tokenizer.json |
| train_log_every | 100 |
| val_dataset_config.column_name | story |
| val_dataset_config.is_tokenized | false |
| val_dataset_config.n_ctx | 512 |
| val_dataset_config.name | SimpleStories/SimpleStories |
| val_dataset_config.seed | 0 |
| val_dataset_config.split | test |
| val_dataset_config.streaming | false |
| val_dataset_config.tokenizer_file_path | tokenizer/simplestories-tokenizer.json |
| val_loss_every | 100 |
| val_max_steps | 20 |
| wandb_project | simplestories-v2 |
| warmup_iters | 600 |
| weight_decay | 0.1 |
| zero_stage | 0 |
| lr | 0.00001 |
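The logged values allow two quick consistency checks, sketched below in Python. Both rest on assumptions that the table itself does not confirm: that total_batch_size is counted in tokens and the runs used a single process, and that the learning-rate schedule decays to learning_rate * learning_rate_decay_frac by the final iteration. The variable names mirror the table columns; tokens_per_micro_batch, grad_accum_steps, and expected_final_lr are hypothetical helper names introduced only for illustration.

```python
import math

# Values copied from the runs table (identical across all 5 runs).
batch_size = 64                 # sequences per micro-batch
n_ctx = 512                     # tokens per sequence
total_batch_size = 32768        # ASSUMPTION: counted in tokens
learning_rate = 1e-4
learning_rate_decay_frac = 0.1
logged_final_lr = 1e-5          # the "lr" column

# ASSUMPTION: a single process, so one micro-batch holds batch_size * n_ctx tokens.
tokens_per_micro_batch = batch_size * n_ctx
grad_accum_steps = total_batch_size // tokens_per_micro_batch

# ASSUMPTION: the schedule ends at learning_rate * learning_rate_decay_frac.
expected_final_lr = learning_rate * learning_rate_decay_frac

print("tokens per micro-batch:", tokens_per_micro_batch)
print("gradient accumulation steps:", grad_accum_steps)
print("final lr matches logged lr:", math.isclose(expected_final_lr, logged_final_lr))
```

Under these assumptions, one micro-batch already fills total_batch_size, so no gradient accumulation would be needed, and the logged lr is consistent with the configured decay fraction.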