Rohit8y's workspace
Runs: 2
| Field | Run 1 | Run 2 |
| --- | --- | --- |
| State | Failed | Crashed |
| Notes | - | - |
| User | thecr7guy3 | thecr7guy3 |
| Tags | | |
| Created | | |
| Runtime | 8h 2m 45s | 13h 25m 31s |
| Sweep | - | - |
| batch_size | 8 | 8 |
| cache_dir | ./data/ | ./data/ |
| checkpoint_dir | ./checkpoints | ./checkpoints |
| context_len | 1024 | 1024 |
| dataset_name | andersonbcdefg/cc-stories-parquet | andersonbcdefg/cc-stories-parquet |
| epochs | 16 | 16 |
| grad_clip | 1 | 1 |
| log_interval | 100 | 100 |
| max_lr | 0.0003 | 0.0006 |
| min_lr | 0.00003 | 0.00006 |
| n_embd | 768 | 768 |
| n_heads | 12 | 12 |
| n_layer | 12 | 12 |
| seed | 7 | 1337 |
| seq_len | 1024 | 1024 |
| vocab_size | 50304 | 50304 |
| wandb_entity | training-transformers-vast | training-transformers-vast |
| wandb_project | gpt2-sai | gpt2-sai |
| warmup_steps | 16384 | 16384 |
| weight_decay | 0.1 | 0.1 |
| epoch/average_loss | 3.67769 | 7.13092 |
| epoch/checkpoint_path | checkpoints/epoch_9.pt | checkpoints/epoch_11.pt |
| epoch/number | 9 | 11 |
| epoch/perplexity | 39.55481 | 1250.02851 |
| epoch/time_seconds | 2248.29682 | 4378.4111 |
| epoch/tokens_per_sec | 85844.32367 | 44080.72147 |
| performance/step_time_ms | 85.66952 | 182.50823 |
| performance/tokens_per_sec | 95623.27695 | 44885.64702 |
| progress/epoch | 10 | 12 |
| progress/step | 212540 | 259860 |
| progress/total_tokens | 1741135872 | 2128781312 |
| train/grad_norm | 1.6408 | 144.94458 |
| train/learning_rate | 0.00014639 | 0.00018875 |
| train/loss | 3.66022 | 7.10964 |
| train/perplexity | 38.86972 | 1223.70239 |
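The logged values above look internally consistent with two simple relations: perplexity = exp(loss), and total_tokens = (step + 1) × batch_size × seq_len. The following is a minimal sketch that checks both against the numbers copied from the two runs. That the training script computes the metrics this way is an assumption inferred from the values, not something the workspace states. The labels `run_1` and `run_2` and the helper `check_run` are hypothetical names used only for illustration.

```python
import math

# Values copied from the two runs listed above.
# "run_1" / "run_2" are illustrative labels, not W&B run names.
runs = {
    "run_1": {
        "batch_size": 8, "seq_len": 1024,
        "step": 212540, "total_tokens": 1741135872,
        "epoch_loss": 3.67769, "epoch_ppl": 39.55481,
        "train_loss": 3.66022, "train_ppl": 38.86972,
    },
    "run_2": {
        "batch_size": 8, "seq_len": 1024,
        "step": 259860, "total_tokens": 2128781312,
        "epoch_loss": 7.13092, "epoch_ppl": 1250.02851,
        "train_loss": 7.10964, "train_ppl": 1223.70239,
    },
}


def check_run(r, rel_tol=1e-4):
    """Return (perplexity_ok, tokens_ok) for one run's logged values.

    Assumed relations (inferred from the data, not confirmed by the source):
      perplexity   = exp(loss)
      total_tokens = (step + 1) * batch_size * seq_len
    """
    # Loose tolerance because the logged losses are rounded to 5 decimals.
    ppl_ok = (
        math.isclose(math.exp(r["epoch_loss"]), r["epoch_ppl"], rel_tol=rel_tol)
        and math.isclose(math.exp(r["train_loss"]), r["train_ppl"], rel_tol=rel_tol)
    )
    tokens_ok = (r["step"] + 1) * r["batch_size"] * r["seq_len"] == r["total_tokens"]
    return ppl_ok, tokens_ok


if __name__ == "__main__":
    for name, r in runs.items():
        print(name, check_run(r))
```

Both checks pass for both runs under these assumptions, which suggests the step counter is zero-based and that each step processes one batch of 8 sequences of 1024 tokens.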