Drscotthawley's workspace
Runs (2)

Field           Run 1            Run 2
State           Finished         Crashed
Notes           (blank)          (blank)
User            drscotthawley    drscotthawley
Tags            (blank)          (blank)
Created         (blank)          (blank)
Runtime         14h 5m 31s       1d 3h 33m 23s
Sweep           -                -
batch_size      256              256
bias            false            false
dropout         0.1              0.5
epochs          60               50
learning_rate   0.001            0.001
n_blocks        8                8
n_embd          256              256
n_heads         16               16
seq_length      128              128
vocab_sizes     [128,501,501]    [128,501,501]
weight_decay    0.01             0.04
epoch           60               16
step            61500            16169
train           1.63724          2.00416
train_ema       1.65325          2.02307
val             1.90125          2.03811
val_ema         1.88867          2.04675