Awni00's workspace
Runs: 50
Name
50 visualized
State
Notes
User
Tags
Created
Runtime
Sweep
model_config.activation
model_config.bias
model_config.d_model
model_config.dropout_rate
model_config.max_block_size
model_config.n_heads
model_config.n_heads_ra
model_config.n_heads_sa
model_config.n_layers
model_config.norm_first
model_config.pos_enc_type
model_config.ra_kwargs.n_kv_heads
model_config.ra_kwargs.n_relations
model_config.ra_kwargs.rel_activation
model_config.ra_kwargs.rel_proj_dim
model_config.ra_kwargs.symmetric_rels
model_config.ra_kwargs.use_relative_positional_symbols
model_config.ra_type
model_config.sa_kwargs.n_kv_heads
model_config.share_attn_params
model_config.symbol_retrieval
model_config.symbol_retrieval_kwargs.d_model
model_config.symbol_retrieval_kwargs.max_rel_pos
model_config.symbol_retrieval_kwargs.n_heads
model_config.symbol_retrieval_kwargs.n_symbols
model_config.symbol_retrieval_kwargs.symbol_dim
model_config.symbol_retrieval_kwargs.trainable_symbols
model_config.symbol_retriever_config.shared_symbol_retriever
model_config.symbol_retriever_config.weight_tie_symbol_library
model_config.use_flash_attention
model_config.vocab_size
model_summary.Estimated total size (MB)
model_summary.Forward/backward pass size (MB)
model_summary.Input size (MB)
model_summary.Params size (MB)
model_summary.Total Mult-Adds
model_summary.num_params
model_summary.num_trainable_params
model_summary.total_params
model_summary.trainable_params
optimizer_config.betas
optimizer_config.grad_clip
optimizer_config.learning_rate
optimizer_config.max_lr
Finished
-
awni00
resumed run
15h 25m 50s
-
gelu
false
1536
0
1024
-
12
12
24
true
RoPE
6
64
identity
12
false
-
relational_attention
6
false
symbolic_attention
1536
-
8
1024
-
false
true
false
-
50304
7294.72008
4061.13485
0.004176
3233.58106
862694400
735453696
733880832
812720640
811147776
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
19h 38m 3s
-
gelu
false
1536
0
1024
24
-
-
24
true
RoPE
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
true
50304
7095.79579
3759.14496
0.004176
3336.64666
834161664
756894720
756894720
834161664
834161664
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
15s
-
gelu
false
1536
0
1024
24
-
-
24
true
RoPE
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
true
50304
7095.79579
3759.14496
0.004176
3336.64666
834161664
756894720
756894720
834161664
834161664
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
resumed run
23h 30m 57s
-
gelu
false
2048
0
1024
-
16
16
24
true
RoPE
8
128
identity
8
false
-
relational_attention
8
false
symbolic_attention
2048
-
16
512
-
false
true
false
-
50304
10750.7508
5277.48301
0.004176
5473.26362
1464832000
1270536192
1269487616
1373558784
1372510208
[0.9,0.95]
1
0.0006
0.0006
Crashed
-
awni00
terminated early
1d 7h 28m 33s
-
gelu
false
1536
0
1024
-
12
12
24
true
RoPE
6
64
identity
12
false
-
relational_attention
6
false
symbolic_attention
1536
-
8
1024
-
false
true
false
-
50304
7294.72008
4061.13485
0.004176
3233.58106
862694400
735453696
733880832
812720640
811147776
[0.9,0.95]
1
0.0006
0.0006
Crashed
-
awni00
5h 54m 9s
-
gelu
false
1536
0
1024
24
-
-
24
true
RoPE
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
true
50304
7095.79579
3759.14496
0.004176
3336.64666
834161664
756894720
756894720
834161664
834161664
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
resumed run
3h 55m 50s
-
gelu
false
2048
0
1024
-
16
16
24
true
RoPE
8
128
identity
8
false
-
relational_attention
8
false
symbolic_attention
2048
-
16
2048
-
false
true
false
-
50304
10750.7508
5277.48301
0.004176
5473.26362
1464832000
1276827648
1272633344
1379850240
1375655936
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
resumed run
7h 48m 32s
-
gelu
false
1024
0
1024
-
8
8
24
true
RoPE
4
32
identity
16
false
-
relational_attention
4
false
symbolic_attention
1024
-
8
1024
-
false
true
false
-
50304
4419.60046
2844.78669
0.004176
1574.8096
417843200
344681472
343632896
396192768
395144192
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
resumed run
15h 43m 27s
-
gelu
false
1024
0
1024
-
8
8
24
true
RoPE
4
64
identity
8
false
-
relational_attention
4
false
symbolic_attention
1024
-
8
1024
-
false
true
false
-
50304
4419.60046
2844.78669
0.004176
1574.8096
417843200
345074688
344026112
396585984
395537408
[0.9,0.95]
1
0.0006
0.0006
Crashed
-
awni00
terminated early
1d 23h 49m 34s
-
gelu
false
2048
0
1024
-
16
16
24
true
RoPE
8
128
identity
8
false
-
relational_attention
8
false
symbolic_attention
2048
-
16
2048
-
false
true
false
-
50304
10750.7508
5277.48301
0.004176
5473.26362
1464832000
1276827648
1272633344
1379850240
1375655936
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
terminated early
1d 23h 50m 8s
-
gelu
false
1024
0
1024
-
8
8
24
true
RoPE
4
32
identity
16
false
-
relational_attention
4
false
symbolic_attention
1024
-
8
1024
-
false
true
false
-
50304
4419.60046
2844.78669
0.004176
1574.8096
417843200
344681472
343632896
396192768
395144192
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
terminated early
1d 23h 50m 5s
-
gelu
false
1024
0
1024
-
8
8
24
true
RoPE
4
32
identity
16
true
-
relational_attention
4
false
symbolic_attention
1024
-
8
1024
-
false
true
false
-
50304
4369.26882
2844.78669
0.004176
1524.47795
417843200
332098560
331049984
383609856
382561280
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
terminated early
1d 23h 50m 1s
-
gelu
false
1024
0
1024
-
8
8
24
true
RoPE
4
64
identity
8
false
-
relational_attention
4
false
symbolic_attention
1024
-
8
1024
-
false
true
false
-
50304
4419.60046
2844.78669
0.004176
1574.8096
417843200
345074688
344026112
396585984
395537408
[0.9,0.95]
1
0.0006
0.0006
Failed
-
awni00
terminated early
13h 2m 46s
-
gelu
false
2048
0
1024
-
16
16
24
true
RoPE
8
128
identity
8
false
-
relational_attention
8
false
symbolic_attention
2048
-
16
2048
-
false
true
false
-
50304
10750.7508
5277.48301
0.004176
5473.26362
1464832000
1276827648
1272633344
1379850240
1375655936
[0.9,0.95]
1
0.0006
0.0006
Crashed
-
awni00
terminated early
1d 14h 36m 41s
-
gelu
false
2048
0
1024
-
16
16
24
true
RoPE
8
128
identity
8
false
-
relational_attention
8
false
symbolic_attention
2048
-
16
512
-
false
true
false
-
50304
10750.7508
5277.48301
0.004176
5473.26362
1464832000
1270536192
1269487616
1373558784
1372510208
[0.9,0.95]
1
0.0006
0.0006
Crashed
-
awni00
terminated early
1d 17h 20m 4s
-
gelu
false
2048
0
1024
-
16
16
24
true
RoPE
8
128
identity
8
false
-
relational_attention
8
false
symbolic_attention
2048
-
8
2048
-
false
true
false
-
50304
10750.7508
5277.48301
0.004176
5473.26362
1464832000
1276827648
1272633344
1379850240
1375655936
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
1d 15h 40m 4s
-
gelu
false
2048
0
1024
-
16
16
24
true
RoPE
8
64
identity
16
false
-
relational_attention
8
false
symbolic_attention
2048
-
8
2048
-
false
true
false
-
50304
10750.7508
5277.48301
0.004176
5473.26362
1464832000
1275254784
1271060480
1378277376
1374083072
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
resumed run
28m 37s
-
gelu
false
1024
0
1024
-
8
8
24
true
RoPE
2
-
identity
-
false
-
relational_attention
2
-
symbolic_attention
1024
-
8
1024
-
false
true
false
-
50304
4268.60552
2744.12339
0.004176
1524.47795
405260288
331803648
330755072
383314944
382266368
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
resumed run
1h 14m 21s
-
gelu
false
1024
0
1024
-
8
8
24
true
RoPE
4
-
identity
-
false
-
relational_attention
4
-
symbolic_attention
1024
-
8
1024
-
false
true
false
-
50304
4419.60046
2844.78669
0.004176
1574.8096
417843200
344386560
343337984
395897856
394849280
[0.9,0.95]
1
0.0006
0.0006
Finished
-
awni00
resumed run
15h 2m 57s
-
gelu
false
2048
0
1024
32
-
-
24
true
RoPE
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
true
50304
10531.65576
4874.82982
0.004176
5656.82176
1414205440
1311182848
1311182848
1414205440
1414205440
[0.9,0.95]
1
0.0006
0.0006
1-20 of 50