eleutherai / mesh-transformer-jax
Aran's workspace
Runs: 514 (20 visualized)
baseline5
moe-dense2
baseline5
moe-dense2
baseline5
moe-dense2
moe-dense
moe-dense
baseline4
baseline3
baseline3
baseline2
baseline
GPT3_6B_pile_rotary
baseline_small
bert2
bert
baseline_small
bert
bert
glm18
glm18
glm18
baseline
glm17
glm16
glm16
glm16
glm15
glm15
glm14-8cores
glm14-8cores
glm13-8cores
glm12-8cores
glm11-8cores
glm10-8cores
glm10-8cores
glm10-8cores
glm10-8cores
glm10-8cores
glm9-8cores
glm8-8cores
glm7-8cores
glm7-8cores
glm7-8cores
glm6-8cores
glm6-8cores
glm5-8cores
glm4-8cores
glm3-8cores
glm-8cores
glm2-8cores
baseline2-8cores
posemb-8cores
posemb-8cores
posemb-8cores
posemb-8cores
posemb-8cores
posemb-8cores
posemb-8cores
posemb-8cores
baseline-8cores
baseline-8cores
baseline-8cores
baseline-8cores
baseline-8cores
baseline-8cores
baseline-8cores
baseline-8cores
baseline-8cores
scaling-h-4
scaling-h-4
scaling-ffn_not_branched
scaling-multi-conv
scaling-multi-conv
scaling-ffn_not_branched
scaling-ffn_not_branched
scaling-multi-conv
scaling-mc
scaling-mc
scaling-mc
scaling-mc
scaling-soft8
scaling-soft8
scaling-soft8
scaling-softmax-test
scaling-ppp
scaling-fff
scaling-ppp
scaling-ppp
scaling-ppp
scaling-fff
ppf
ppp
ppp
ppp
ppp
lr-4x-ffn
lr-4x-attn
lr-4x-attn
1-100 of 514
[Figure: train/loss plotted against Step. Step axis ticks run from 0 to 200k; loss axis ticks run from 2 to 10. Showing first 10 runs: baseline5, moe-dense2, baseline5, moe-dense2, moe-dense, baseline4, baseline3, baseline3, baseline, GPT3_6B_pile_rotary.]