haok / flame-moe
Haok's workspace
Runs: 112 (all visualized), in 17 groups
Group: dense-191m (3 runs)
Group: dense-411m (1 run)
Group: dense-70m (3 runs: 31956, 31904, 31905)
Group: acquire (10 runs)
Group: dclm-1b-1x (2 runs)
Group: dclm-411m-4x (1 run)
Group: dclm-411m (1 run)
Group: flame-moe-1.7b (2 runs)
Group: flame-moe-721m (1 run)
Group: flame-moe-290m (1 run)
Group: testing-2.0e19 (2 runs)
Group: testing-2.4e20 (5 runs)
Group: ablation-3e18 (16 runs)
Group: ablation-1e18 (16 runs)
Group: ablation-6e18 (16 runs)
Group: ablation-1e19 (16 runs)
Group: ablation-3e19 (16 runs)
Charts (panels 1-6 of 10 shown)
[Chart: z_loss. x-axis: Step (ticks 2k to 10k); y-axis ticks 10 to 40. Groups: acquire, ablation-6e18, ablation-3e18, flame-moe-1.7b, flame-moe-721m, flame-moe-290m.]
[Chart: samples vs steps. x-axis: Step (ticks 2k to 14k); y-axis ticks 5000000 to 15000000. Group metrics computed from the first 10 groups: dense-191m, dense-70m, dense-411m, acquire, ablation-6e18, ablation-3e18, dclm-1b-1x, dclm-411m-4x, dclm-411m, flame-moe-1.7b.]
[Chart: num-zeros. x-axis: Step (ticks 2k to 14k); y-axis ticks 0 to 4e+9. Group metrics computed from the first 10 groups: dense-191m, dense-70m, dense-411m, acquire, ablation-6e18, ablation-3e18, dclm-1b-1x, dclm-411m-4x, dclm-411m, flame-moe-1.7b.]
[Chart: loss-scale. x-axis: Step (ticks 2k to 14k); y-axis ticks 0 to 2. Group metrics computed from the first 10 groups: dense-191m, dense-70m, dense-411m, acquire, ablation-6e18, ablation-3e18, dclm-1b-1x, dclm-411m-4x, dclm-411m, flame-moe-1.7b.]
[Chart: load_balancing_loss. x-axis: Step (ticks 20k to 60k); y-axis ticks 1 to 5. Showing the first 10 runs. Groups: acquire, ablation-6e18, ablation-3e18, flame-moe-1.7b, flame-moe-721m, flame-moe-290m, testing-2.4e20, testing-2.0e19, ablation-1e18, ablation-3e19.]
[Chart: lm loss validation. x-axis: Step (ticks 2k to 14k); y-axis ticks 3 to 8. Group metrics computed from the first 10 groups: dense-191m, dense-70m, dense-411m, ablation-6e18, ablation-3e18, acquire, dclm-1b-1x, dclm-411m-4x, dclm-411m, flame-moe-1.7b.]
System (panels 1-6 of 22 shown)