Nickcdryan's workspace (project: bench)
Runs (104 total, 2 visualized; showing 1-20)
STANDARD-BL-28-standardcorrected-c4-SEQ768-BS24-LR1e3-334M
BL-28-examplemixing-2hopresid-3skipweights-350M
BL-28-standard-2hopresid-2skipweights-334M
BL-28-examplemixingcorrected-4-350M
BL-28-weightedskipdcorrected-init3.5,2,1:-c4-SEQ768-BS24-LR1e3-334M
BL-28-weightedskipdcorrected-init2-c4-SEQ768-BS24-LR1e3-334M
BL-28-weightedskipdcorrected-init2,1,.5:-c4-SEQ768-BS24-LR1e3-334M
BL-28-weightedskipdcorrected-init1-c4-SEQ768-BS24-LR1e3-334M
BL-28-weightedskipcorrected-1,.5,0:-c4-SEQ768-BS24-LR1e3-334M
BL-28-3gramexamplemixing-c4-mistraltok-SEQ768-BS24-LR1e3-371M
BL-28-standard-noparallel-SEQ768-BS24-LR1e3-334M
BL-28-examplemixing-skipskiptemp--init1-.5-0:-c4-mistraltok-SEQ768-BS24-LR1e3-350M
BL-28-standardskip-c4-mistraltok-SEQ768-BS24-LR1e3-334M
BL-28-standard-c4-mistraltok-SEQ768-BS24-LR1e3-334M
BL-28-examplemixing-skipskip-n=4-c4-mistraltok-SEQ768-BS24-LR1e3-350M
BL-28-examplemixing-skip-n=4-c4-mistraltok-SEQ768-BS24-LR1e3-350M
BL-28-examplemixing-silu-n=4-c4-mistraltok-SEQ768-BS24-LR1e3-350M
BL-28-examplemixing-n=2-c4-mistraltok-SEQ768-BS24-LR1e3-350M
BL-28-examplemixingreverse-n=4-c4-mistraltok-SEQ768-BS24-LR1e3-350M
BL-28-examplemixing-n=8-c4-mistraltok-SEQ768-BS24-LR1e3-350M
Charts (8 panels; showing 1-4)
[Chart: nwp_loss. x-axis: Step (0 to 12k); y-axis: 3.5 to 7. Runs: BL-28-standard-c4-mistraltok-SEQ768-BS24-LR1e3-334M, BL-28-temp-init:1-c4-mistraltok-nopad-SEQ768-BS24-LR1e3-334M]
[Chart: nwp_loss_per_Gflop. x-axis: Step (0 to 140M); y-axis: 4 to 10. Run: BL-28-standard-c4-mistraltok-SEQ768-BS24-LR1e3-334M]
[Chart: train_loss. x-axis: Step (0 to 20k); y-axis: 3.5 to 7. Runs: BL-28-standard-c4-mistraltok-SEQ768-BS24-LR1e3-334M, BL-28-temp-init:1-c4-mistraltok-nopad-SEQ768-BS24-LR1e3-334M]
[Chart: flop_step. x-axis: Step (0 to 20k); y-axis: 0 to 2e+8. Run: BL-28-standard-c4-mistraltok-SEQ768-BS24-LR1e3-334M]
System (28 panels; showing 1-6)