Snapshot Feb 24 2021, 2:18pm
GPT2_XL_pipe trained with regular Adam, model parallelism mp=2, pipeline parallelism pp=2. Model size reduced to match the 1-bit Adam model.
Shivanshu Purohit
Created on February 24 | Last edited on February 24
Section 1
[Plot: iteration_time vs. Step. x-axis ticks 500 to 2k; y-axis ticks 5 to 25.]
[Plot: lm loss vs. Step. x-axis ticks 0 to 2k; y-axis ticks 5 to 10.]
[Plot: loss_scale vs. Step. x-axis ticks 500 to 2k; y-axis ticks 1e+9 to 4e+9.]
Run set
16