Snapshot Feb 23 2021, 2:21am
3D parallelism with no ZeRO, ran for 500 steps.
Shivanshu Purohit
Created on February 22 | Last edited on February 22
Section 1
[Plot: iteration_time vs. Step. X-axis ticks from 200 to 1.4k; y-axis ticks from 10 to 35.]
[Plot: lm loss vs. Step. X-axis ticks from 0 to 1.4k; y-axis ticks from 5 to 11.]
[Plot: loss_scale vs. Step. X-axis ticks from 200 to 1.4k; y-axis ticks from 5e+8 to 2e+9.]
Run set: 16