Sequence Parallel GPT-NeoX
Brandon Yang
Created on September 25 | Last edited on September 25
Section 1
[Plot: timers/backward-backward vs. Step (x-axis ticks 200 to 1k, y-axis ticks 1 to 5); run GPUC972-0]
[Plot: timers/optimizer vs. Step (x-axis ticks 200 to 1k, y-axis ticks 0.005 to 0.025); run GPUC972-0]
[Plot: timers/batch generator vs. Step (x-axis ticks 200 to 1k, y-axis ticks 0.01 to 0.07); run GPUC972-0]
[Plot: timers/backward vs. Step (x-axis ticks 200 to 1k, y-axis ticks 1 to 5); run GPUC972-0]
[Plot: timers/forward vs. Step (x-axis ticks 200 to 1k, y-axis ticks 0.2 to 1); run GPUC972-0]
[Plot: timers/backward-allreduce vs. Step (x-axis ticks 200 to 1k, y-axis ticks -2 to 2); run GPUC972-0]