Snapshot Feb 23 2021, 2:21am

3D parallelism with no ZeRO, ran for 500 steps.
Created on February 22 | Last edited on February 22

Section 1


[Three line-chart panels, each plotted against Step (x-axis ticks up to 1.4k); panel titles and metric names were not preserved.]
Run set
16


