throughput/tflops (23/09/16 08:08:13)
Sam Foreman
Created on September 16 | Last edited on February 2
throughput/tflops
MODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 420000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true
MODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 400000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true
MODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 360000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true
MODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 192000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true
MODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 128000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true
MODEL_SIZE: GPT13B, machine: Polaris, world_size: 16, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 2048, env.GAS: 1, global_batch_size: 1, zero_stage: 0, env.MPSIZE: 16, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true
MODEL_SIZE: GPT13B, machine: ThetaGPU, world_size: 8, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 2048, env.GAS: 1, global_batch_size: 1, zero_stage: 0, env.MPSIZE: 8, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true
[Chart: throughput/tflops, one series per run configuration listed above; axis ticks run from 0 to 110 in steps of 10]