Skip to main content

throughput/samples_per_sec (23/09/16 08:03:33)

Created on September 16|Last edited on September 16

MODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 420000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: trueMODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 400000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: trueMODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 360000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: trueMODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 192000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: trueMODEL_SIZE: GPT25B, machine: ThetaGPU, world_size: 32, env.SP_TYPE: megatron, micro_batch_size: 1, seq_length: 128000, env.GAS: 1, global_batch_size: 1, zero_stage: 1, env.MPSIZE: 32, env.PPSIZE: 1, env.SPSIZE: 1, use_flash_attn: true0.0000.0020.0040.0060.0080.0100.0120.0140.0160.0180.0200.0220.024
Run set
7