
refactor

Created on September 22|Last edited on September 22

[Charts: two panels plotted against global_step, two panels plotted against Time (minutes)]

Run set (5 runs): exp_name: train_policy_accelerate, ppo.gradient_accumulation_steps: 1, base_model: gpt2
Run set 2 (4 runs): exp_name: train_policy_accelerate