
Qwen2.5-7B-Instruct SFT on CodeForces-CoT

Created on March 1 | Last edited on March 3

Section 1

Initial hyperparameter scan:
  • learning rate: one of 1e-5, 2e-5, 4e-5
  • packing = false
  • effective batch size = 128
  • number of epochs = 10 (checkpoint every 20% of steps)
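The scan settings above can be sketched as a small grid in plain Python. This is only an illustration of the arithmetic, not the report's actual launch script: the per-device batch size, device count, and dataset size are hypothetical values, and "checkpoint every 20% of steps" is read here as 20% of the total optimizer steps, which is an assumption.

```python
import math

# Values taken from the scan description.
LEARNING_RATES = [1e-5, 2e-5, 4e-5]
EFFECTIVE_BATCH_SIZE = 128
NUM_EPOCHS = 10
CHECKPOINT_FRACTION = 0.2  # assumed to mean 20% of total optimizer steps


def grad_accum_steps(effective_bs, per_device_bs, num_devices):
    """Gradient-accumulation steps needed to reach the effective batch size."""
    per_step = per_device_bs * num_devices
    if effective_bs % per_step != 0:
        raise ValueError("effective batch size must be a multiple of per_device_bs * num_devices")
    return effective_bs // per_step


def total_optimizer_steps(num_examples, effective_bs, num_epochs):
    """Optimizer steps for the whole run; with packing off, one example is one sequence."""
    return math.ceil(num_examples / effective_bs) * num_epochs


def checkpoint_steps(total_steps, fraction):
    """Steps at which a checkpoint is written, evenly spaced by `fraction` of the run."""
    n = round(1 / fraction)
    return [round(total_steps * i / n) for i in range(1, n + 1)]


def build_scan(num_examples, per_device_bs, num_devices):
    """One config dict per learning rate; all other settings are shared."""
    total = total_optimizer_steps(num_examples, EFFECTIVE_BATCH_SIZE, NUM_EPOCHS)
    return [
        {
            "learning_rate": lr,
            "packing": False,
            "per_device_batch_size": per_device_bs,
            "gradient_accumulation_steps": grad_accum_steps(
                EFFECTIVE_BATCH_SIZE, per_device_bs, num_devices
            ),
            "num_epochs": NUM_EPOCHS,
            "checkpoint_steps": checkpoint_steps(total, CHECKPOINT_FRACTION),
        }
        for lr in LEARNING_RATES
    ]


if __name__ == "__main__":
    # Hypothetical hardware and dataset size, chosen only to make the example concrete.
    for cfg in build_scan(num_examples=12_800, per_device_bs=4, num_devices=8):
        print(cfg)
```

Under this reading, a 10-epoch run writes five checkpoints, one every two epochs. If the checkpoints were instead meant to fall at 20% of each epoch, `checkpoint_steps` would need to be applied per epoch rather than to the whole run.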

[Figure: three panels plotted against train/epoch (x-axis ticks 2, 4, 6, 8). Y-axis ticks: 0.2 to 1; 0.00001 to 0.00004; 2 to 6. Panel titles were not preserved.]
Run sets, with the count shown for each:
  • all: 25
  • solutions: 6
  • solutions_w_editorials: 4
  • test_input_generator: 3
  • checker_interactor: 3
  • openthoughts-solutions-mix: 7
  • openthoughts-solutions-w-editorials-mix: 6