Llama-3-Freego-8B-Instruct
DPO training note
Liang
Created on December 9 | Last edited on December 9
Section 1
[Figure: train/rewards/rejected plotted against train/global_step. x-axis ticks: 500, 1k, 1.5k; y-axis ticks: -80 to 0. Runs shown: dpo-2024-12-11-1, dpo-2024-12-05-1, dpo-2024-12-03-1.]
Run set: 9