Tulu3 8B RM
[Panel: Tulu3 8B RM training curves. Entity: ai2-llm. Project: open_instruct_public. Config key: chat_template_name. Metrics: Accuracy (train/rm/accuracy), Loss (train/rm/loss), Chosen Rewards (train/rm/chosen_rewards), Rejected Rewards (train/rm/rejected_rewards), Reward Margin (train/rm/reward_margin), Learning Rate (train/rm/lr). Run tags: no-tag-734-g3e689d0, pr-616, tulu3_8b_rm.]
Created on March 21 | Last edited on March 21