Run set 2
Name
eval/full_eval accuracy
eval/dahoas_full_hh_rlhf_chosen_longer_eval accuracy
eval/dahoas_full_hh_rlhf_rejected_longer_eval accuracy
eval/dahoas_rm_static_chosen_longer_eval accuracy
eval/dahoas_rm_static_rejected_longer_eval accuracy
eval/yitingxie_rlhf_reward_chosen_longer_eval accuracy
eval/yitingxie_rlhf_reward_rejected_longer_eval accuracy
clearml_args.clearml_api_access_key
clearml_args.clearml_api_host
clearml_args.clearml_api_secret_key
clearml_args.clearml_files_host
clearml_args.clearml_project
clearml_args.clearml_web_host
eval_dataset_args.debug_mode
eval_dataset_args.reward_token
iterations
lora_params.lora_dim
lora_params.lora_module_name
loss_config.loss_kwargs.cut_prompt
loss_config.loss_kwargs.cut_same_begining
loss_config.loss_kwargs.margin
loss_config.loss_kwargs.max_ends_in_pair
loss_config.loss_kwargs.mean_mode
loss_config.loss_kwargs.tokens_weights_mode
loss_config.loss_name
loss_function
model
reward_token
https://wandb.ai/nikgerasimenko/reward_model--moved/reports/Facebook-OPT-360M-on-open-Eng-datasets--Vmlldzo0NjQ1OTY1