dense-reward-carper v. main
dense-reward-carper
@442423f/Fix missing dtype in trainer rewards tensor (#520)/2023-07-10
main
@b9c16f5/fix peft config cause tensorboard type error (#515)/2023-07-05
Created on July 10 | Last edited on July 10
ppo_sentiments_t5/t5-imdb/1gpu
sft_sentiments/gpt2/1gpu
ilql_randomwalks/GPT2Config/1gpu
ppo_sentiments/gpt2-imdb/1gpu
ppo_randomwalks/randomwalks/1gpu
ppo_hh/pythia-6B-static-sft/7gpus
ilql_sentiments/gpt2/1gpu