dense-reward-carper v. main

dense-reward-carper @442423f: Fix missing dtype in trainer rewards tensor (#520), 2023-07-10
main @b9c16f5: fix peft config cause tensorboard type error (#515), 2023-07-05
Created on July 10 | Last edited on July 10

ppo_sentiments_t5/t5-imdb/1gpu


[Plot: x-axis Step, 0 to 4k; y-axis range 0.55 to 0.8]


sft_sentiments/gpt2/1gpu




ilql_randomwalks/GPT2Config/1gpu




ppo_sentiments/gpt2-imdb/1gpu




ppo_randomwalks/randomwalks/1gpu




ppo_hh/pythia-6B-static-sft/7gpus




ilql_sentiments/gpt2/1gpu

