
fix-kl-computation v. main

fix-kl-computation @c37aa8b: feat(ppo_trainer): log token-wise KL (2023-04-20)
main @9bc0836: fix(offline_pipeline): ILQL negative indexing under truncation (#435) (2023-04-18)
Created on April 20 | Last edited on April 20

ilql_sentiments/gpt2/1gpu


[Plot: x-axis Step (0 to 1k); y-axis ticks 0.5 to 0.7; run set of 2 runs]


ppo_sentiments_t5/t5-imdb/1gpu




ppo_hh/pythia-6B-static-sft/7gpus




ppo_randomwalks/randomwalks/1gpu




ppo_sentiments/gpt2-imdb/1gpu




ilql_randomwalks/GPT2Config/1gpu




sft_sentiments/gpt2/1gpu

