fix-kl-controller v. main

fix-kl-controller @ee8cd4a: fix(ppo_trainer): update `AdaptiveKLController` with correct KL (2023-03-08)
main @adbf8fc: Add intermediate checkpointing to `accelerate` trainers (#349) (2023-03-08)

Created on March 10 | Last edited on March 10

ppo_hh/gpt-j-6B/7gpus


[Line chart: reward/mean]


ppo_randomwalks/randomwalks/1gpu




ilql_randomwalks/GPT2Config/1gpu




sft_sentiments/gpt2/1gpu




ilql_sentiments/gpt2/1gpu




ppo_sentiments/gpt2-imdb/1gpu

