
trlx: refactor - remove orchestrator abstraction from API #289

* PPO sentiments reproduction from `main` (gpt2)
* Summarize dailymail/cnn reproduction from `main` (google/flan-t5-small)
Created on February 8 | Last edited on February 8

Results


[Figure: line plot. x-axis: Step (0 to 500); y-axis ticks: 0.2 to 0.7]
Run set
5


