Overview of SafePO Implementation
PKU-Alignment
Created on August 19 | Last edited on August 19
[Figure: Episode cost of the whole training process, averaged from all tasks. Legend (algo): TRPOLag, PPOLag, CUP, FOCOPS, CPO, RCPO, PCPO, CPPOPID.]