SBX v0.9.1 (unroll) vs PR-21 (for-i-loop)

Created on December 13 | Last edited on December 13
Note: all runs use n_envs=x and gradient_steps=x, where x=12 for HalfCheetah and x=14 for Hopper.
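
For readers unfamiliar with the two labels, here is a minimal JAX sketch of the difference between an unrolled Python loop and a `jax.lax.fori_loop` over gradient steps. It is not the SBX code: `sgd_step`, `update_unrolled`, `update_fori`, and the toy squared-error loss are hypothetical names chosen for illustration, and it assumes "unroll" and "for-i-loop" refer to how the `gradient_steps` updates are iterated inside a jitted function.

```python
import jax
import jax.numpy as jnp


def sgd_step(params, batch, lr=0.1):
    # Toy stand-in for one SAC gradient step (assumption, not SBX code):
    # a single SGD step on the mean squared error between params and batch.
    grad = jax.grad(lambda p: jnp.mean((p - batch) ** 2))(params)
    return params - lr * grad


@jax.jit
def update_unrolled(params, batches):
    # Python loop: traced once per iteration, so the compiled program
    # contains batches.shape[0] copies of the step.
    for i in range(batches.shape[0]):
        params = sgd_step(params, batches[i])
    return params


@jax.jit
def update_fori(params, batches):
    # fori_loop: the body is traced once and executed as a loop.
    def body(i, p):
        return sgd_step(p, batches[i])

    return jax.lax.fori_loop(0, batches.shape[0], body, params)


gradient_steps = 12  # the value used for HalfCheetah in the note above
batches = jax.random.normal(jax.random.PRNGKey(0), (gradient_steps, 4))
params = jnp.zeros(4)

out_unrolled = update_unrolled(params, batches)
out_fori = update_fori(params, batches)
```

Both functions apply the same sequence of updates, so their outputs should agree up to floating-point tolerance. The difference is in tracing and compilation: the unrolled version grows the compiled program with `gradient_steps`, while the `fori_loop` version keeps it constant in size.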

HalfCheetah-v4

[Plot: x-axis global_step, 50k to 250k; y-axis ticks 2000 to 10000. Series: algo: sac v0.9.1 (unroll), algo: sac PR-21 (for-i-loop)]

Hopper-v4

[Plot: x-axis global_step, 50k to 250k; y-axis ticks 500 to 3000. Series: algo: sac v0.9.1 (unroll), algo: sac PR-21 (for-i-loop)]

Run sets: v0.9.1 (unroll): 3; PR-21 (for-i-loop): 3