MuJoCo Demo
Created on July 21|Last edited on July 21
Findings on HalfCheetah-v2
In this benchmark, we studied the performance of DDPG, TD3, and PPO. Overall, we find that TD3 achieves the highest returns, which is consistent with the results reported in the TD3 paper.
The agent trained with PPO achieves a return of around 1700 by walking on the HalfCheetah's head, which explains its poor performance compared to TD3 and DDPG.
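A comparison like this one comes down to grouping each run's final return by algorithm and ranking the groups. The sketch below is one way to do that, not the code used for this report. The function name `summarize_returns` and the `(algorithm, final_return)` input format are assumptions made for illustration. The block includes no result values; the returns would have to be supplied from the actual runs.

```python
from collections import defaultdict
from statistics import mean, pstdev


def summarize_returns(runs):
    """Group final episodic returns by algorithm and rank by mean return.

    `runs` is an iterable of (algorithm_name, final_return) pairs.
    This input format is hypothetical; adapt it to however the runs are logged.
    """
    by_algo = defaultdict(list)
    for algo, final_return in runs:
        by_algo[algo].append(float(final_return))

    summary = [
        {
            "algo": algo,
            "n": len(values),        # number of runs for this algorithm
            "mean": mean(values),    # average final return
            "std": pstdev(values),   # population std; 0.0 for a single run
        }
        for algo, values in by_algo.items()
    ]
    # Highest mean return first.
    return sorted(summary, key=lambda row: row["mean"], reverse=True)
```

Reporting the standard deviation alongside the mean helps show whether the gap between algorithms is larger than the spread across runs of the same algorithm.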
[Video panel: video.0; run set: 6 runs]