MuJoCo Demo

Created on July 21 | Last edited on July 21

Findings on HalfCheetah-v2

In this benchmark, we studied the performance of DDPG, TD3, and PPO. Overall, we find that TD3 achieves the highest returns, which is consistent with the results reported in the TD3 paper.
The agent trained with PPO achieves a return of around 1700 by walking on the HalfCheetah's head, which explains its poor performance compared to TD3 and DDPG.

[Plot: x-axis global_step (ticks at 500k, 1M, 1.5M); y-axis ticks from 0 to 10000]