MuJoCo: Our PPO vs openai/baselines' PPO

Created on April 7 | Last edited on April 13

[Figure: Episodic Return (y-axis, ticks 1000 to 3000) vs. Steps (x-axis, ticks 200k to 800k)]
Run set (6 runs, 6 visualized; 1 row shown)

State: Finished
Notes: -
User: costa-huang
Tags: (blank)
Created: (blank)
Runtime: 1d 21h 41m 24s
Sweep: -
alg: ppo2
env: Hopper-v2
exp_name: ["baselines-ppo2-mlp","ppo_continuous_action"]
network: mlp
num_env: -
num_timesteps: 1000000
play: false
reward_scale: 1
save_video_interval: 0
save_video_length: 200
seed: 2
track: true
anneal_lr: true
batch_size: 2048
capture_video: false
clip_coef: 0.2
clip_vloss: true
cuda: false
ent_coef: 0
gae: true
gae_lambda: 0.95
gamma: 0.99
gym_id: Hopper-v2
learning_rate: 0.0003
max_grad_norm: 0.5
minibatch_size: 64
norm_adv: true
num_envs: 1
num_minibatches: 32
num_steps: 2048
torch_deterministic: true
total_timesteps: 1000000
update_epochs: 10
vf_coef: 0.5
wandb_entity: vwxyzjn
wandb_project_name: ppo-details
aux_batch_size: -
aux_minibatch_size: -
beta_clone: -
e_auxiliary: -
e_policy: -
n_aux_grad_accum: -
n_aux_minibatch: -
n_iteration: -
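
The batch-related values in the run configuration above look mutually consistent. The short sketch below checks that, assuming the common PPO bookkeeping in which one rollout batch holds num_envs * num_steps transitions and is split evenly into num_minibatches. These relationships are an assumption for illustration; the page itself does not state how the values were derived, and the names num_updates and gradient_steps_per_update are hypothetical.

```python
# Values copied from the run configuration listed above.
num_envs = 1
num_steps = 2048
num_minibatches = 32
total_timesteps = 1_000_000
update_epochs = 10

# Assumed relationships (not stated on the page itself):
# one rollout batch holds every transition collected across all envs.
batch_size = num_envs * num_steps
# each epoch splits that batch into equally sized minibatches.
minibatch_size = batch_size // num_minibatches
# number of full rollout/update cycles that fit in the step budget.
num_updates = total_timesteps // batch_size
# gradient steps taken per rollout, over all epochs and minibatches.
gradient_steps_per_update = update_epochs * num_minibatches

print("batch_size:", batch_size)
print("minibatch_size:", minibatch_size)
print("num_updates:", num_updates)
print("gradient_steps_per_update:", gradient_steps_per_update)
```

Under these assumptions, the computed batch_size and minibatch_size match the 2048 and 64 reported in the configuration.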


