
RPO (alpha=0.5) on Mujoco_v2 Part 2

['ppo_continuous_action_8M?tag=v1.0.0-13-gcbd83f6', 'rpo_continuous_action?tag=pr-331']
Created on January 3 | Last edited on January 3

[Figure: two panels. Episodic Return vs. Steps; Episodic Return vs. Time (minutes)]
Legend: ppo_continuous_action_8M ({'tag': ['v1.0.0-13-gcbd83f6']}), 10 runs; rpo_continuous_action ({'tag': ['pr-331']}), 10 runs
Runs table (10 visualized; row 1 of 1 shown)

State: Finished
User: masud99r
Runtime: 3h 34m 26s
anneal_lr: true
batch_size: 2048
capture_video: false
clip_coef: 0.2
clip_vloss: true
cuda: false
ent_coef: 0
env_id: Ant-v2
exp_name: ppo_continuous_action_8M

No value shown: Name, Notes, Tags, Created
Unset ("-") for this run: Sweep, actor_device_ids, actor_devices, adv_norm_fullbatch, alpha, anneal_steps, async_batch_size, async_update, autotune, aux_batch_rollouts, backend, base_model, beta_clone, buffer_size, channels, concurrency, debug_normalize, deepspeed, device_ids, discount, distill_batch_size, distill_beta, distill_learning_rate, distill_update_epochs, distributed, dropout_rate, e_auxiliary, e_policy, end_e, env, eps, eval_every, eval_freq, expl_noise, exploration_fraction, exploration_noise


