
Return normalization

Created on February 5 | Last edited on February 5

Section 1




[Chart: x-axis global_step (ticks 500k, 1M, 1.5M); y-axis ticks 0 to 600]
Run set (4 visualized; rows 1-3 of 3)

Column                Row 1                  Row 2                  Row 3
State                 Finished               Finished               Finished
User                  costa-huang            costa-huang            costa-huang
Runtime               2mo 24d 21h 1m 37s     7h 24m 37s             8h 41m 26s
batch_size            2055                   -                      -
capture_video         true                   true                   true
clip_coef             0.2                    0.2                    0.2
cuda                  [false,true]           true                   true
ent_coef              0.01                   0.01                   0.01
episode_length        600                    0                      0
exp_name              (listed below)         ppo_return_norm_reset  ppo_return_norm_reset
gae_lambda            0.97                   0.97                   0.97
gamma                 0.992                  0.99                   0.99
gym_id                Walker2DBulletEnv-v0   Walker2DBulletEnv-v0   Walker2DBulletEnv-v0
learning_rate         0.0007                 0.0007                 0.0007
max_grad_norm         0.5                    0.5                    0.5
prod_mode             true                   true                   true
return_filter_reset   -                      true                   false
running_state_reset   -                      [false,true]           [false,true]
seed                  1.7                    1                      1
torch_deterministic   true                   true                   true
total_timesteps       5200000                2000000                2000000
update_epochs         5.8                    3                      3
vf_coef               0.25                   0.25                   0.25
wandb_entity          cleanrl                cleanrl                cleanrl
wandb_project_name    cleanrl                cleanrl                cleanrl
anneal_lr             true                   -                      -

exp_name, Row 1: ["ppo6_normalized_env","ppo7_value_td_loss","ppo_adv_norm","ppo_continuous_gae","ppo_reward_norm"]

Columns that are empty or "-" in every row: Name, Notes, Tags, Created, Sweep, alpha, buffer_size, end_e, exploration_fraction, hidden_sizes, learning_starts, noise_std, notb, start_e, start_steps, target_network_frequency, target_update_interval, tau, train_frequency, action_noise, actor_buffer_size, alg, aux_batch_size, aux_minibatch_size, beta_clone, built_in_ais
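
Rows 2 and 3 differ only in return_filter_reset. The report does not show how that flag is implemented, so the following is only a minimal sketch of one common form of return normalization: rewards are divided by the running standard deviation of a discounted return, and a flag decides whether that return is zeroed at episode end. The class names and the reset_on_done parameter are hypothetical and are not taken from the cleanrl code.

```python
import math


class RunningMeanVar:
    """Running mean and variance of a scalar stream (Welford's update)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def var(self):
        # Fall back to unit variance until at least two samples are seen.
        return self.m2 / self.count if self.count > 1 else 1.0


class ReturnNormalizer:
    """Scale rewards by the running std of the discounted return.

    reset_on_done is a hypothetical stand-in for return_filter_reset:
    an assumption about what the flag controls, not the report's code.
    """

    def __init__(self, gamma=0.99, reset_on_done=True, eps=1e-8):
        self.gamma = gamma
        self.reset_on_done = reset_on_done
        self.eps = eps
        self.ret = 0.0  # discounted return accumulated so far
        self.stats = RunningMeanVar()

    def __call__(self, reward, done):
        self.ret = self.ret * self.gamma + reward
        self.stats.update(self.ret)
        scaled = reward / math.sqrt(self.stats.var + self.eps)
        if done and self.reset_on_done:
            self.ret = 0.0  # start the next episode from a clean return
        return scaled


if __name__ == "__main__":
    normalizer = ReturnNormalizer(gamma=0.99, reset_on_done=True)
    for reward, done in [(1.0, False), (0.5, False), (2.0, True)]:
        print(normalizer(reward, done))
```

With reset_on_done=False the accumulated return carries across episode boundaries, which is the behavioural difference this sketch assumes the two runs are comparing.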


Section 2




Run set
4