MontezumaRevenge-v5
CleanRL's ppo_rnd_envpool.py
Name: exp_name: ppo_rnd (1 visualized)

State: Finished
Notes, Tags, Created: (no value shown)
User: yooceii
Runtime: 10d 16h 51m 24s
anneal_lr: true
batch_size: 16384
capture_video: false
clip_coef: 0.1
clip_vloss: true
cuda: true
ent_coef: 0.001
env_id: MontezumaRevenge-v5
exp_name: ppo_rnd
ext_coef: 2
gae: true
gae_lambda: 0.95
gamma: 0.999
int_coef: 1
int_gamma: 0.99
learning_rate: 0.0001
max_grad_norm: 0.5
minibatch_size: 4096

Not set for this run (-): Sweep, adv_norm_fullbatch, alpha, autotune, aux_batch_rollouts, backend, beta_clone, buffer_size, device_ids, discount, e_auxiliary, e_policy, end_e, env, eval_every, eval_freq, expl_noise, exploration_fraction, exploration_noise, learning_starts, load_model, max_timesteps, n_atoms, n_aux_grad_accum, n_eval_episodes, n_iteration, noise_clip
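
A few of the values listed for this run are easier to read side by side. The sketch below is illustrative only and is not taken from ppo_rnd_envpool.py: `RunConfig`, `minibatches_per_epoch`, `combine_advantages`, and `effective_horizon` are hypothetical names. It assumes that the batch is split into equal minibatches, that extrinsic and intrinsic advantages are mixed as a weighted sum using ext_coef and int_coef, and that 1 / (1 - gamma) is an acceptable rule of thumb for the horizon a discount factor implies.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container; the default values are copied from the run listing."""
    batch_size: int = 16384
    minibatch_size: int = 4096
    ext_coef: float = 2.0
    int_coef: float = 1.0
    gamma: float = 0.999      # discount for extrinsic rewards
    int_gamma: float = 0.99   # discount for intrinsic rewards


def minibatches_per_epoch(cfg: RunConfig) -> int:
    """Number of equal minibatches one batch splits into (assumes exact division)."""
    if cfg.batch_size % cfg.minibatch_size != 0:
        raise ValueError("batch_size must be a multiple of minibatch_size")
    return cfg.batch_size // cfg.minibatch_size


def combine_advantages(
    ext_adv: Sequence[float], int_adv: Sequence[float], cfg: RunConfig
) -> List[float]:
    """Assumed mixing rule: ext_coef * extrinsic + int_coef * intrinsic, per step."""
    if len(ext_adv) != len(int_adv):
        raise ValueError("advantage sequences must have the same length")
    return [cfg.ext_coef * e + cfg.int_coef * i for e, i in zip(ext_adv, int_adv)]


def effective_horizon(discount: float) -> float:
    """Rule-of-thumb horizon 1 / (1 - discount), in environment steps."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 / (1.0 - discount)


if __name__ == "__main__":
    cfg = RunConfig()
    print("minibatches per epoch:", minibatches_per_epoch(cfg))
    print("combined advantages:", combine_advantages([1.0, -0.5], [0.2, 0.4], cfg))
    print("extrinsic horizon (approx.):", round(effective_horizon(cfg.gamma)))
    print("intrinsic horizon (approx.):", round(effective_horizon(cfg.int_gamma)))
```

Under these assumptions, the listed batch_size and minibatch_size give four minibatches per pass over the batch. The larger gamma relative to int_gamma means extrinsic rewards are credited over a much longer horizon than intrinsic ones.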
Created with ❤️ on Weights & Biases.
https://wandb.ai/openrlbenchmark/openrlbenchmark/reports/-MontezumaRevenge-CleanRL-s-PPO-RND--VmlldzoyNTIyNjc5