Prabhasak's group workspace
Group: Hopper-v2
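| Field | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 |
|---|---|---|---|---|---|---|---|---|
| State | Finished | Finished | Finished | Finished | Finished | Finished | Finished | Finished |
| Notes | | | | | | | | |
| User | prabhasak | prabhasak | prabhasak | prabhasak | prabhasak | prabhasak | prabhasak | prabhasak |
| Tags | | | | | | | | |
| Created | | | | | | | | |
| Runtime | 4h 38m | 36m 52s | 2h 47m 32s | 45m 58s | 5h 44m 34s | 1h 3m 41s | 7h 6m 25s | 1h 30m 11s |
| Sweep | - | - | - | - | - | - | - | - |
| algo | trpo | trpo | trpo | trpo | trpo | trpo | sac | trpo |
| batch_size | - | 256 | - | 256 | - | 256 | 256 | - |
| buffer_size | - | - | - | - | - | - | 1000000 | - |
| cg_damping | 0.1 | - | 0.1 | - | 0.1 | - | - | 0.1 |
| cg_iters | 15 | - | 15 | - | 15 | - | - | 15 |
| check_callback | false | - | false | - | false | - | false | false |
| cliprange | - | - | - | - | - | - | - | - |
| device | gpu | gpu | gpu | gpu | gpu | gpu | gpu | gpu |
| ent_coef | - | - | - | - | - | - | 0.01 | - |
| entcoeff | 0.01 | - | 0.01 | - | 0.01 | - | - | 0.01 |
| env | Hopper-v2 | - | Hopper-v2 | - | Hopper-v2 | - | Hopper-v2 | Hopper-v2 |
| eval_callback | true | - | true | - | true | - | true | true |
| exp_id | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| gamma | 0.99 | - | 0.99 | - | 0.99 | - | - | 0.99 |
| gradient_steps | - | - | - | - | - | - | 1 | - |
| lam | 0.95 | - | 0.95 | - | 0.95 | - | - | 0.95 |
| learning_rate | - | - | - | - | - | - | lin_3e-4 | - |
| learning_starts | - | - | - | - | - | - | 1000 | - |
| max_kl | 0.005 | - | 0.005 | - | 0.005 | - | - | 0.005 |
| n_steps | - | - | - | - | - | - | - | - |
| n_timesteps | - | - | - | - | - | - | - | - |
| nminibatches | - | - | - | - | - | - | - | - |
| noptepochs | - | - | - | - | - | - | - | - |
| num_trajs | 20 | 20 | 10 | 10 | 5 | 5 | 10 | 10 |
| policy | MlpPolicy | - | MlpPolicy | - | MlpPolicy | - | CustomSACPolicy | MlpPolicy |
| save_best_model | true | - | true | - | true | - | true | true |
| seed | 42 | 42 | 42 | 42 | 42 | 42 | 42 | 42 |
| tensorboard | true | - | true | - | true | - | true | true |
| timesteps_IL | 0 | - | 0 | - | 0 | - | 0 | 0 |
| timesteps_RL | 0 | - | 0 | - | 0 | - | 0 | 0 |
| timesteps_per_batch | 2048 | - | 2048 | - | 2048 | - | - | 2048 |
| train_IL | true | - | true | - | true | - | false | false |
| train_RL | false | - | false | - | false | - | true | true |
| train_freq | - | - | - | - | - | - | 1 | - |
| traj_use | - | - | - | - | - | - | - | - |
| verbose | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| vf_iters | 5 | - | 5 | - | 5 | - | - | 5 |
| vf_stepsize | 0.001 | - | 0.001 | - | 0.001 | - | - | 0.001 |
| wandb_log | true | true | true | true | true | true | true | true |
| BC_max_iter | - | 5e5 | - | 5e5 | - | 5e5 | - | - |
| checkpoint_dir | - | models | - | models | - | models | - | - |
| env_id | - | Hopper-v2 | - | Hopper-v2 | - | Hopper-v2 | - | - |
| env_kwargs.expert | - | - | - | - | - | - | - | - |
| env_kwargs.name | - | - | - | - | - | - | - | - |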