Jakegrigsby's workspace
Runs
24
Name
State
Notes
User
Tags
Created
Runtime
Sweep
Actor.activation
Actor.cont_dist_kind
Actor.d_hidden
Actor.dropout_p
Actor.gmm_modes
Actor.log_std_high
Actor.log_std_low
Actor.n_layers
Agent.fake_filter
Agent.gamma
Agent.num_critics
Agent.num_critics_td
Agent.offline_coeff
Agent.online_coeff
Agent.popart
Agent.reward_multiplier
Agent.tau
Agent.use_multigamma
Agent.use_target_actor
BabyTstepEncoder.emb_dim
BabyTstepEncoder.extras_dim
BabyTstepEncoder.mission_dim
BabyTstepEncoder.obs_kind
BilevelEpsilonGreedy.eps_end_end
BilevelEpsilonGreedy.eps_end_start
BilevelEpsilonGreedy.eps_start_end
BilevelEpsilonGreedy.eps_start_start
BilevelEpsilonGreedy.randomize_eps
BilevelEpsilonGreedy.rollout_horizon
BilevelEpsilonGreedy.steps_anneal
EpsilonGreedy.eps_end
EpsilonGreedy.eps_start
EpsilonGreedy.randomize_eps
EpsilonGreedy.steps_anneal
Experiment.always_load_latest
Experiment.always_save_latest
Experiment.batch_size
Experiment.batches_per_update
Experiment.critic_loss_weight
Experiment.force_reset_train_envs_every
Experiment.grad_clip
Experiment.l2_coeff
Experiment.learning_rate
Experiment.log_interval
Crashed
-
jakegrigsby
6d 22h 11m 38s
-
'leaky_relu'
-
300
0.0
-
-
-
2
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
False
True
-
-
-
-
-
-
-
-
Finished
feed-forward offline RL w/ Beta policy distribution
jakegrigsby
8h 37m 57s
-
'leaky_relu'
-
128
0.0
-
-
-
2
False
0.995
4
2
1.0
0.0
True
1
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
1000000
False
True
-
1
10.0
None
1.0
0.001
-
300
Finished
-
jakegrigsby
14h 14m 16s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
10.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
500000
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Finished
-
jakegrigsby
21h 55m 11s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.99
4
2
0.1
1.0
True
1.0
0.005
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
500000
False
True
-
1
10.0
None
-
0.001
-
300
Killed
(eval metrics incorrectly report HalfCheetahV5 - this is HalfCheetahV4)
jakegrigsby
8h 58m 45s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.99
4
2
0.1
1.0
True
1.0
0.005
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
500000
False
True
-
1
10.0
None
-
0.001
-
300
Finished
-
jakegrigsby
10h 19m 37s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
400000
False
True
-
1
10.0
None
-
0.001
-
300
Finished
-
jakegrigsby
11h 2m 17s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
400000
False
True
-
1
10.0
None
-
0.001
-
300
Finished
-
jakegrigsby
9h 48m 51s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
400000
False
True
-
1
10.0
None
-
0.001
-
300
Finished
-
jakegrigsby
10h 1m 42s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
400000
False
True
-
1
10.0
None
-
0.001
-
300
Finished
-
jakegrigsby
12h 23m 45s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
400000
False
True
-
1
10.0
None
-
0.001
-
300
Killed
configure "amago.nets.actor_critic.Actor.cont_dist_kind" : "gmm"
jakegrigsby
6h 46m 48s
-
'leaky_relu'
'gmm'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
-
-
True
10.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
1000000
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Finished
-
jakegrigsby
11h 26m 50s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
0.01
0.8
0.05
1.0
True
500
500000
-
-
-
-
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Finished
-
jakegrigsby
14h 52m 59s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
0.01
0.8
0.05
1.0
True
400
500000
-
-
-
-
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Killed
learning update stats for symbolic alchemy (actor: https://wandb.ai/jakegrigsby/amago-v3-reference/runs/s85fw2kn?nw=nwuserjakegrigsby)
jakegrigsby
6d 20h 23m 46s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
-
-
True
-
0.003
True
True
-
-
-
-
0.01
0.8
0.05
1.0
True
200
2500000
-
-
-
-
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Killed
Roughly matches the VMPO result reported in the Alchemy paper (155.4 +/- 1.6)
jakegrigsby
6d 20h 33m 2s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
-
-
True
-
0.003
True
True
-
-
-
-
0.01
0.8
0.05
1.0
True
200
2500000
-
-
-
-
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Finished
-
jakegrigsby
11h 20m 5s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
10.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
1000000
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Killed
-
jakegrigsby
3h 54m 47s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
0.1
1.0
True
10.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
1000000
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Killed
-
jakegrigsby
4d 30m 32s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
4
2
-
-
True
-
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
0.05
1.0
True
1000000
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Killed
Uses `MultiTaskAgent` as discussed in "Breaking the Multi-Task Barrier..."; slightly different hparams than the paper version but similar results. The paper settings were not tuned.
jakegrigsby
4d 15h 20m 50s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.999
-
2
-
-
True
-
0.003
True
True
-
-
-
-
0.01
0.8
0.05
1.0
True
1500
2000000
-
-
-
-
False
True
-
1
10.0
None
-
0.001
-
300
Finished
-
jakegrigsby
7h 32m 27s
-
'leaky_relu'
'normal'
256
0.0
5
2.0
-5.0
2
False
0.9999
4
2
0.1
1.0
True
100.0
0.003
True
True
-
-
-
-
-
-
-
-
-
-
-
-
-
True
-
False
True
-
1
10.0
None
1.0
0.001
0.0001
300
Showing runs 1-20 of 24