stable-learning-control Table

Rickstaa's workspace

Runs

1,918

Finished

rickstaa

2y ago

5mo 27d 3h 45m 50s

nn.ReLU

256

121.14286

nn.ReLU

true

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

["stable_gym.envs.biological.oscillator.oscillator.Oscillator","stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated","stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost","stable_gym.envs.robotics.fetch.fetch_reach_cost.fetch_reach_cost.FetchReachCost"]

223.71429

["han2020_reproduction_sac_cartpole_cost_alpha3_tune_exp_lac_critic","han2020_reproduction_sac_cartpole_cost_alpha3_tune_exp_lac_critic_big","han2020_reproduction_sac_fetch_reach_alpha3_tune_exp_lac_critic","han2020_reproduction_sac_fetch_reach_alpha3_tune_exp_lac_critic_big","han2020_reproduction_sac_oscillator_alpha3_tune_exp_lac_critic","han2020_reproduction_sac_oscillator_complicated_alpha3_tune_exp_lac_critic"]

false

0.995

0.0001

4.5714e-10

0.0003

1.3714e-9

step

linear

271.42857

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

rickstaa

1y ago

35m 9s

nn.ReLU

256

nn.ReLU

true

256

gpu:1

OscillatorComplicated-v1

stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated

han2020_reproduction_sac_oscillator_complicated_alpha3_tune_exp_bigger_initial_alpha

false

0.995

0.0001

1.0000e-9

0.0003

3.0000e-9

step

linear

400

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

rickstaa

1y ago

26m 20s

nn.ReLU

256

nn.ReLU

true

256

gpu:1

Oscillator-v1

stable_gym.envs.biological.oscillator.oscillator.Oscillator

han2020_reproduction_sac_oscillator_alpha3_tune_exp_bigger_steps_per_update

false

0.995

0.0001

1.0000e-9

0.0003

3.0000e-9

step

linear

400

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

rickstaa

2y ago

3d 12h 22m 43s

nn.ReLU

256

99.2

nn.ReLU

true

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

186

["han2020_reproduction_sac_cartpole_cost_alpha3_tune_exp_sac_extra_all","han2020_reproduction_sac_fetch_reach_alpha3_tune_exp_sac_extra_all","han2020_reproduction_sac_oscillator_alpha3_tune_exp_sac_extra_all","han2020_reproduction_sac_oscillator_complicated_alpha3_tune_exp_sac_extra_all"]

false

0.995

0.0001

5.5333e-10

0.0003

1.6600e-9

step

linear

290

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

rickstaa

2y ago

3d 13h 1m 3s

nn.ReLU

256

nn.ReLU

true

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","OscillatorComplicated-v1"]

["stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated","stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost","stable_gym.envs.robotics.fetch.fetch_reach_cost.fetch_reach_cost.FetchReachCost"]

220.25

["han2020_reproduction_sac_cartpole_cost_alpha3_tune_exp_different_steps_per_update","han2020_reproduction_sac_fetch_reach_alpha3_tune_exp_different_steps_per_update","han2020_reproduction_sac_oscillator_complicated_alpha3_tune_exp_lac_critic_different_steps_per_update"]

false

0.995

0.0001

4.4167e-10

0.0003

1.3250e-9

step

linear

262.5

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

rickstaa

2y ago

2d 14h 59m 21s

nn.ReLU

256

nn.ReLU

true

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1"]

["stable_gym.envs.biological.oscillator.oscillator.Oscillator","stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost","stable_gym.envs.robotics.fetch.fetch_reach_cost.fetch_reach_cost.FetchReachCost"]

208

["han2020_reproduction_sac_cartpole_cost_alpha3_tune_exp_bigger_initial_alpha","han2020_reproduction_sac_fetch_reach_alpha3_tune_exp_bigger_initial_alpha","han2020_reproduction_sac_oscillator_alpha3_tune_exp_bigger_initial_alpha"]

false

0.995

0.0001

4.4167e-10

0.0003

1.3250e-9

step

linear

262.5

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

rickstaa

2y ago

2d 5h 40m 55s

nn.ReLU

256

nn.ReLU

true

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

186

["han2020_reproduction_sac_cartpole_cost_alpha3_tune_exp","han2020_reproduction_sac_fetch_reach_alpha3_tune_exp","han2020_reproduction_sac_fetch_reach_alpha3_tune_infinite_horizon_exp","han2020_reproduction_sac_oscillator_alpha3_tune_exp","han2020_reproduction_sac_oscillator_complicated_alpha3_tune_exp"]

false

0.995

0.0001

5.5333e-10

0.0003

1.6600e-9

step

linear

290

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

An extra experiment to see what the effect is of the smaller horizon size found in Han et al.'s codebase.

rickstaa

extra

2y ago

16h 39m 20s

nn.ReLU

256

112

nn.ReLU

true

0.46667

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

195.75

["han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_small_horizon_alp0-1","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_small_horizon_alp0-3","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_small_horizon_alp1-0","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_small_horizon_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_small_horizon_alp0-3","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_small_horizon_alp1-0","han2020_reproduction_lac_oscillator_alpha3_tune_exp_small_horizon_alp0-1","han2020_reproduction_lac_oscillator_alpha3_tune_exp_small_horizon_alp0-3","han2020_reproduction_lac_oscillator_alpha3_tune_exp_small_horizon_alp1-0","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_small_horizon_alp0-1"]

false

0.99

0.0001

6.0833e-10

0.0003

1.8250e-9

step

linear

312.5

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

An extra experiment to see what the effect is of the lower lambda learning rate found in Han et al.'s codebase (1e-4 in the codebase vs. 3e-4 in the paper).

rickstaa

extra

2y ago

22h 27m 55s

nn.ReLU

256

99.2

nn.ReLU

true

0.46667

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

186

["han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_lambda_lr_check_alp0-1","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_lambda_lr_check_alp0-3","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_lambda_lr_check_alp1-0","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_lambda_lr_check_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_lambda_lr_check_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_lambda_lr_check_alp0-3","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_lambda_lr_check_alp1-0","han2020_reproduction_lac_oscillator_alpha3_tune_exp_lambda_lr_check_alp0-1","han2020_reproduction_lac_oscillator_alpha3_tune_exp_lambda_lr_check_alp0-3","han2020_reproduction_lac_oscillator_alpha3_tune_exp_lambda_lr_check_alp1-0"]

false

0.991

0.99

0.0001

5.5333e-10

0.0003

1.6600e-9

step

linear

290

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

An extra experiment to check the effect of the smaller actor found in Han et al.'s codebase.

rickstaa

extra

2y ago

18h 37m 48s

nn.ReLU

99.2

nn.ReLU

true

0.46667

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

186

["han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_small_actor_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_small_actor_alp0-3","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_small_actor_alp1-0","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_small_actor_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_small_actor_alp0-3","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_small_actor_alp1-0","han2020_reproduction_lac_oscillator_alpha3_tune_exp_small_actor_alp0-1","han2020_reproduction_lac_oscillator_alpha3_tune_exp_small_actor_alp0-3","han2020_reproduction_lac_oscillator_alpha3_tune_exp_small_actor_alp1-0","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_small_actor_alp0-1"]

false

0.991

0.99

0.0001

5.5333e-10

0.0003

1.6600e-9

step

linear

290

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

Here we decreased the lambda learning rate from the 3e-4 specified in Han et al.'s paper to the 1e-4 specified in their codebase. Unfortunately, we set the wrong final lambda learning rate when making this change.

rickstaa

2y ago

20h 11m 20s

nn.ReLU

256

99.2

nn.ReLU

true

0.46667

256

gpu:0

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

186

["han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_lambda_lr_check_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_lambda_lr_check_alp0-3","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_lambda_lr_check_alp1-0","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_lambda_lr_check_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_lambda_lr_check_alp0-3","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_lambda_lr_check_alp1-0","han2020_reproduction_lac_oscillator_alpha3_tune_exp_lambda_lr_check_alp0-1","han2020_reproduction_lac_oscillator_alpha3_tune_exp_lambda_lr_check_alp0-3","han2020_reproduction_lac_oscillator_alpha3_tune_exp_lambda_lr_check_alp1-0","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_lr_lambda_check_alp0-3"]

false

0.991

0.99

0.0001

5.5333e-10

0.0003

1.6600e-9

step

linear

290

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

A double-check to see whether our earlier experiment with longer CompOscillator training was performed correctly.

rickstaa

extra

2y ago

12h 58m 52s

nn.ReLU

256

176

nn.ReLU

true

0.8

256

gpu:1

OscillatorComplicated-v1

stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated

["han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp0-1","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp0-2","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp0-4","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp0-5","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp0-7","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp0-8","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp1-1","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp1-2","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp1-4","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_longer_alp1-5"]

false

0.99

0.0001

1.0000e-9

0.0003

3.0000e-9

step

linear

400

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

An extra experiment to see what the effect is of the smaller critic found in Han et al.'s codebase.

rickstaa

extra

2y ago

16h 49m 28s

nn.ReLU

256

nn.ReLU

true

0.46667

256

gpu:1

OscillatorComplicated-v1

stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated

["han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_small_critic_alp0-1","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_small_critic_alp0-3","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_small_critic_alp1-0"]

false

0.99

0.0001

1.0000e-9

0.0003

3.0000e-9

step

linear

400

minimize

0.995

1000000

34912.5

2048

1000

100

Finished

Here we decreased the critic network size of the CompOscillator environment to [64,64,16] to investigate an inconsistency in Han et al.'s research. This is the short version where only 1e5 environment interactions were used per training iteration.

rickstaa

extra

incorrect

2y ago

1h 14m 42s

nn.ReLU

256

nn.ReLU

true

0.46667

256

gpu:1

OscillatorComplicated-v1

stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated

false

0.99

0.0001

1.0000e-9

0.0003

3.0000e-9

step

linear

400

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

Here we added 5 extra seeds to the Oscillator and CompOscillator training performed in the alpha3 hyperparameter tuning of our reproduction study. The first version of the CompOscillator training, labeled 'short', uses too few environment interactions (1e5 instead of 2e5). The second version uses the correct number.

rickstaa

extra

2y ago

3d 6h 25m 28s

nn.ReLU

256

176

nn.ReLU

true

0.8

256

gpu:1

["Oscillator-v1","OscillatorComplicated-v1"]

["stable_gym.envs.biological.oscillator.oscillator.Oscillator","stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated"]

65.33333

["han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp0-2","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp0-3","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp0-4","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp0-8","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp1-0","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp1-1","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp1-2","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp1-3","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp1-4","han2020_reproduction_lac_oscillator_complicated_alpha3_tune_exp_alp1-5"]

false

0.99

0.0001

1.0000e-9

0.0003

3.0000e-9

step

linear

400

minimize

0.995

1000000

346296.6

2048

1000

100

Finished

Here, we increased the total training steps used for the Oscillator and CompOscillator training from 1e5 to 2e5 in the alpha3 hyperparameter tuning of our reproduction study.

rickstaa

extra

2y ago

19h 39m 20s

nn.ReLU

256

176

nn.ReLU

true

0.8

256

gpu:1

["Oscillator-v1","OscillatorComplicated-v1"]

["stable_gym.envs.biological.oscillator.oscillator.Oscillator","stable_gym.envs.biological.oscillator_complicated.oscillator_complicated.OscillatorComplicated"]

["han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp0-2","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp0-3","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp0-4","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp0-6","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp0-7","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp0-9","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp1-0","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp1-2","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp1-3","han2020_reproduction_lac_oscillator_alpha3_tune_exp_extra_long_alp1-4"]

false

0.99

0.0001

1.0000e-9

0.0003

3.0000e-9

step

linear

400

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

The experiments of the alpha3 hyperparameter tuning performed in our reproduction study. One experiment, the CompOscillator, was later replaced since the step size mentioned in Han et al.'s paper was inconsistent, and we later decided to increase it.

rickstaa

2y ago

3d 4h 10m 22s

nn.ReLU

256

99.2

nn.ReLU

true

0.8

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

176.2

["han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp0-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp0-4","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp0-5","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp0-6","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp0-7","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp0-9","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp1-0","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp1-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp1-2","han2020_reproduction_lac_fetch_reach_alpha3_tune_infinite_horizon_exp_alp1-3"]

false

0.991

0.99

0.0001

5.5333e-10

0.0003

1.6600e-9

step

linear

290

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

A small pilot study where we use a constant 3e-4 lambda learning rate and decay the other learning rates to 1e-10.

rickstaa

2y ago

2d 8h 27m 57s

nn.ReLU

256

112

nn.ReLU

true

0.8

256

gpu:1

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

183.5

["han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-1","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-2","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-6","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-8","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-9","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-1","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-2","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-3","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-4","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-5"]

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

312.5

minimize

0.995

1000000

26203.8

2048

1000

100

Finished

A pilot study where we check how the algorithm behaves when we let the lambda learning rate decay linearly and set it equal to the actor learning rate.

rickstaa

2y ago

9d 13h 33m 30s

nn.ReLU

256

110

nn.ReLU

true

0.795

256

["cpu","gpu","gpu:1"]

["CartPoleCost-v1","FetchReachCost-v1","Oscillator-v1","OscillatorComplicated-v1"]

195.71875

["han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-1","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-2","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-3","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-4","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-5","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_alp1-1","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_alp1-2","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_alp1-3","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_alp1-4","han2020_reproduction_lac_fetch_reach_alpha3_tune_exp_alp1-5"]

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

310.9375

minimize

0.995

1000000

25085.41875

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 41m 30s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.5

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-5

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

234

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 43m 28s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.5

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-5

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

48104

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 42m 46s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.2

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-2

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

3658

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 43m 2s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.2

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-2

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

78456

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 42m 32s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.3

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-3

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

567

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 47m 58s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.4

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-4

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

234

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 46m 37s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.4

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-4

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

48104

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 46m 34s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.3

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-3

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

78456

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 47m 57s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.4

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-4

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

3658

2048

1000

100

Finished

rickstaa

cartpole

pilot

2y ago

3h 46m 5s

nn.ReLU

[256,256]

[64,64,16]

nn.ReLU

true

1.5

256

gpu:1

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-5

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

567

2048

1000

100

Finished

A small pilot study to check how step-based learning rate decay works on the GPU.

rickstaa

pilot

2y ago

18h 19m 48s

nn.ReLU

256

nn.ReLU

true

0.55

256

gpu

CartPoleCost-v1

stable_gym.envs.classic_control.cartpole_cost.cartpole_cost.CartPoleCost

489

["han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-1","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-2","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-3","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-4","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-5","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-6","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-7","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-8","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp0-9","han2020_reproduction_lac_cartpole_cost_alpha3_tune_exp_alp1-0"]

false

0.99

0.0001

1.0000e-10

0.0003

1.0000e-10

step

linear

250

minimize

0.995

1000000

48104

2048

1000

100
