Gaochenxiao's workspace
Runs: 90 (10 visualized)

| Name | | |
|---|---|---|
| task: walker2d-medium-expert-v2 | 2 | 10 |
| task: walker2d-medium-replay-v2 | 2 | 10 |
| task: hopper-medium-expert-v2 | 2 | 10 |
| task: walker2d-medium-v2 | 2 | 10 |
| task: hopper-medium-replay-v2 | 2 | 10 |
| task: hopper-medium-v2 | 2 | 10 |
| task: halfcheetah-medium-expert-v2 | 2 | 10 |
| task: halfcheetah-medium-replay-v2 | 2 | 10 |
| task: halfcheetah-medium-v2 | 2 | 10 |
| Column | walker2d-medium-expert-v2 | walker2d-medium-replay-v2 | hopper-medium-expert-v2 | walker2d-medium-v2 | hopper-medium-replay-v2 | hopper-medium-v2 | halfcheetah-medium-expert-v2 | halfcheetah-medium-replay-v2 | halfcheetah-medium-v2 |
|---|---|---|---|---|---|---|---|---|---|
| State | Finished | Finished | Finished | Finished | Finished | Finished | Finished | Crashed | Finished |
| Notes | - | - | - | - | - | - | - | - | - |
| User | gaochenxiao | gaochenxiao | gaochenxiao | gaochenxiao | gaochenxiao | gaochenxiao | gaochenxiao | gaochenxiao | gaochenxiao |
| Tags | | | | | | | | | |
| Created | | | | | | | | | |
| Runtime | 2d 7h 59m 24s | 2d 9h 3m 38s | 2d 16h 7m 32s | 2d 10h 1m 52s | 2d 16h 4m 12s | 2d 16h 17m 39s | 2d 8h 45m 45s | 2d 8h 18m 33s | 2d 7h 50m 57s |
| Sweep | - | - | - | - | - | - | - | - | - |
| UtilsRL.numpy_fp | numpy.float32 | numpy.float32 | numpy.float32 | numpy.float32 | numpy.float32 | numpy.float32 | numpy.float32 | numpy.float32 | numpy.float32 |
| UtilsRL.precision | float32 | float32 | float32 | float32 | float32 | float32 | float32 | float32 | float32 |
| UtilsRL.torch_fp | torch.float32 | torch.float32 | torch.float32 | torch.float32 | torch.float32 | torch.float32 | torch.float32 | torch.float32 | torch.float32 |
| actor_lr | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 |
| actor_opt_decay_schedule | cosine | cosine | cosine | cosine | cosine | cosine | cosine | cosine | cosine |
| aw_temperature | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |
| batch_size | 256 | 640 | 640 | 640 | 256 | 640 | 640 | 256 | 640 |
| conditioned_logstd | false | false | false | false | false | false | false | false | false |
| critic_q_lr | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 |
| critic_v_lr | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 | 0.0003 |
| debug | false | false | false | false | false | false | false | false | false |
| discount | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 |
| double | true | true | true | true | true | true | true | true | true |
| eval_episode | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| eval_interval | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| hidden_dims | 256 | 256 | 256 | 256 | 256 | 256 | 256 | 256 | 256 |
| log_interval | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| loss_temperature | 2 | 3.5 | 2 | 6 | 2 | 3.5 | 1.5 | 1.5 | 1.5 |
| max_action | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| max_clip | 6 | 6 | 7 | 7 | 7 | 7 | 6 | 6 | 7 |
| max_epoch | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| name | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] | ["consistent_d4rl","tuned_d4rl"] |
| noise_std | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.05 |
| norm_layer | true | true | true | true | true | true | true | true | true |
| normalize_obs | false | false | false | false | false | false | false | false | false |
| normalize_reward | true | true | true | true | true | true | true | true | true |
| num_v_update | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| policy_logstd_min | -5 | -5 | -5 | -5 | -5 | -5 | -5 | -5 | -5 |
| save_interval | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 | 50 |
| scale_random_sample | 0 | 0 | 0.5 | 0 | 0 | 0 | 0 | 0 | 0 |
| seed | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |
| step_per_epoch | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 |
| task | walker2d-medium-expert-v2 | walker2d-medium-replay-v2 | hopper-medium-expert-v2 | walker2d-medium-v2 | hopper-medium-replay-v2 | hopper-medium-v2 | halfcheetah-medium-expert-v2 | halfcheetah-medium-replay-v2 | halfcheetah-medium-v2 |
| tau | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 | 0.005 |
| use_log_loss | false | false | false | false | false | false | false | false | false |
| wandb.entity | lamda-rl | lamda-rl | lamda-rl | lamda-rl | lamda-rl | lamda-rl | lamda-rl | lamda-rl | lamda-rl |
| wandb.project | XQL-D4RL | XQL-D4RL | XQL-D4RL | XQL-D4RL | XQL-D4RL | XQL-D4RL | XQL-D4RL | XQL-D4RL | XQL-D4RL |
| device | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] | ["cuda:0","cuda:1"] |
| Eval/length_mean | 1000 | 721.04 | 887.7 | 987.78 | 938.21 | 662.06 | 1000 | 1000 | 1000 |
| Eval/length_std | 0 | 263.23242 | 157.97109 | 36.66 | 72.71518 | 139.52734 | 0 | 0 | 0 |
| Eval/normalized_score_mean | 110.14885 | 63.65255 | 97.33077 | 83.66085 | 95.26813 | 68.53887 | 88.55594 | 44.05103 | 47.59483 |
| Eval/normalized_score_std | 0.32408 | 25.13288 | 17.60766 | 3.83363 | 7.49088 | 14.41685 | 9.00393 | 2.24956 | 0.59181 |
| loss/actor_loss | -3.02801 | 8.53564 | -2.7614 | -6.10119 | 4.70926 | -1.15047 | -4.09935 | -0.4076 | -5.39199 |
| loss/q_loss | 0.29023 | 1.35656 | 0.16119 | 0.60005 | 1.05782 | 0.22542 | 0.47737 | 0.67087 | 0.38361 |