[Offline] EDAC
Created on September 28 | Last edited on June 7
Results are averaged over 4 seeds. For each dataset we plot the D4RL normalized score and the raw return.
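All reference scores are taken from the EDAC paper, "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble". EDAC has no previously reported results on Maze2d, so those datasets have no reference scores. A reference score of NaN marks a dataset with no reference value.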
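As a rough illustration of how a normalized score and a seed average relate to raw returns, here is a minimal sketch. It assumes the usual D4RL convention, where 0 corresponds to a random policy and 100 to an expert policy. The function names and all numbers are hypothetical and are not taken from the runs in this report.

```python
from statistics import mean, pstdev


def normalized_score(raw_return, random_return, expert_return):
    """Map a raw return to the D4RL-style scale: random policy -> 0, expert -> 100."""
    return 100.0 * (raw_return - random_return) / (expert_return - random_return)


def summarize_over_seeds(raw_returns, random_return, expert_return):
    """Normalize each seed's return, then report mean and population std across seeds."""
    scores = [normalized_score(r, random_return, expert_return) for r in raw_returns]
    return mean(scores), pstdev(scores)


# Hypothetical final returns for 4 seeds and hypothetical reference returns.
seed_returns = [1500.0, 1700.0, 1600.0, 1600.0]
avg, std = summarize_over_seeds(seed_returns, random_return=100.0, expert_return=3100.0)
print(f"normalized score: {avg:.1f} +/- {std:.1f}")
```

In practice the random and expert reference returns are fixed per environment by D4RL rather than chosen by hand, so only the raw returns vary between runs.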
Locomotion
Maze2d
AntMaze
umaze-v2
Reference score: NaN
Run set (16 runs)
umaze-diverse-v2
Reference score: NaN
Run set (20 runs)
medium-play-v2
Reference score: NaN
Run set (20 runs)
medium-diverse-v2
Reference score: NaN
Run set (24 runs)
large-play-v2
Reference score: NaN
Run set (32 runs)
large-diverse-v2
Reference score: NaN
Run set (36 runs)
Adroit
Pen
Human-v1
Reference score: 52.1
Run set (20 runs)
Cloned-v1
Reference score: 68.2
Run set (28 runs)
Expert-v1
Reference score: NaN
Run set (32 runs)
Door
Human-v1
Reference score: 10.7
Run set (28 runs)
Cloned-v1
Reference score: 9.6
Run set (32 runs)
Expert-v1
Reference score: NaN
Run set (40 runs)
Hammer
Human-v1
Reference score: 0.8
Run set (28 runs)
Cloned-v1
Reference score: 0.3
Run set (32 runs)
Expert-v1
Reference score: NaN
Run set (40 runs)
Relocate
Human-v1
Reference score: 0.1
Run set (28 runs)
Cloned-v1
Reference score: 0.0
Run set (32 runs)
Expert-v1
Reference score: NaN
Run set (40 runs)