[Offline] EDAC

Created on September 28 | Last edited on June 7
Results are averaged over 4 seeds. For each dataset we plot the D4RL normalized score and the return.
All reference scores are taken from the paper "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble". Maze2d was not tested previously, so it has no reference scores. For the other datasets, a reference score of NaN indicates that no reference score is listed.
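
As a rough illustration of how a normalized score averaged over seeds can be computed, here is a minimal sketch. It assumes the usual D4RL convention, in which a raw return is rescaled so that a random policy maps to 0 and an expert policy maps to 100. The function name `normalized_score`, the constants `RANDOM_RET` and `EXPERT_RET`, and the per-seed returns are hypothetical placeholders, not values from this report or from D4RL.

```python
from statistics import mean, stdev


def normalized_score(ret, random_ret, expert_ret):
    """Rescale a raw return so that random -> 0 and expert -> 100."""
    return 100.0 * (ret - random_ret) / (expert_ret - random_ret)


# Hypothetical placeholder values, NOT taken from the report or from D4RL.
RANDOM_RET = 0.0
EXPERT_RET = 200.0

# One final evaluation return per seed (4 seeds, as in the report).
seed_returns = [50.0, 100.0, 150.0, 100.0]

# Normalize each seed separately, then aggregate across seeds.
seed_scores = [normalized_score(r, RANDOM_RET, EXPERT_RET) for r in seed_returns]
mean_score = mean(seed_scores)
std_score = stdev(seed_scores)  # sample standard deviation across seeds

print(f"normalized score: {mean_score:.1f} +/- {std_score:.1f}")
```

Because the normalization is affine, normalizing each seed and then averaging gives the same mean as averaging the raw returns first and normalizing once. The actual per-environment random and expert returns would come from the benchmark itself rather than from constants like the ones above.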

Locomotion

Maze2d

AntMaze

umaze-v2

Reference score: NaN

Run set: 16 runs


umaze-diverse-v2

Reference score: NaN

Run set: 20 runs


medium-play-v2

Reference score: NaN

Run set: 20 runs


medium-diverse-v2

Reference score: NaN

Run set: 24 runs


large-play-v2

Reference score: NaN

Run set: 32 runs


large-diverse-v2

Reference score: NaN

Run set: 36 runs


Adroit

Pen

Human-v1

Reference score: 52.1

Run set: 20 runs


Cloned-v1

Reference score: 68.2

Run set: 28 runs


Expert-v1

Reference score: NaN

Run set: 32 runs


Door

Human-v1

Reference score: 10.7

Run set: 28 runs


Cloned-v1

Reference score: 9.6

Run set: 32 runs


Expert-v1

Reference score: NaN

Run set: 40 runs



Hammer

Human-v1

Reference score: 0.8

Run set: 28 runs


Cloned-v1

Reference score: 0.3

Run set: 32 runs


Expert-v1

Reference score: NaN

Run set: 40 runs



Relocate

Human-v1

Reference score: 0.1

Run set: 28 runs


Cloned-v1

Reference score: 0.0

Run set: 32 runs


Expert-v1

Reference score: NaN

Run set: 40 runs