
[Offline] SAC-N

Created on September 27 | Last edited on June 7
Results are averaged over 4 seeds. For each dataset we plot the D4RL normalized score and the return.
All reference scores are taken from "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble". Maze2d has not been tested before, so it has no reference scores.
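As a rough illustration of how a normalized score averaged over seeds can be computed, here is a minimal sketch. It uses the standard D4RL convention, in which a random policy maps to 0 and an expert policy maps to 100. The function names and the reference returns (`random_ret`, `expert_ret`) are hypothetical placeholders, not the actual D4RL values for any dataset, and this is not necessarily how the runs in this report were evaluated.

```python
from statistics import mean, stdev


def normalized_score(ret, random_ret, expert_ret):
    """D4RL-style normalization: 0 for a random policy, 100 for an expert."""
    return 100.0 * (ret - random_ret) / (expert_ret - random_ret)


def aggregate_over_seeds(returns, random_ret, expert_ret):
    """Normalize the final return of each seed, then report mean and std."""
    scores = [normalized_score(r, random_ret, expert_ret) for r in returns]
    return mean(scores), stdev(scores)


# Hypothetical final returns for 4 seeds of one dataset (illustrative only).
seed_returns = [50.0, 100.0, 150.0, 100.0]
avg, std = aggregate_over_seeds(seed_returns, random_ret=0.0, expert_ret=200.0)
print(f"normalized score: {avg:.1f} +/- {std:.1f}")
```

Normalizing each seed before averaging gives the same mean as normalizing the averaged return, because the transformation is affine, but it also yields a spread across seeds on the normalized scale.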

Locomotion

Maze2d

AntMaze

umaze-v2

Reference score: NaN



umaze-diverse-v2

Reference score: NaN



medium-play-v2

Reference score: NaN



medium-diverse-v2

Reference score: NaN



large-play-v2

Reference score: NaN



large-diverse-v2

Reference score: NaN



Adroit

Pen

Human-v1

Reference score: 9.5



Cloned-v1

Reference score: 64.1



Expert-v1

Reference score: NaN



Door

Human-v1

Reference score: -0.3



Cloned-v1

Reference score: -0.3



Expert-v1

Reference score: NaN




Hammer

Human-v1

Reference score: 0.3



Cloned-v1

Reference score: 0.2



Expert-v1

Reference score: NaN




Relocate

Human-v1

Reference score: -0.1



Cloned-v1

Reference score: 0.0



Expert-v1

Reference score: NaN
