Negatives from Experience Replay vs Negatives from Expert

The experience replay buffer has size 10. A fixed subset of size 10 is chosen from the expert dataset.
Created on March 12 | Last edited on March 12
Environment reward is plotted as AverageEnvEpRet.
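
As a rough illustration of the two negative sources being compared, the sketch below builds a replay buffer capped at 10 items and a fixed subset of 10 items drawn once from an expert dataset. The function names, the seed, and the use of integers as stand-ins for transitions are all assumptions made for illustration; the report does not say how the buffer or subset is implemented, or how the negatives are used in training.

```python
import random
from collections import deque

REPLAY_SIZE = 10         # from the report: experience replay holds 10 items
EXPERT_SUBSET_SIZE = 10  # from the report: fixed expert subset of size 10


def make_replay_buffer(size=REPLAY_SIZE):
    """Hypothetical helper: a FIFO buffer that keeps only the newest `size` items."""
    return deque(maxlen=size)


def fixed_expert_subset(expert_dataset, k=EXPERT_SUBSET_SIZE, seed=0):
    """Hypothetical helper: draw k items once, reproducibly, so the subset stays fixed."""
    rng = random.Random(seed)  # the seed value is an assumption
    return rng.sample(list(expert_dataset), k)


def sample_negatives(source, n, rng):
    """Hypothetical helper: draw n negatives (with replacement) from either source."""
    items = list(source)
    return [rng.choice(items) for _ in range(n)]


# Integers stand in for transitions or trajectories.
replay = make_replay_buffer()
for step in range(25):
    replay.append(step)  # older items are evicted once the buffer is full

expert_subset = fixed_expert_subset(range(1000))

rng = random.Random(1)
negatives_from_replay = sample_negatives(replay, 4, rng)
negatives_from_expert = sample_negatives(expert_subset, 4, rng)
```

The only difference between the two settings in this sketch is which collection is passed to `sample_negatives`; the replay buffer changes as training proceeds, while the expert subset does not.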

[Six line plots, x-axis: Step (0 to 400). Y-axis ranges, in order: -200 to -100; 50 to 150; -250 to 0; -800 to -200; -200 to 100; -800 to -200. Panel titles and curves were not preserved.]
Run set: 329