Is there a difference in results if we vary the Human Action Buffer size?
Here, we compare:
Alpha Parameters (1.0, 0.9, and 0.8) where the Human Action Buffer and Robot Action Buffer are the same size (100,000)
Alpha Parameters (1.0, 0.9, and 0.8) where the Human Action Buffer (10,000) is 10% of the Robot Action Buffer size (100,000)
Alpha Parameters (1.0, 0.9, and 0.8) where the Human Action Buffer (1,000) is 1% of the Robot Action Buffer size (100,000)
In the last case, on average, one experience is added to the Human Action Buffer for every 100 experiences added to the Robot Action Buffer. The 1:100 ratio of buffer sizes therefore matches the ratio at which experiences arrive in the two buffers.
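The effect of that ratio on how quickly each buffer fills can be sketched as follows. This is a minimal illustration rather than the training code behind these runs: the function names `fill_buffers` and `fill_fraction` are hypothetical, the buffers are modeled as plain FIFO queues, and the human experiences are assumed to arrive at exactly 1 per 100 robot experiences instead of only on average.

```python
from collections import deque

ROBOT_CAPACITY = 100_000
HUMAN_PER_ROBOT = 100  # assumption: exactly 1 human experience per 100 robot experiences


def fill_buffers(robot_steps, human_capacity, robot_capacity=ROBOT_CAPACITY):
    """Simulate adding experiences to two FIFO buffers (hypothetical helper)."""
    robot_buffer = deque(maxlen=robot_capacity)
    human_buffer = deque(maxlen=human_capacity)
    for step in range(1, robot_steps + 1):
        robot_buffer.append(step)
        # After every 100th robot experience, one human experience arrives.
        if step % HUMAN_PER_ROBOT == 0:
            human_buffer.append(step)
    return robot_buffer, human_buffer


def fill_fraction(buffer):
    """Fraction of the buffer's capacity that is currently occupied."""
    return len(buffer) / buffer.maxlen


if __name__ == "__main__":
    # Compare the three Human Action Buffer sizes after 50,000 robot experiences.
    for human_capacity in (100_000, 10_000, 1_000):
        robot, human = fill_buffers(50_000, human_capacity)
        print(human_capacity, fill_fraction(robot), fill_fraction(human))
```

Under these assumptions, only the 1,000-entry Human Action Buffer fills at the same relative rate as the Robot Action Buffer. The larger human buffers fill more slowly, so once full they hold human experiences spanning a much longer stretch of training.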
Where Human Action Buffer is 10% of Robot Action Buffer Size:
[Line chart: average_score]
Where Human Action Buffer is 1% of Robot Action Buffer Size. This is proportional to the number of experiences added to the Robot Action Buffer: