This report presents the results of the alpha3 hyperparameter tuning, in which the λ learning rate was decayed linearly and set equal to the actor learning rate.
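To make the schedule concrete, here is a minimal sketch of a linearly decaying learning rate. It is an illustration under stated assumptions, not the code used for these runs: the function name `linear_decay`, the numeric value of `actor_lr`, the step count, and the decay target of zero are all hypothetical, and the sketch assumes that "set equal to the actor learning rate" refers to the starting value of the λ learning rate.

```python
def linear_decay(initial_lr: float, step: int, total_steps: int,
                 final_lr: float = 0.0) -> float:
    """Linearly interpolate from initial_lr at step 0 to final_lr at total_steps.

    Steps outside [0, total_steps] are clamped to the endpoints.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    frac = min(max(step / total_steps, 0.0), 1.0)
    return initial_lr + frac * (final_lr - initial_lr)


# Hypothetical values, chosen only for illustration.
actor_lr = 3e-4
lambda_lr_start = actor_lr  # assumption: the lambda learning rate starts at the actor value
total_steps = 100

# Lambda learning rate at the start, midpoint, and end of training.
schedule = [linear_decay(lambda_lr_start, s, total_steps) for s in (0, 50, 100)]
```

With these assumed values, the λ learning rate starts at the actor learning rate, reaches half of it at the midpoint, and ends at zero.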
CartPole
[Line chart: AverageTestEpRet]
[Line chart: AverageLambda]