Grid Search for Mini-Batch Gradient Descent
This report contains the runs for a grid search over different mini-batch sizes and learning rates. We use the following values for the two hyperparameters; a rough sketch of the grid-search driver follows the lists below.
lr_list = [0.0001, 0.001, 0.01, 0.1]
batch_size_list = [1, 8, 16, 32, 64]
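As a rough sketch of how this grid might be driven (not necessarily the exact code behind these runs), one W&B run could be launched per configuration; train_one_config and the project name are illustrative placeholders.

import itertools
import wandb

lr_list = [0.0001, 0.001, 0.01, 0.1]
batch_size_list = [1, 8, 16, 32, 64]

for lr, batch_size in itertools.product(lr_list, batch_size_list):
    # One W&B run per (learning rate, batch size) pair: 4 x 5 = 20 runs in total.
    run = wandb.init(
        project="minibatch-gd-grid-search",          # illustrative project name
        config={"lr": lr, "batch_size": batch_size},
    )
    train_one_config(lr=lr, batch_size=batch_size)   # hypothetical training routine, sketched later in this report
    run.finish()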
Section 1
Learning curves for the different configurations. You can zoom in on the individual plots to see how each configuration performs; a sketch of the per-epoch logging that would produce such curves follows the panels below.
[Panels: learning curves per (learning rate, batch size) configuration — runs from a private project, not shown in this report.]
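The curves in these panels would typically come from logging the training loss once per epoch inside each run. A minimal sketch of such a train_one_config body, with a decaying placeholder standing in for the real loss:

import math
import wandb

def train_one_config(lr, batch_size, epochs=50):
    # Hypothetical trainer body: log the loss once per epoch so that
    # W&B draws one learning curve per run in the panels above.
    for epoch in range(epochs):
        loss = math.exp(-0.1 * epoch)                # placeholder; a real run computes this from the model and data
        wandb.log({"epoch": epoch, "train_loss": loss})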
These are the grid-search runs for mini-batch gradient descent. Larger batch sizes take less total training time and smaller batch sizes take longer: with a fixed dataset, a larger batch means fewer parameter updates per epoch, so the per-update overhead adds up to less wall-clock time. A small self-contained check of this effect follows the panels below.
[Panels: total training time per configuration — runs from a private project, not shown in this report.]
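A toy least-squares example that illustrates the timing trend (the dataset size, dimensionality, and learning rate are assumed purely for illustration): the number of parameter updates per epoch is ceil(N / batch_size), so it shrinks as the batch size grows.

import math
import time
import numpy as np

N, D = 10_000, 20                                    # assumed dataset size and feature count
X = np.random.randn(N, D)
y = np.random.randn(N)

for batch_size in [1, 8, 16, 32, 64]:
    w = np.zeros(D)
    steps = math.ceil(N / batch_size)                # updates per epoch shrink as the batch grows
    start = time.perf_counter()
    for i in range(steps):
        xb = X[i * batch_size:(i + 1) * batch_size]
        yb = y[i * batch_size:(i + 1) * batch_size]
        grad = 2.0 * xb.T @ (xb @ w - yb) / len(xb)  # mean-squared-error gradient on the mini-batch
        w -= 0.01 * grad                             # illustrative learning rate
    elapsed = time.perf_counter() - start
    print(f"batch_size={batch_size:>2}: {steps:>5} updates/epoch, {elapsed:.3f} s per epoch")

On most machines the printed per-epoch time should drop noticeably as the batch size grows, which is the same trend the panels above show.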