Testing different batch sizes
This report compares different batch sizes, implemented with gradient accumulation. The original model, with batch_size = 2, is `learning_curve_338_early_stopping`. The other models are denoted `_{x}_{y}`, where x is the batch size and y is the mini-batch size.
Performance appears to get worse as the batch size increases.
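For readers unfamiliar with gradient accumulation, the idea can be sketched as follows. This is a minimal illustration, not the training code behind these runs: the toy linear model, the data, and the names `grad_mse` and `accumulated_step` are all assumptions made for the example. It splits a batch of size x into mini-batches of size y, averages their gradients, and applies a single update, which should match the update computed from the full batch at once.

```python
def grad_mse(w, b, xs, ys):
    """Gradient of the mean squared error of y ~ w*x + b over the given samples."""
    n = len(xs)
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return gw, gb


def accumulated_step(w, b, xs, ys, mini_batch_size, lr):
    """One parameter update over an effective batch built from mini-batches.

    Hypothetical helper: averaging the mini-batch gradients (rather than
    summing them) is an assumption of this sketch.
    """
    batch_size = len(xs)
    if batch_size % mini_batch_size != 0:
        raise ValueError("batch size must be a multiple of the mini-batch size")
    n_chunks = batch_size // mini_batch_size

    acc_w, acc_b = 0.0, 0.0
    for start in range(0, batch_size, mini_batch_size):
        stop = start + mini_batch_size
        gw, gb = grad_mse(w, b, xs[start:stop], ys[start:stop])
        # Scale each mini-batch gradient so the total is the batch average.
        acc_w += gw / n_chunks
        acc_b += gb / n_chunks

    # Only one update is applied, after all mini-batches have been seen.
    return w - lr * acc_w, b - lr * acc_b


# Toy data (made up for the illustration): batch size x = 8.
xs = [float(i) for i in range(8)]
ys = [3.0 * x + 1.0 for x in xs]

full = accumulated_step(0.0, 0.0, xs, ys, mini_batch_size=8, lr=0.01)
accumulated = accumulated_step(0.0, 0.0, xs, ys, mini_batch_size=2, lr=0.01)
print(full, accumulated)
```

With equally sized mini-batches, the two updates agree up to floating-point error, so under this scheme the mini-batch size y mainly affects memory use while x sets the effective batch size. Layers whose behavior depends on the mini-batch itself, such as batch normalization, would break that equivalence in a real model.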
Created on January 10|Last edited on January 10