lr, plr, and plast_clip are swept here.
Now the code is modified to use `torch.exp(plasticity)` instead of just `plasticity`.
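A minimal sketch of what that change does, assuming `plasticity` is a learned real-valued tensor used as a coefficient. The values below are made up for illustration, and how the coefficient enters the actual model is not shown here. Passing it through `torch.exp` keeps the effective coefficient strictly positive, and its gradient with respect to the raw parameter equals the coefficient itself:

```python
import torch

# Hypothetical raw parameter values; the notes only name the tensor `plasticity`.
plasticity = torch.tensor([-2.0, 0.0, 1.5], requires_grad=True)

# Reparameterized coefficient: strictly positive for any real input.
effective = torch.exp(plasticity)

# d/dp exp(p) = exp(p), so the gradient on the raw parameter
# scales with the size of the effective coefficient.
effective.sum().backward()

print(effective.detach())
print(plasticity.grad)
```

One consequence worth keeping in mind when reading the sweep: a fixed step in the raw parameter now changes the effective coefficient multiplicatively, so the useful ranges for plr and plast_clip may shift relative to the runs that used plain `plasticity`.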
Now for the smaller models, for speed and visibility.