Allow the plasticity to reach as low as -plast_clip. In this case, plast_clip is 1e2.
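A rough sketch of what this clipping might look like, for reference. Only plast_clip (1e2) and the 1e-15 lower cap come from these notes; the function name, the allow_negative flag, and the assumption that the upper bound is also plast_clip are hypothetical and may not match the actual training code.

```python
def clip_plasticity(value, plast_clip=1e2, allow_negative=True, floor=1e-15):
    """Clamp a single plasticity value to the allowed range.

    Hypothetical helper: the real code may clip tensors and may use a
    different upper bound. Assumes the upper bound is +plast_clip.
    """
    # With negative plasticity enabled, the lower bound is -plast_clip;
    # otherwise fall back to the small positive lower cap.
    lower = -plast_clip if allow_negative else floor
    return max(lower, min(value, plast_clip))


# Negative plasticity allowed: values bottom out at -plast_clip.
negative_run = clip_plasticity(-500.0)

# Lower cap restored: values bottom out at the floor instead.
capped_run = clip_plasticity(-500.0, allow_negative=False)
```

Switching allow_negative off corresponds to going back to the positive lower cap, so the same helper would cover both kinds of run.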
Back to the lower cap for plasticity being 1e-15. Really just testing whether the decay value can be higher than the plasticity.
Have the decay match the lr.
Compare.
I went on to define a larger gap between the lr and the decay.
try:
very small layer size - vsmalllayer
much larger layer size - vlargelayer
larger decay rate - largedecay
incorporate negative plasticity again - negplast
tinker with a significantly smaller plasticity