Created on June 28|Last edited on July 3
Five layers deep, so it's got to be learning rates of 1e-6 and 1e-8, with a batch size of 32. I vary the upper limit of the plasticity; on my own machine, I vary the lower limit instead.
Run set (10 runs)
Run set (10 runs)
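As a rough sketch of what "varying the upper and lower limit of the plasticity" means here, the snippet below clamps per-synapse plasticity coefficients to a configurable range. The names `plasticity_min` / `plasticity_max` and the example sweep values are my assumptions, not the exact code behind these runs.

```python
import torch

def clamp_plasticity(plasticity: torch.Tensor,
                     plasticity_min: float,
                     plasticity_max: float) -> torch.Tensor:
    """Keep the plasticity coefficients inside the configured [min, max] limits."""
    return plasticity.clamp(min=plasticity_min, max=plasticity_max)

# The runs above then vary one end of the range at a time, e.g.:
upper_limits = [0.1, 0.5, 1.0]    # swept on the cluster (placeholder values)
lower_limits = [-1.0, -0.5, 0.0]  # swept locally (placeholder values)
```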
1e-7 and 1e-9 now, with a batch size of one. I vary the plasticity clip again.
Run set (5 runs)
Run set (6 runs)
Run set (19 runs)
Shoot, now it looks like I've got issues with deep layers compared to backprop. That's sad. Time to try going deeper: I'll start from what works and build up from there.
Run set (3 runs)
Getting recurrence working. I start from what works first: the vanilla candidate, without meta-plasticity. Then I try to bring meta-plastic networks up to the same performance, which gets especially hard on progressively harder datasets.
Run set (9 runs)
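For reference, here is a minimal sketch of the vanilla-vs-meta-plastic distinction, in the style of differentiable plasticity: slow weights plus a fast Hebbian trace scaled by a learned plasticity gain. The class name, the tanh nonlinearity, and the trace update rule are illustrative assumptions, not necessarily the exact rule behind these runs.

```python
import torch
import torch.nn as nn

class PlasticLinear(nn.Module):
    """One layer with slow weights w, a fast Hebbian trace, and a learned gain alpha.

    With learn_alpha=False (alpha frozen at zero) this reduces to a plain
    non-plastic layer, i.e. the vanilla candidate.
    """
    def __init__(self, in_dim, out_dim, learn_alpha=True, eta=0.01):
        super().__init__()
        self.w = nn.Parameter(0.01 * torch.randn(in_dim, out_dim))
        self.alpha = nn.Parameter(torch.zeros(in_dim, out_dim), requires_grad=learn_alpha)
        self.eta = eta  # decay / update rate of the fast Hebbian trace

    def forward(self, x, hebb):
        # Effective weight = slow weight + plasticity-scaled Hebbian trace.
        y = torch.tanh(x @ (self.w + self.alpha * hebb))
        # Hebbian update: running average of the batch-mean outer product of pre/post activity.
        hebb = (1 - self.eta) * hebb + self.eta * torch.einsum('bi,bj->ij', x, y) / x.shape[0]
        return y, hebb
```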
Wait, does plastic-candidate even converge at all?
Run set (2 runs)
Time to do a big sweep to find out whether I can get 4 layers to work.
Run set (11 runs)
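The sweep itself is roughly of this shape, expressed as a W&B sweep config. The parameter names and value grids below are placeholders, not the actual configuration behind these runs.

```python
import wandb

# Hypothetical sweep config for the 4-layer search.
sweep_config = {
    "method": "grid",
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {
        "n_layers": {"value": 4},
        "lr": {"values": [1e-4, 1e-5, 1e-6, 1e-7]},
        "plasticity_clip": {"values": [0.1, 0.5, 1.0]},
    },
}

# sweep_id = wandb.sweep(sweep_config, project="meta-plasticity")  # project name is a placeholder
# wandb.agent(sweep_id, function=train)  # train() is the existing training entry point
```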
Drilling down further. Still at 4 layers, we've established that it pretty much has to have a learning rate smaller than 1e-5.
Run set (14 runs)
Trying 5 layers.
Run set (15 runs)
Hmm, something I randomly did here brought 5-layer networks back to working again. I worry that it's less robust than backprop: my learning rules seem to get very volatile as the layer size or the number of layers increases.
Run set (16 runs)
OK, full sweep over layer size only.
Run set (9 runs)
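The layer-size-only sweep is the same idea with a single parameter; the widths below are placeholder values, not the ones actually swept here.

```python
layer_size_sweep = {
    "method": "grid",
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {
        "hidden_size": {"values": [32, 64, 128, 256, 512]},  # placeholder widths
    },
}
```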