

5 layers deep, so it has to be learning rates of 1e-6 and 1e-8, with a batch size of 32. I vary the upper limit of the plasticity; on my own machine, I vary the lower limit instead.
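
For reference, a minimal sketch of what that sweep could look like in a wandb-style setup. The parameter names (`lr`, `plasticity_max`, `batch_size`, `n_layers`), the grid of clip values, and the project name are placeholders I'm assuming, not the actual keys or values from these runs.

```python
import wandb

# Hypothetical sweep config mirroring the description above: two learning rates,
# batch size 32, 5 layers, sweeping the upper limit of the plasticity.
sweep_config = {
    "method": "grid",
    "parameters": {
        "lr": {"values": [1e-6, 1e-8]},
        "plasticity_max": {"values": [0.1, 0.5, 1.0, 2.0]},  # placeholder grid
        "batch_size": {"value": 32},
        "n_layers": {"value": 5},
    },
}

def train():
    run = wandb.init()
    cfg = run.config
    # ... build the 5-layer plastic network and train it with cfg.lr,
    # clipping the plasticity to at most cfg.plasticity_max ...
    run.finish()

sweep_id = wandb.sweep(sweep_config, project="plasticity-sweeps")  # project name is made up
wandb.agent(sweep_id, function=train)
```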

[Charts vs. Step (0-3k); two run sets of 10 runs each]

Learning rates of 1e-7 and 1e-9 now, with a batch size of 1. I vary the plasticity clip again.
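
A rough sketch of what I mean by clipping the plasticity, assuming the plastic part of the weights is a Hebbian trace that gets clamped after every update. The update rule here is a generic batch-averaged outer product, not necessarily the exact rule used in these runs; `clip` stands in for the hyperparameter being swept.

```python
import torch

def update_hebb(hebb: torch.Tensor, pre: torch.Tensor, post: torch.Tensor,
                eta: float, clip: float) -> torch.Tensor:
    # Generic Hebbian update: batch-averaged outer product of post- and pre-activations.
    hebb = hebb + eta * torch.einsum("bi,bj->ij", post, pre) / pre.shape[0]
    # The clip being varied in this sweep keeps the plastic trace bounded.
    return hebb.clamp(-clip, clip)
```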

[Run set: 5 runs]



[Run set: 6 runs]




[Run set: 19 runs]

Shoot - now it looks like, compared to backprop, I've got issues with deeper layers. That's sad. Time to try going deeper anyway. I'll start from what works and go from there.

[Run set: 3 runs]

Getting it recurring - I start from what works, first: the vanilla candidate, without the meta-plasticity. Then I try to get meta-plastic networks to the same performance. This is especially hard on progressively harder datasets.
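
To make the comparison concrete, here's how I'd sketch the two variants, loosely following differentiable-plasticity-style layers: in the vanilla version the per-weight plasticity coefficients are fixed, and in the meta-plastic version they are themselves learned parameters. The actual candidate and meta-plasticity rules in these runs may differ; this is only an illustration.

```python
import torch
import torch.nn as nn

class PlasticLinear(nn.Module):
    """One plastic layer; meta=True makes the per-weight plasticity learnable."""

    def __init__(self, n_in: int, n_out: int, meta: bool = False):
        super().__init__()
        self.w = nn.Parameter(0.01 * torch.randn(n_out, n_in))   # slow weights
        alpha = 0.01 * torch.randn(n_out, n_in)                   # plasticity coefficients
        self.alpha = nn.Parameter(alpha) if meta else alpha       # learned only if meta-plastic
        self.register_buffer("hebb", torch.zeros(n_out, n_in))    # fast (plastic) trace

    def forward(self, x: torch.Tensor, eta: float = 0.01) -> torch.Tensor:
        y = torch.tanh(x @ (self.w + self.alpha * self.hebb).t())
        # Hebbian update of the fast trace, batch-averaged over the inputs.
        self.hebb = (1 - eta) * self.hebb + eta * torch.einsum("bi,bj->ij", y, x) / x.shape[0]
        return y
```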

[Run set: 9 runs]


Wait -- does plastic-candidate even converge at all?

[Run set: 2 runs]

Time to do a big sweep to see whether I can get 4 layers to work.

[Run set: 11 runs]

Drilling down further. Still at 4 layers, we've identified that the learning rate pretty much has to be smaller than 1e-5.

[Run set: 14 runs]

Trying 5 layers.

[Run set: 15 runs]

Uhhh, something I did here, more or less at random, brought 5-layer networks back to working again. I worry that this is less robust than backprop: my learning rules seem to get very volatile as either the layer width or the number of layers increases.
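
To pin that volatility down, one thing I can do is log each layer's relative update magnitude per step and watch for spikes as width or depth grows. This is a generic diagnostic sketch, not something from these runs; `prev_params` is assumed to hold clones of the parameters from the previous step.

```python
import torch
import wandb

def log_update_ratios(model: torch.nn.Module, prev_params: dict, step: int) -> None:
    """Log ||delta W|| / ||W|| per parameter tensor; spikes flag volatile layers."""
    ratios = {}
    for name, p in model.named_parameters():
        delta = (p.detach() - prev_params[name]).norm()
        ratios[f"update_ratio/{name}"] = (delta / (prev_params[name].norm() + 1e-12)).item()
        prev_params[name] = p.detach().clone()   # roll the snapshot forward
    wandb.log(ratios, step=step)
```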

[Run set: 16 runs]

OK, full sweep over layer size only.

[Run set: 9 runs]