No Recurrent Connection
Remove the recurrent connection. Try to get the RNN running anyway, using weight updates for short-term memory. The plasticity values ought to be instrumental here: some weights should be more plastic than others, and therefore change quickly enough to store information.
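Roughly what I have in mind, as a sketch: a slow weight plus a fast Hebbian trace, gated element-wise by a learned per-weight plasticity. The layer name, the tanh nonlinearity, and the trace update rule below are my own placeholders, not necessarily the exact rule in these runs.

```python
import torch
import torch.nn as nn

class PlasticLinear(nn.Module):
    """Feed-forward layer whose weights carry short-term memory (sketch).

    Effective weight = slow W + alpha * fast trace H. Weights with large
    alpha change quickly and can store recent inputs; weights with small
    alpha stay stable. No recurrent connection anywhere.
    """

    def __init__(self, in_features, out_features, eta=0.1):
        super().__init__()
        self.W = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.alpha = nn.Parameter(torch.zeros(out_features, in_features))  # per-weight plasticity
        self.eta = eta  # update rate of the fast trace
        self.register_buffer("H", torch.zeros(out_features, in_features))

    def forward(self, x):
        # x: (batch, in_features)
        y = x @ (self.W + self.alpha * self.H).t()
        # Hebbian update of the fast trace from pre/post activity.
        with torch.no_grad():
            outer = torch.einsum("bo,bi->oi", torch.tanh(y), x) / x.shape[0]
            self.H = (1 - self.eta) * self.H + self.eta * outer
        return y
```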
Created on July 17 | Last edited on July 25
Just look at the whole sweep first: it's over learning rate, plasticity learning rate, and batch size. I just want to catch the high performers here, the runs that reach the lowest loss the fastest.
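For reference, the sweep is shaped roughly like this. This is a hedged sketch of a W&B sweep config: the value ranges, the project name, and the `train` entry point are placeholders, not the actual settings behind these runs.

```python
import wandb

# Placeholder sweep over the three knobs discussed in this report.
sweep_config = {
    "method": "random",
    "metric": {"name": "loss", "goal": "minimize"},
    "parameters": {
        "learning_rate": {"values": [1e-4, 3e-4, 1e-3, 3e-3]},
        "plasticity_learning_rate": {"values": [1e-4, 1e-3, 1e-2, 1e-1]},
        "batch_size": {"values": [1, 8, 32, 128]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="no-recurrent-connection")
# wandb.agent(sweep_id, function=train)  # train() would be the usual training entry point
```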
[Run set: 81 runs]
Which batch size seems to do best? And is it really the batch size that I care about?
[Run set: 81 runs]
I really don't want learning rate to be the defining factor here, but it seems like it wants to be. That just gets us to the local minimum of repeating the same input character back, and no memory is really needed for that, unfortunately.
[Run set: 81 runs]
I'd like to see this have a bigger influence. The plasticity learning rate ought to define how much the plasticity values truly separate short-term memory connections from long-term ones.
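One way this knob gets wired in, sketched on top of the `PlasticLinear` layer above: give the plasticity values their own optimizer parameter group. The split and the specific rates are my illustration, not necessarily how these runs do it.

```python
import torch

model = PlasticLinear(64, 64)  # layer from the sketch above; 64 is a placeholder size

# Separate learning rates: one for the ordinary weights, one for the
# plasticity values, so the plasticity learning rate alone controls how
# sharply short-term and long-term connections separate.
optimizer = torch.optim.Adam([
    {"params": [model.W], "lr": 1e-3},      # learning rate
    {"params": [model.alpha], "lr": 1e-2},  # plasticity learning rate
])
```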
[Run set: 81 runs]
I want to know more about batch size 1. It's probably what's going to let me really encode memory in the weights.
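The intuition, sketched with batch size 1 (the vocabulary size, the data stream, and the optimizer choice are placeholders): every single character triggers a weight update before the next character arrives, so the update itself carries recent context forward.

```python
import torch

vocab_size = 50  # placeholder vocabulary size
model = PlasticLinear(vocab_size, vocab_size)  # layer from the sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = torch.nn.CrossEntropyLoss()

# stream_of_characters is a placeholder: an iterable of (one-hot input, target index) pairs.
for x_t, y_t in stream_of_characters:
    logits = model(x_t.unsqueeze(0))           # batch of exactly one sample
    loss = loss_fn(logits, y_t.unsqueeze(0))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                           # this step itself is the short-term memory
```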
[Run set: 21 runs]
Try some more drastic hyperparameter values.
[Run set: 35 runs]
Add some noise.
[Run set: 10 runs]
Add some noise, but include a different sort of initialization.
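A sketch of what these two variants could look like on the `PlasticLinear` layer above. The noise level, the orthogonal re-initialization, and the helper name are all assumptions on my part, not the exact recipe behind these runs.

```python
import torch
import torch.nn as nn

def init_and_perturb(layer, noise_std=0.01, orthogonal=False):
    """Hypothetical helper: optionally re-initialize the slow weights
    (orthogonal instead of small Gaussian) and add Gaussian noise to them."""
    with torch.no_grad():
        if orthogonal:
            nn.init.orthogonal_(layer.W)  # the "different sort of initialization"
        layer.W.add_(torch.randn_like(layer.W) * noise_std)  # "add some noise"
```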
[Run set: 10 runs]
Only clip the weights.
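Sketch of the clipping-only variant, again on the layer from above; the clip range is a placeholder.

```python
import torch

# After each optimizer step, clamp the weights into a fixed range instead of
# adding noise or any other regularization. model is a PlasticLinear instance.
clip = 1.0
with torch.no_grad():
    model.W.clamp_(-clip, clip)  # bound the slow weights
    model.H.clamp_(-clip, clip)  # optionally bound the fast trace too
```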
[Run set: 126 runs]