Fixed New Dataset
Created on June 4 | Last edited on June 8
All of these runs use the new long_range_memory_dataset and my 'candidate' learning rule.
Is one layer or two better?
Run set (8)
Should the hidden size be high or low?
Seems like lower is better for these fast iterations.
Run set (23)
What's the ideal learning rate here?
It really seems like I can keep going smaller.
Run set (41)
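One way to keep pushing the learning rate smaller is to step down a log-spaced grid. The sketch below is only an illustration: the function name `log_spaced_learning_rates` and the 1e-2 to 1e-5 range are hypothetical, not the values used in these runs.

```python
import math


def log_spaced_learning_rates(highest, lowest, per_decade=2):
    """Return learning rates from `highest` down to `lowest`, evenly spaced in log10."""
    if not (0 < lowest <= highest):
        raise ValueError("need 0 < lowest <= highest")
    if per_decade < 1:
        raise ValueError("per_decade must be at least 1")
    # Number of factors of ten between the two ends of the range.
    decades = math.log10(highest) - math.log10(lowest)
    steps = round(decades * per_decade)
    top = math.log10(highest)
    return [10 ** (top - i / per_decade) for i in range(steps + 1)]


# Hypothetical range for illustration only.
candidate_lrs = log_spaced_learning_rates(1e-2, 1e-5, per_decade=1)
```

If the smallest value in the grid still looks best, the same call with a lower `lowest` extends the search by another decade.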
Does my learning rule gain an advantage in deeper networks?
Run set (33)
Actually compare to backprop
Run set (49)
Does backprop really just need higher learning rates?
Run set (28)
Try different decay rates
Run set (28)
Backprop v. Wackprop: Ultimate Matchup
Run set (44)
Heroes
Run set (1965)