These all use the new long_range_memory_dataset and my 'candidate' learning rule.
Is one layer or two better?
Should it have a high or low hidden size?
Lower seems better for these fast iterations.
What's the ideal learning rate here?
It really seems like I can keep going smaller.
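A quick way to check that hunch is a small learning-rate sweep and a comparison of final losses. This is just a minimal sketch: `train` below is a stand-in toy objective (a 1-D quadratic), not the real model or the candidate rule, and the rate grid is made up.

```python
# Hypothetical stand-in for a training run: plain gradient descent
# on loss = w**2, returning the final loss (an avg_loss analogue).
def train(lr: float, steps: int = 200) -> float:
    w = 5.0                      # start far from the optimum at 0
    for _ in range(steps):
        grad = 2.0 * w           # gradient of w**2
        w -= lr * grad
    return w * w

# Sweep a few rates, going smaller each step, and pick the best.
results = {lr: train(lr) for lr in (0.3, 0.1, 0.03, 0.01)}
best_lr = min(results, key=results.get)
```

On a real run the sweep loop stays the same; only `train` changes to launch the actual experiment and log `avg_loss`.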
Does my learning rule gain an advantage in deeper networks?
Actually compare against backprop this time.
Does backprop really just need higher learning rates?
Try different decay rates
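One way to structure that: fix the initial rate and sweep the decay factor. Again a hedged sketch on the same toy quadratic, assuming an exponential learning-rate decay schedule (one of several plausible choices; the real runs may decay something else, e.g. weight decay).

```python
# Toy comparison of exponential lr-decay factors on loss = w**2.
def train_with_decay(lr0: float, decay: float, steps: int = 200) -> float:
    w = 5.0
    for t in range(steps):
        lr = lr0 * (decay ** t)  # exponential lr decay per step
        w -= lr * 2.0 * w        # gradient of w**2
    return w * w

# decay=1.0 means no decay; smaller factors shut learning off sooner.
losses = {d: train_with_decay(0.1, d) for d in (1.0, 0.99, 0.95, 0.9)}
```

The dict makes the trade-off visible at a glance: aggressive decay freezes the toy problem before it converges, which is the failure mode worth watching for in the real sweep.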
Backprop v. Wackprop: Ultimate Matchup
Heroes