[Line chart: avg_loss across the selected run sets]
Did I get the details right? It really looks like the candecay value of 0.99 may be the secret sauce, at least for passing the basic test of repeating a sequence of length 2. Let's see whether that value is really what fixes things.
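To isolate candecay, the cleanest check is to hold everything else fixed and sweep only that one value. Here's a minimal sketch of that ablation: the `train` function is a hypothetical stand-in (a toy quadratic objective with an EMA-smoothed gradient, where `candecay` plays the role of the EMA decay), since the actual model and the real meaning of candecay aren't shown in this report.

```python
# Hypothetical ablation harness: vary only `candecay`, keep lr/steps fixed.
# `train` is a toy stand-in, NOT the real training loop from these runs.

def train(candecay, lr=0.05, steps=500):
    """Toy run: minimize f(x) = x^2 using an EMA-smoothed gradient."""
    x, ema = 5.0, 0.0
    for _ in range(steps):
        grad = 2 * x
        ema = candecay * ema + (1 - candecay) * grad  # EMA of the gradient
        x -= lr * ema
    return x * x  # final "avg_loss" for this setting

# Sweep the decay value while everything else stays identical.
results = {d: train(d) for d in (0.9, 0.99, 0.999)}
for d, loss in sorted(results.items()):
    print(f"candecay={d}: final loss {loss:.3e}")
```

Even on this toy objective the decay value alone changes how fast (and whether) the loss settles, which is the kind of single-variable comparison the run sets above are meant to show.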
Or is it the number of layers?
Of course, my initial test's loss curved upward near the end; I think a different learning rate helped with that.
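A late-training upward curve often means the step size is too large once the loss is small. One common fix (an assumption here, not necessarily what these runs used) is to anneal the learning rate so late updates shrink instead of overshooting; a cosine schedule is a typical choice:

```python
import math

def cosine_lr(step, total_steps, base_lr=3e-4, min_lr=3e-5):
    """Cosine-annealed learning rate from base_lr down to min_lr.

    base_lr/min_lr are illustrative values, not the ones from these runs.
    """
    t = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * t))

print(cosine_lr(0, 1000))     # starts at base_lr (3e-4)
print(cosine_lr(1000, 1000))  # ends at min_lr (3e-5)
```

With a schedule like this, the step size near the end of training is an order of magnitude smaller than at the start, which tends to flatten out that final upward drift.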