MAC Network Weight Decay and Learning Rate

Created on September 14 | Last edited on September 15

Results

Even a small weight decay of 1e-5 kills the performance of the MAC network. Inspecting the gradients of the MAC cell makes this clear: after a while, the network stops learning.
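For reference, here is a minimal sketch of the two quantities discussed above: the weight-decay term in an update step, and a gradient-norm summary of the kind one might log for the MAC cell. It assumes plain SGD with coupled L2 decay, which may not match the optimizer used in these runs. The function names and all numeric values are hypothetical and are not taken from the logged experiments.

```python
import math


def sgd_step_with_weight_decay(weights, grads, lr, weight_decay):
    """One plain SGD step with coupled L2 decay (an assumed update rule):
    w <- w - lr * (g + weight_decay * w)
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]


def l2_norm(values):
    """Euclidean norm, a common scalar summary for a module's gradients."""
    return math.sqrt(sum(v * v for v in values))


# Hypothetical toy parameters and gradients for illustration only.
weights = [0.5, -0.25, 1.0]
grads = [0.2, -0.1, 0.0]

# Scalar that could be logged per step to see whether learning has stalled.
grad_norm = l2_norm(grads)

# Apply one update with the weight decay value discussed in this report.
new_weights = sgd_step_with_weight_decay(
    weights, grads, lr=0.1, weight_decay=1e-5
)
```

Logging `grad_norm` per step for the MAC cell parameters is one way to check whether the gradients shrink towards zero over training.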




[Line charts: train/f1, train/accuracy, train/loss]