Permuted MNIST CNN

Created on October 26 | Last edited on October 26
﻿
Best results:
[Plots, first 10 runs shown, x-axis Step 0 to 20: Valid loss (y ticks 0.08 to 0.18), Train acc (0.88 to 0.98), Train loss (0.1 to 0.4), Valid acc (0.94 to 0.98)]
Run set: 578
﻿
Observations:
Run set: 502
﻿
Both strides give good accuracy, but stride 3 comes out best: it is the most common stride among the top 20 runs by accuracy.
The number of channels does not matter much, but the maximum accuracy is still reached with a high number of channels (~100).
The best learning rates are far below 0.1.
A bigger kernel size captures more information.
Almost all dropout values are workable; around 0.3 is best.
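
To make the stride and kernel-size observations more concrete, here is a minimal sketch of how these two settings change the size of the feature map a convolution produces on a 28x28 MNIST image. It uses the standard output-size formula for a convolution without dilation. The specific strides and kernel sizes listed are assumptions for illustration; the report does not state the exact values that were swept, and `conv_output_size` is a hypothetical helper, not part of the original experiments.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Output length along one spatial dimension of a convolution (no dilation)."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# MNIST images are 28x28. The stride and kernel values below are
# illustrative assumptions, not the grid used in the report.
INPUT_SIZE = 28

for stride in (2, 3):
    for kernel_size in (3, 5, 7):
        out = conv_output_size(INPUT_SIZE, kernel_size, stride)
        print(f"stride={stride} kernel={kernel_size} -> {out}x{out} feature map")
```

A larger stride shrinks the feature map faster, and a larger kernel lets each output position see more input pixels at once, which may matter more than usual when neighbouring pixels are no longer related.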
﻿
Comparison with MLP: both are able to get good results.
Permuting the pixels destroys most of the spatial information a human would rely on. However, since every image is permuted in the same way, each feature is mapped consistently to some other feature, which may no longer be spatially connected.
For example, the familiar shape of a '7' is lost, but because every '7' is transformed in the same way, the CNN can still extract its features.
The MLP performs well from the start; its accuracy is the same as on normal MNIST.
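
The idea of a single fixed permutation shared by all images can be sketched in a few lines. This is an illustration only, using a toy 16-pixel "image" and the standard library; the helper names (`make_permutation`, `permute_image`, `invert_permutation`) and the seed are hypothetical and are not taken from the original experiments.

```python
import random


def make_permutation(num_pixels, seed=0):
    """Build one fixed pixel ordering; the same seed always gives the same ordering."""
    rng = random.Random(seed)
    perm = list(range(num_pixels))
    rng.shuffle(perm)
    return perm


def permute_image(flat_image, perm):
    """Reorder a flattened image so that output position j holds input pixel perm[j]."""
    return [flat_image[i] for i in perm]


def invert_permutation(perm):
    """Return the ordering that undoes `perm`."""
    inverse = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        inverse[old_pos] = new_pos
    return inverse


# Toy 4x4 "image" flattened to 16 values (real MNIST would use 28 * 28 = 784).
image = list(range(16))
perm = make_permutation(len(image), seed=0)

permuted = permute_image(image, perm)
restored = permute_image(permuted, invert_permutation(perm))

# No pixel values are lost, only their positions change, and the change is reversible.
assert sorted(permuted) == sorted(image)
assert restored == image
```

Because the mapping is a fixed, invertible reordering, a fully connected first layer can absorb it by reordering its input weights in the same way, which is consistent with the MLP matching its accuracy on normal MNIST. A CNN has no such shortcut, since its kernels assume that neighbouring pixels are related.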
﻿