
ReLU/tanh/Linear 4-->8; 8-->16 Runs

Created on November 4|Last edited on November 11

Overall results

Results obtained over 10 runs for the 4 most interesting run-type combinations.
Experimental settings: Adam with default settings; minibatches of size 32; training for 100 + 100 epochs (pretraining, then training after expansion). All initial parameters are sampled from N(0, 1).
Types of initialization (a small amount of noise is added to every matrix to break ties):
  • Random: Sample new matrix from N(0, 1)
  • Random adjusted: Sample from N(\mu, \sigma), where \mu and \sigma are the mean and standard deviation of the parent matrix
  • Permuted: Randomly shuffle the entries of the parent matrix
  • Copy: Copy the parent matrix
Expansion: 4-->8 or 8-->16

[Figure: validation accuracy (y-axis) vs. epoch (x-axis). Each panel plots val_accuracy_pretrain and val_accuracy_expand for the following run types:
  • U[copy, permute; FR:False] W[copy, permute; FR:False]
  • U[copy, random_adjusted; FR:False] W[copy, random_adjusted; FR:False]
  • U[copy, random; FR:False] W[copy, random; FR:False]
  • U[copy, copy; FR:False] W[copy, copy; FR:False]
Panels (with the count listed beside each title on the page):
  • 4 --> 8 (tanh): 50
  • 4 --> 8 (relu): 50
  • 4 --> 8 (linear): 50
  • 8 --> 16 (tanh): 50
  • 8 --> 16 (relu): 40
  • 8 --> 16 (linear): 40]