riff on normal distributions

Created on September 11 | Last edited on September 11
positive plasticity only.

[Plot: x-axis is Step (0 to 400); y-axis label not recoverable. Legend:]
learning_rate: 0.01, plast_clip: 5
learning_rate: 0.01, plast_clip: 2
learning_rate: 0.001, plast_clip: 2
Run set: 9 runs
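
The notes do not say how plast_clip and the positive-only setting are applied. A minimal sketch of one plausible reading, assuming plast_clip bounds the magnitude of each plasticity value and that positive-only raises the lower bound to zero. The function name clip_plasticity and the list representation are hypothetical, not taken from the runs.

```python
def clip_plasticity(values, plast_clip, pos_only=True):
    """Clamp each plasticity value to [lower, plast_clip].

    Assumption: the lower bound is 0 when pos_only is set,
    otherwise -plast_clip. This is a guess at the semantics.
    """
    lower = 0.0 if pos_only else -plast_clip
    return [min(max(v, lower), plast_clip) for v in values]


# With plast_clip: 5 from the legend above, negatives go to 0 and large values cap at 5.
clipped = clip_plasticity([-3.0, 0.5, 7.0], plast_clip=5)
```

Under this reading, lowering plast_clip from 5 to 2 only tightens the upper bound; the floor stays at zero.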

gelu

Run set: 9 runs
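
For reference, a small sketch of the exact GELU activation next to ReLU, using only the standard library. It is not the code behind these runs, and the notes do not say which activation GELU replaced; ReLU is shown only for comparison.

```python
import math


def gelu(x):
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def relu(x):
    """ReLU, for comparison."""
    return max(0.0, x)


# Unlike ReLU, GELU lets small negative inputs through with a small negative output.
samples = [(x, gelu(x), relu(x)) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
```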

use MSE loss instead of cross-entropy.

Run set: 5 runs
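
A rough illustration of the two losses being swapped, computed on one example with a one-hot target. The function names and the sample numbers are made up for illustration; the actual model outputs are not shown in this report.

```python
import math


def mse_loss(pred, target):
    """Mean squared error over the output units."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def cross_entropy_loss(probs, target, eps=1e-12):
    """Cross-entropy between predicted probabilities and a one-hot target."""
    return -sum(t * math.log(p + eps) for p, t in zip(probs, target))


probs = [0.7, 0.2, 0.1]    # hypothetical output probabilities
target = [1.0, 0.0, 0.0]   # one-hot label

mse = mse_loss(probs, target)
ce = cross_entropy_loss(probs, target)
```

MSE penalizes every output unit's error, while cross-entropy with a one-hot target only looks at the probability assigned to the correct class.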

slower last layer; remove the ReLU derivative calculation, since I'm using sigmoid right now.

Run set: 8 runs
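
The reason the ReLU derivative no longer applies can be seen by comparing the two derivatives directly. This is a generic sketch with hypothetical function names, not the update rule used in these runs.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), largest at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0
```

A backward pass that multiplies by relu_grad while the forward pass uses sigmoid would apply the wrong local gradient, so dropping that term (or replacing it with sigmoid_grad) keeps the two passes consistent.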

even slower last layer

Run set: 12 runs

actually make the last layer faster

Run set: 9 runs
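
The notes do not say how the last layer was made slower or faster. One common way to do it, sketched here as an assumption, is to scale the learning rate for the final layer only. The names sgd_step and last_layer_scale are hypothetical, and plain SGD on lists of weights stands in for whatever update the runs actually use.

```python
def sgd_step(layers, grads, learning_rate, last_layer_scale):
    """Return updated weights; only the final layer's step is scaled.

    last_layer_scale < 1 makes the last layer slower, > 1 makes it faster.
    """
    updated = []
    for i, (weights, grad) in enumerate(zip(layers, grads)):
        scale = last_layer_scale if i == len(layers) - 1 else 1.0
        updated.append(
            [w - learning_rate * scale * g for w, g in zip(weights, grad)]
        )
    return updated


layers = [[1.0, 1.0], [1.0, 1.0]]   # toy weights, two layers
grads = [[1.0, 1.0], [1.0, 1.0]]    # toy gradients
new_layers = sgd_step(layers, grads, learning_rate=0.01, last_layer_scale=0.1)
```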

maybe fixed pos_only

Run set: 6 runs

go mini

Run set: 6 runs