riff on normal distributions

Created on September 11 | Last edited on September 11
positive plasticity only.

[Line chart: avg_loss (no runs selected)]

gelu
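
For reference, a minimal sketch of the exact GELU activation, x * Phi(x), in plain Python. The function name `gelu` is illustrative only, and this is not necessarily the variant (exact or tanh approximation) used in these runs.

```python
import math

def gelu(x: float) -> float:
    """Exact GELU: x times the standard normal CDF evaluated at x."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

# Unlike relu, small negative inputs give a small non-zero (negative) output.
values = [gelu(v) for v in (-1.0, 0.0, 1.0)]
```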

use mse loss instead of crossentropy.
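
A rough sketch of what this swap changes, assuming a one-hot target and outputs that already sum to 1. The helper names and the example `probs` values are hypothetical, not taken from the runs.

```python
def cross_entropy(probs, target_index):
    """Negative log-probability assigned to the correct class."""
    import math
    return -math.log(probs[target_index])

def mse(probs, target_index):
    """Mean squared error between the outputs and a one-hot target."""
    total = 0.0
    for i, p in enumerate(probs):
        target = 1.0 if i == target_index else 0.0
        total += (p - target) ** 2
    return total / len(probs)

probs = [0.7, 0.2, 0.1]  # hypothetical network output
ce_loss = cross_entropy(probs, 0)
mse_loss = mse(probs, 0)
```

Cross-entropy only looks at the correct class's probability, while mse penalizes every output's distance from its target, so the two losses are on different scales and their curves are not directly comparable.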

slower last layer, remove relu derivative calculation, since I'm using sigmoid rn.
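
A minimal sketch of why the relu derivative no longer applies once the activation is sigmoid: the two have different local gradients. The function names are illustrative and not taken from the actual training code.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Derivative of sigmoid: s * (1 - s), at most 0.25 (reached at x = 0)."""
    s = sigmoid(x)
    return s * (1.0 - s)

def relu_grad(x: float) -> float:
    """Derivative of relu: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0
```

Since the sigmoid gradient never exceeds 0.25, updates passing through it are smaller than through an active relu unit, which may interact with the per-layer learning rate.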

even slower last layer

actually make the last layer faster

maybe fixed pos_only

go mini
