
minGPT - Layer Activations

Created on February 19 | Last edited on February 19

Train Loss


[Figure: train loss (y-axis, 0 to 4) vs. Step (x-axis, 0 to 1.5k); run set of 5 runs]


Activations Near Zero

(±0.05)
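
One way to read this metric is as the share of a layer's activations whose magnitude falls within 0.05 of zero. The sketch below is a minimal illustration under that assumption; the report does not say how the panels were actually computed. The function name near_zero_fraction and the sample values are hypothetical, and plain Python lists stand in for real layer outputs.

```python
def near_zero_fraction(activations, threshold=0.05):
    """Return the share of values whose magnitude is at most `threshold`.

    Assumption: "near zero" means |x| <= threshold. The report only gives
    the band (+-0.05), not the exact comparison or normalization used.
    """
    values = list(activations)
    if not values:
        return 0.0  # no activations recorded, so report zero
    near_zero = sum(1 for v in values if abs(v) <= threshold)
    return near_zero / len(values)


# Hypothetical activations from one layer, flattened to a list.
sample = [0.01, -0.03, 0.2, -0.5, 0.0, 0.049, 1.3, -0.051]
fraction = near_zero_fraction(sample)
print(f"{fraction:.3f}")
```

In a real training run, the same count would typically be taken over a flattened activation tensor captured with a forward hook and logged once per step for each layer of interest.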

Embedding Activations




Early LayerNorm + Early Linear




End of the Model

Second-to-Last Linear Layer
