minGPT - Layer Activations
Created on February 19 | Last edited on February 19
Train Loss
[Plot: run set of 5 runs]
Activations Near Zero (±0.05)
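The report does not say how the near-zero figure is computed. One plausible reading, sketched below as an assumption, is the fraction of a layer's activation values whose magnitude is at most 0.05. The function name `fraction_near_zero` and the sample values are hypothetical and are not taken from minGPT or from the runs shown here.

```python
from typing import Iterable


def fraction_near_zero(activations: Iterable[float], threshold: float = 0.05) -> float:
    """Return the share of values whose magnitude is at most `threshold`.

    Assumption: "near zero" means |a| <= threshold, with threshold = 0.05
    to match the (±0.05) band in the panel title.
    """
    values = list(activations)
    if not values:
        # No activations recorded: report 0.0 rather than dividing by zero.
        return 0.0
    near_zero = sum(1 for v in values if abs(v) <= threshold)
    return near_zero / len(values)


# Hypothetical flattened activations from one layer, for illustration only.
sample = [0.0, 0.01, -0.04, 0.2, -0.5, 0.05, 1.3, -0.06]
print(fraction_near_zero(sample))
```

In a real training loop the same ratio would typically be taken over a layer's flattened output tensor at each logging step, giving one value per layer per step to plot.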
Embedding Activations
[Plot: run set of 5 runs]
Early LayerNorm + Early Linear
[Plot: run set of 5 runs]
End of the Model
Second-to-Last Linear Layer
[Plot: run set of 5 runs]