Activation vs No-Activation
Lucas Nestler
Created on October 11 | Last edited on October 12
[Figure: Loss/Median256 vs. Step. x-axis: Step (ticks at 100 and 10k); y-axis: Loss/Median256 (ticks 1 to 5). Group: tied-moe-modulo]
[Figure: Loss/Median256 vs. Step. x-axis: Step (ticks at 20k, 40k, 60k, 80k); y-axis: Loss/Median256 (ticks 1 to 5). Group: tied-moe-modulo]
Run set: 13