griffin aka recurrent_gemma arch
Some initial experiments with activation function and layer count on simple_wikipedia_LM.
Peter Szemraj
Created on April 25 | Last edited on April 25
An example output model can be found here.
Section 1
[Figure: three line-plot panels, each plotted against train/epoch: eval/loss, eval/accuracy, and eval/steps_per_second.]
Run set: 4
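For reading the eval/loss panel, it can help to convert loss into perplexity. The snippet below is a minimal sketch, assuming eval/loss is the mean token-level cross-entropy in nats (typical for causal LM training scripts, but not stated in this report). The function name `perplexity` and the sample loss values are hypothetical and are not taken from the runs plotted here.

```python
import math


def perplexity(eval_loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(eval_loss)


# Hypothetical loss values for illustration only; not read from the report's runs.
for loss in (5.0, 7.5, 10.0):
    print(f"eval/loss={loss:.1f} -> perplexity={perplexity(loss):,.0f}")
```

Because perplexity is exponential in the loss, small differences in eval/loss between runs correspond to large differences in perplexity, which is worth keeping in mind when comparing curves that look close together.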