
tokenizer experiments

various tokenizer experiments, mostly on the same L12 H768 (12-layer, hidden size 768) NeoBERT arch
Created on October 6 | Last edited on October 6

train metrics


[Six line plots against Step (x-axis 20k to 100k); panel titles not preserved]
Run set: 8 runs


run comparer




other stuff

redundant metrics

kept for completeness



system

