OLMo-7B-Twin-2T
Training metrics
Created on February 6 | Last edited on February 6
This is a subset of the metrics collected during the OLMo-7B-Twin-2T training run. You can explore the full set of metrics on the runs page here.
Train perplexity
Run set: 24
Notes: Unlike the other OLMo-7B run, this run had a series of instabilities in the form of "fast spikes," i.e., spikes that recovered quickly. Despite the spikes, we decided to keep the run going because it always recovered. We now believe the instabilities resulted from a bad random initialization of the model weights.
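To make the idea of a "fast spike" concrete, here is a minimal sketch of how such spikes could be flagged in an exported metric series. It is an illustration only, not the method used for this run: the function name `find_fast_spikes`, the median-based baseline, and the `window`, `ratio`, and `max_recovery` values are all assumptions chosen for the example.

```python
from statistics import median


def find_fast_spikes(values, window=5, ratio=1.5, max_recovery=3):
    """Return (start, end) index pairs of spikes that recover quickly.

    Hypothetical criteria (assumptions, not taken from the report):
    - a step is "spiking" when its value exceeds `ratio` times the median
      of the last `window` non-spiking values;
    - a spike is "fast" when the series falls back under that threshold
      within `max_recovery` steps.
    """
    spikes = []
    baseline = []  # non-spiking values seen so far
    i = 0
    while i < len(values):
        # Collect enough history before judging anything.
        if len(baseline) < window:
            baseline.append(values[i])
            i += 1
            continue
        threshold = ratio * median(baseline[-window:])
        if values[i] <= threshold:
            baseline.append(values[i])
            i += 1
            continue
        # A spike starts at i; scan forward until the series recovers.
        j = i
        while j < len(values) and values[j] > threshold:
            j += 1
        recovered = j < len(values)
        if recovered and (j - i) <= max_recovery:
            spikes.append((i, j - 1))
        i = j
    return spikes


# Synthetic series: a 2-step spike that recovers, then a 6-step plateau.
series = [10.0] * 6 + [30.0, 18.0] + [10.0] * 4 + [40.0] * 6 + [10.0] * 2
print(find_fast_spikes(series))
```

On the synthetic series, only the short excursion is reported; the longer plateau is excluded because it does not recover within `max_recovery` steps. Real training curves are noisier, so the thresholds would need tuning against the actual data.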
In-loop downstream evaluations
In-loop perplexity evaluations
Optimizer metrics
Throughput metrics