
OLMo-7B-Twin-2T

Training metrics
Created on February 6 | Last edited on February 6
This is a subset of the metrics collected during the OLMo-7B-Twin-2T training run. The full set of metrics is available on the runs page.

Train perplexity


[Plot: train perplexity against throughput/total_tokens. Group: OLMo-7B-Twin-2T. Run set: 24]

Notes: Unlike the other OLMo-7B run, this run had a series of instabilities in the form of "fast spikes," i.e., spikes that recovered quickly. Despite the spikes, we decided to keep the run going because it was always able to recover. We now believe that the instabilities resulted from a bad random initialization of the model weights.

In-loop downstream evaluations




In-loop perplexity evaluations





Optimizer metrics




Throughput metrics

