Plot: Dolma 1.7 vs Dolma-OLMoE

Created on August 4 | Last edited on August 29
Validation loss on C4, The Pile, etc. is expected to be worse for our new data (news). Dolma 1.7 contains the C4 training set, along with other training sets similar to those in The Pile (such as Reddit), so its distribution is closer to those validation sets.

[Plot 1: x-axis Step (5k to 30k), y-axis range 2.5 to 4]
[Plot 2: x-axis Step (5k to 30k), y-axis range 2 to 6]