
Tootsie 8B cooldown v2 ("groovy-parrot")

See https://github.com/stanford-crfm/marin/issues/600 for narrative
Created on March 12 | Last edited on May 12

Big Idea:

  • Core Tootsie DCLM mix to 3.7T tokens
  • Cooldown on Dolmino HQ data (with synthetic math and Flan) to 4.8T tokens
NB: the final run (blue, "cooldown take 2") starts from a slightly earlier checkpoint of the red run, because I messed up the red run.
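
For a sense of scale, the token budget implied by the two phases can be worked out in a few lines. This is only a rough sketch: it assumes both figures (3.7T and 4.8T) are cumulative token counts, and the names `CORE_END_TOKENS`, `COOLDOWN_END_TOKENS`, and `phase_budget` are made up for illustration rather than taken from the training config.

```python
# Rough token budget for the two phases described above.
# Assumption: 3.7T and 4.8T are cumulative token counts, so the
# cooldown phase covers the difference between them.
CORE_END_TOKENS = 3.7e12      # end of the core DCLM-mix phase
COOLDOWN_END_TOKENS = 4.8e12  # end of the cooldown phase


def phase_budget(core_end: float, cooldown_end: float) -> dict:
    """Return the tokens spent in each phase and the cooldown's share of the total."""
    cooldown_tokens = cooldown_end - core_end
    return {
        "core_tokens": core_end,
        "cooldown_tokens": cooldown_tokens,
        "cooldown_fraction": cooldown_tokens / cooldown_end,
    }


budget = phase_budget(CORE_END_TOKENS, COOLDOWN_END_TOKENS)
print(f"cooldown tokens: {budget['cooldown_tokens'] / 1e12:.1f}T")
print(f"cooldown share of total: {budget['cooldown_fraction']:.1%}")
```

Under that reading, the cooldown covers about 1.1T tokens, a bit under a quarter of the full run.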

Lineage Runs


[Plot: x-axis throughput/total_tokens, log scale from 10G to 1T]