
Tootsie 8B dessert v1 ("zircon-badger")

See https://github.com/stanford-crfm/marin/issues/600 for narrative
Created on March 12 | Last edited on May 12
Big Idea:
  • Start from the end of [monumental-jellyfish]:
    • Core Tootsie DCLM mix to 3.7T tokens
    • Cooldown on Dolmino HQ data (without synth math or Flan) to 4.8T tokens
  • Add in Flan and synth math (and maintain the rest of the mix) for another 200B tokens at low LR
Caveat: Flan was oversampled, and many of the synth math datasets were inadvertently excluded.
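
As a quick sanity check on the token budget, here is a minimal sketch of the arithmetic. It assumes the "to 3.7T" and "to 4.8T" figures are cumulative token counts and that the 200B tokens come on top of the 4.8T. The phase labels are hypothetical names used only for illustration and do not come from the Marin codebase.

```python
BILLION = 10**9

# Figures from the list above. Assumption: the first two are cumulative
# endpoints, and the final 200B is an increment on top of the cooldown.
CORE_END = 3_700 * BILLION       # core DCLM mix runs "to 3.7T tokens"
COOLDOWN_END = 4_800 * BILLION   # Dolmino HQ cooldown runs "to 4.8T tokens"
DESSERT_LEN = 200 * BILLION      # "another 200B tokens at low LR"


def phase_lengths():
    """Return tokens spent in each phase, keyed by a hypothetical label."""
    ends = [
        ("core", CORE_END),
        ("cooldown", COOLDOWN_END),
        ("dessert", COOLDOWN_END + DESSERT_LEN),
    ]
    lengths = {}
    prev_end = 0
    for name, end in ends:
        lengths[name] = end - prev_end  # phase length = this end minus previous end
        prev_end = end
    return lengths


if __name__ == "__main__":
    lengths = phase_lengths()
    total = sum(lengths.values())
    for name, n_tokens in lengths.items():
        share = 100 * n_tokens / total
        print(f"{name:9s} {n_tokens / BILLION:7.0f}B tokens ({share:.0f}% of total)")
    print(f"{'total':9s} {total / BILLION:7.0f}B tokens")
```

Under these assumptions the cooldown covers 1.1T tokens and the final phase brings the run to 5.0T tokens in total, so the Flan and synth math phase is a small fraction of overall training.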

Lineage Runs


This set of panels contains runs from a private project, which cannot be shown in this report


