Tootsie 8B dessert v2 ("fiery-hippo")

See https://github.com/stanford-crfm/marin/issues/600 for narrative
Created on March 12 | Last edited on May 12
Big Idea:
  • Start from end of [monumental-jellyfish]
    • Core tootsie DCLM mix to 3.7T tokens
    • Cooldown on Dolmino HQ data (without synth math or Flan) to 4.8T tokens
  • Add in Flan and synth math (and maintain the rest of the mix) for another 200B tokens at low LR
(Fixed relative to zircon-badger)
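The token budgets above can be checked with a little arithmetic. The sketch below is only an illustration: the `Phase` class, `PHASES` list, and `phase_lengths` helper are hypothetical names, not part of the marin codebase, and it assumes the final 200B tokens are counted on top of the 4.8T cooldown endpoint.

```python
from dataclasses import dataclass

B = 10**9  # one billion tokens


@dataclass(frozen=True)
class Phase:
    """Hypothetical record: a training phase and its cumulative end point."""
    name: str
    end_tokens: int  # cumulative tokens seen when the phase finishes


# Cumulative end points taken from the bullet list above.
# Assumption: the last phase starts where the cooldown ends (4.8T).
PHASES = [
    Phase("core tootsie DCLM mix", 3_700 * B),
    Phase("cooldown on Dolmino HQ (no synth math, no Flan)", 4_800 * B),
    Phase("Flan + synth math added, low LR", 4_800 * B + 200 * B),
]


def phase_lengths(phases):
    """Return (name, tokens spent in that phase) for each phase, in order."""
    lengths = []
    prev_end = 0
    for phase in phases:
        lengths.append((phase.name, phase.end_tokens - prev_end))
        prev_end = phase.end_tokens
    return lengths


if __name__ == "__main__":
    for name, tokens in phase_lengths(PHASES):
        print(f"{name}: {tokens // B}B tokens")
```

Under these assumptions the cooldown itself spans 1.1T tokens and the full run ends at 5.0T tokens.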

Lineage Runs

