
898 Tootsie Soft-Raccoon

Created on March 26 | Last edited on May 12

  • GH Issue 898
Starting from Tootsie 8b Monumental-Jellyfish, we did a deeper cooldown.


The idea was that maybe our model was not sufficiently cooled down. Our final LR from monumental-jellyfish (aka tootsie-phase3) was 1.7e-4, which is closer to OLMo 2 7B's peak LR of 3e-4 than to its final LR of 3e-5. We also saw evidence that our model had lower confidence in general: for example, we needed a lower sampling temperature to get AlpacaEval to work well.
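That "closer to the peak than to the final LR" comparison is on a log scale, which is the natural scale for learning rates; a quick sketch of the arithmetic, using only the values quoted above:

```python
import math

# LRs quoted above; compare distances on a log scale.
ours_final = 1.7e-4   # monumental-jellyfish / tootsie-phase3 final LR
olmo2_peak = 3e-4     # OLMo 2 7B peak LR
olmo2_final = 3e-5    # OLMo 2 7B final LR

dist_to_peak = abs(math.log10(ours_final / olmo2_peak))    # ~0.25 decades
dist_to_final = abs(math.log10(ours_final / olmo2_final))  # ~0.75 decades

print(f"distance to OLMo 2 peak:  {dist_to_peak:.2f} decades")
print(f"distance to OLMo 2 final: {dist_to_final:.2f} decades")
```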


Soft-Raccoon

So we continued the same mix, with the LR starting from (approximately) the same place, and annealed it from 1.7e-4 to 1.7e-5. This is tootsie-8b-soft-raccoon3 (the gray curve as of this writing). (The first two soft-raccoons had uninteresting config problems.)
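For concreteness, here is a minimal sketch of this kind of anneal. The 1.7e-4 → 1.7e-5 endpoints are the ones from the run above; the linear shape and the step count in the example are assumptions, not the actual schedule.

```python
def annealed_lr(step: int, total_steps: int,
                lr_start: float = 1.7e-4, lr_end: float = 1.7e-5) -> float:
    """Anneal the LR from lr_start to lr_end over total_steps (linear decay).

    Endpoints match soft-raccoon; the decay shape and total_steps are
    illustrative assumptions, not the run's actual configuration.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)
    return lr_start + frac * (lr_end - lr_start)

# Example: LR at the midpoint of an assumed 10,000-step anneal.
print(annealed_lr(5_000, 10_000))  # -> 9.35e-05
```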

Loss decreased (on both held-out Tulu SFT data and the training mix) for most of the run, but ominously, toward the end the loss started increasing... This started around an LR of 2.2e-5.
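The "loss started increasing" signal is the kind of thing one can flag automatically from periodic eval losses. A minimal sketch, where the smoothing window and patience are made-up defaults rather than anything we actually used:

```python
def loss_started_increasing(losses, window=5, patience=3, tol=0.0):
    """Return the index in `losses` where a smoothed loss curve begins a sustained rise.

    `losses` is a sequence of periodic eval losses (e.g. held-out Tulu SFT loss
    or training-mix loss). We smooth with a moving average and return the first
    point followed by `patience` consecutive increases greater than `tol`.
    """
    if len(losses) < window + patience:
        return None
    smoothed = [sum(losses[i - window + 1:i + 1]) / window
                for i in range(window - 1, len(losses))]
    for i in range(len(smoothed) - patience):
        if all(smoothed[j + 1] - smoothed[j] > tol for j in range(i, i + patience)):
            return i + window - 1  # map back to an index into `losses`
    return None

# Example on a toy curve that bottoms out and then drifts up.
toy = [2.30, 2.25, 2.21, 2.18, 2.16, 2.15, 2.16, 2.18, 2.21, 2.25, 2.30]
print(loss_started_increasing(toy, window=3, patience=3))  # -> 6
```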

XXX link to SFT on soft raccoon. Overall it performed better on Tulu during SFT, and AlpacaEval went up, but was still quite bad (~4).


Softer-Raccoon

Since things had gotten better, we decided to lower the LR even more, but that ominous upward trend in training loss (and validation loss!) got worse! Oh no!

We tried to diagnose the issue:

So who knows. We're moving on to hypnotic-spoonbill (TODO), which includes some Tulu data and FLAN (and a higher min LR) but otherwise looks like soft-raccoon. Fingers crossed.
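A rough sketch of the kind of config delta being described for hypnotic-spoonbill; every name, weight, and value below is a hypothetical placeholder, not the actual run configuration.

```python
# Hypothetical sketch only: "like soft-raccoon, but mix in some Tulu and FLAN
# data and raise the minimum LR." None of these names, weights, or values are
# the real configuration.
hypnotic_spoonbill = {
    "base_config": "tootsie-8b-soft-raccoon",  # otherwise looks like soft raccoon
    "data_mixture": {
        "pretraining_mix": 0.95,  # placeholder weight
        "tulu": 0.03,             # placeholder weight
        "flan": 0.02,             # placeholder weight
    },
    "lr": {
        "start": 1.7e-4,  # assumed: same starting point as soft-raccoon
        "min": 5e-5,      # "higher min LR" -- actual value unknown
    },
}
```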


This set of panels contains runs from a private project, which cannot be shown in this report