NLB Control
Created on March 23 | Last edited on March 23
Since we were seeing unsuccessful transfer on NLB co-bps, I took a step back and ran experiments closer to the Section 1 arch/base pilots.
At the outset it seemed that transfer could in fact be worse than pretraining, but tweaking the heldout task to leverage the core enc-dec infill path fixed this (all models, including basic transfer, reach ~SoTA). (Not shown in plots, but the relevant experiments are also in nlb_control.)
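To make the "heldout task through the infill path" change concrete, here is a minimal sketch of what I mean by routing held-out channels through the same masked-infill machinery used in pretraining, rather than a separate readout. All names (`build_infill_batch`, `heldout_idx`) are hypothetical illustrations, not the actual codebase API.

```python
import numpy as np

def build_infill_batch(spikes, heldout_idx, mask_value=0.0):
    """Sketch: route held-out channels through the enc-dec infill path.

    Held-out channels are masked at the input and become reconstruction
    targets, exactly like ordinary masked-infill pretraining targets.

    spikes: (trials, time, channels); heldout_idx: held-out channel indices.
    Returns (model_input, target, target_mask).
    """
    model_input = spikes.copy()
    model_input[..., heldout_idx] = mask_value  # mask held-out channels
    target = spikes                             # reconstruct everything
    target_mask = np.zeros(spikes.shape[-1], dtype=bool)
    target_mask[heldout_idx] = True             # score only held-out channels
    return model_input, target, target_mask
```

The point is that no new head or loss is introduced: the heldout evaluation becomes just another infill query, which is plausibly why all models (including basic transfer) recovered once the task was framed this way.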
Co-bps probe
- Pretraining appears to help for RTT but not for Maze (the benefit of pretraining on Maze is minor).
- It is undetermined whether this is an issue of transfer from or to the Maze datasets (2K trials might simply not be enough).
- Note these runs are fairly comparable to the SoTA baselines (i.e., heavy regularization evidently didn't matter much).
- Though if we're treating 0.02 co-bps differences as significant, maybe it does matter...
[Run set: 15 runs]
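For reference, the co-bps numbers above follow the NLB co-smoothing definition: Poisson log-likelihood improvement of predicted held-out rates over a per-channel mean-rate null, normalized by spike count. A minimal sketch (function names are my own, not NLB's evaluation code):

```python
import numpy as np

def poisson_ll(rates, spikes):
    # Poisson log-likelihood, dropping the log(k!) term
    # (it cancels in the bits-per-spike difference below).
    rates = np.clip(rates, 1e-9, None)
    return np.sum(spikes * np.log(rates) - rates)

def co_bps(pred_rates, spikes):
    """Co-smoothing bits per spike on held-out channels.

    pred_rates, spikes: (trials, time, heldout_channels).
    Null model: per-channel mean firing rate.
    """
    null_rates = np.broadcast_to(spikes.mean(axis=(0, 1)), spikes.shape)
    n_spikes = spikes.sum()
    return (poisson_ll(pred_rates, spikes)
            - poisson_ll(null_rates, spikes)) / (n_spikes * np.log(2))
```

On this scale, a 0.02 gap is small relative to typical leaderboard spreads, which is why I flag the significance question above.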
Kinematic decode probe
- When accounting for validation performance as well, large-scale pretraining pulls ahead, and even Maze sees a pretraining benefit.
- Note the optimal schedule differs from the one for co-bps; kinematic decoding prefers faster schedules?
[Run set: 15 runs]
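For context on what the kinematic decode probe measures: a standard setup is a ridge regression from inferred rates to hand velocity, scored by R² on held-out trials. A minimal sketch of that probe, assuming a linear readout (function names and the `lam` default are illustrative, not the exact probe used in these runs):

```python
import numpy as np

def ridge_decode(rates_train, kin_train, rates_test, lam=1.0):
    """Linear kinematic readout: ridge regression from rates to kinematics.

    rates_*: (samples, channels); kin_train: (samples, kin_dims).
    Returns predicted kinematics for rates_test. The bias column is
    ridge-penalized too, for simplicity.
    """
    X = np.hstack([rates_train, np.ones((len(rates_train), 1))])  # add bias
    W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ kin_train)
    Xt = np.hstack([rates_test, np.ones((len(rates_test), 1))])
    return Xt @ W

def r2_score(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot
```

Because the probe itself is frozen and linear, differences in decode R² across schedules should reflect representation quality rather than probe capacity, which is what makes the schedule sensitivity above worth noting.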