NLB Control

Created on March 23 | Last edited on March 23
Since we were seeing unsuccessful transfer on NLB co-bps, I took a step back and ran experiments closer to the Section 1 arch/base pilots.

At the outset, it seemed that transfer could in fact be worse than pretraining, but tweaking the heldout task to leverage the core enc-dec infill path fixed this (all models, including basic transfer, reach ~SoTA). (Not shown in the plots, but the relevant experiments are also in nlb_control.)
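For concreteness, routing the heldout task through the infill path amounts to masking the heldout channels at the input and supervising only their predictions at the output. A minimal numpy sketch of the batch construction (channel layout, names, and shapes are hypothetical, not the actual pipeline):

```python
import numpy as np

def make_infill_batch(spikes, n_heldin):
    """Mask heldout channels so the enc-dec infill path predicts them.

    spikes: (time, n_channels) spike counts; the first n_heldin channels
    are held-in, the rest held-out. (Hypothetical layout.)
    """
    inputs = spikes.copy().astype(float)
    inputs[:, n_heldin:] = 0.0            # zero out heldout channels at input
    loss_mask = np.zeros(spikes.shape, dtype=bool)
    loss_mask[:, n_heldin:] = True        # supervise only heldout predictions
    return inputs, loss_mask
```

The model then sees only held-in activity and must infill the heldout channels, exactly as it does for masked timesteps during pretraining.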

Co-bps probe

  • Pretraining appears to help for RTT, but not for Maze (the pretraining benefit on Maze is minor).
    • It is undetermined whether this is an issue of transfer from or to the Maze datasets (2K trials might simply not be enough).
    • Note these runs are fairly comparable to the SoTA baselines (i.e., heavy regularization evidently didn't matter much).
      • Though if 0.02 co-bps differences count as significant, maybe this is important...
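As a reference for what a 0.02 difference means here: co-bps is the Poisson log-likelihood improvement of predicted heldout-neuron rates over a per-neuron mean-rate null model, normalized by total spike count. A self-contained sketch (array shapes are assumptions, not the eval harness):

```python
import numpy as np

def poisson_ll(spikes, rates):
    # Poisson log-likelihood up to the log(y!) term, which cancels
    # in the difference against the null model
    rates = np.clip(rates, 1e-9, None)
    return np.sum(spikes * np.log(rates) - rates)

def co_bps(spikes, rates):
    """Co-smoothing bits/spike on heldout neurons.

    spikes, rates: (trials, time, neurons). The null model is each
    neuron's mean rate over trials and time.
    """
    null = np.broadcast_to(spikes.mean(axis=(0, 1)), spikes.shape)
    return (poisson_ll(spikes, rates) - poisson_ll(spikes, null)) / (
        spikes.sum() * np.log(2.0)
    )
```

By construction, predicting each neuron's mean rate scores exactly 0, so a 0.02 gap is measured against a spike-count-normalized likelihood scale, which is why significance at that resolution is plausible but not obvious.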

[Run set: 15 runs]



Kinematic decode probe

  • When accounting for validation performance as well, large-scale pretraining pulls ahead, and even Maze sees a pretraining benefit.
  • Note the optimal schedule differs from co-bps; kinematic decoding prefers faster schedules?
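For context, a kinematic decode probe of this flavor is typically a linear readout from inferred rates to velocity, scored by R². A closed-form ridge sketch (function name, shapes, and `alpha` are placeholders, not the actual probe config):

```python
import numpy as np

def ridge_decode_r2(rates, kinematics, alpha=1.0):
    """Linear kinematic probe: ridge regression from rates to kinematics,
    scored by R^2. rates: (samples, features); kinematics: (samples, dims).
    """
    X = np.hstack([rates, np.ones((len(rates), 1))])  # append bias column
    W = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]),
                        X.T @ kinematics)
    pred = X @ W
    ss_res = np.sum((kinematics - pred) ** 2)
    ss_tot = np.sum((kinematics - kinematics.mean(axis=0)) ** 2)
    return 1.0 - ss_res / ss_tot
```

Because the probe itself is fixed and linear, schedule-dependent differences here reflect the quality of the learned rates rather than the decoder.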

[Run set: 15 runs]