
[Atomicity] MC_RTT 5ms

- On RTT data alone, factor 1 is the only one that achieves competence.
- With Maze data augmented (i.e., 30K base + 3K added, likely not a representational-quality increase), RTT factor 2 becomes feasible.
Created on February 10 | Last edited on February 17
Status: Concluded; factor only affects efficiency (within a 1-4x range), and data may interact with which factors are feasible.
  • Flat models behave a little differently and appear slightly more stable in training.
  • Initial impressions indicated that factor-4 models were not stable on RTT, but continued training eventually stabilized them.
  • Adding maze data improves learning efficiency, at least in factor models.
    • I can't quite tell what difference would make even the val loss differ when maze is included; most of the data is the same, so performance on the Zenodo Indy datasets should be similar. Rerunning to confirm (see @Multisession, Multisubject pilot).
    • The simple truth is that rtt_maze_5_factor_2 is good in a way that's not matched by any run on RTT data alone, except factor_1.
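The augmentation above is just a concatenation of sources at the trial level (~30K RTT base + ~3K Maze added). A minimal sketch of that mixing step, using NumPy with made-up, scaled-down trial counts and bin/channel dimensions (the real run configs and loaders are not shown in this report):

```python
import numpy as np

# Illustrative spike-count tensors, shape (trials, timebins, channels).
# Sizes are scaled down 100x from the ~30K base + ~3K added described above;
# Poisson rate and channel count are placeholders, not the real datasets.
rng = np.random.default_rng(0)
rtt_trials = rng.poisson(0.1, size=(300, 100, 98))   # base RTT trials
maze_trials = rng.poisson(0.1, size=(30, 100, 98))   # added Maze trials

# The augmented training set simply concatenates the two sources
# along the trial axis and shuffles trial order.
combined = np.concatenate([rtt_trials, maze_trials], axis=0)
order = rng.permutation(len(combined))
combined = combined[order]

print(combined.shape)  # (330, 100, 98)
```

At the real scale this is a ~10% addition of Maze trials on top of the RTT base, which is why the report treats it as an augmentation rather than a representational-quality change.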


Factorized models only


[Run set panel: 17 runs]


Including flat models.

  • The picture is complicated once we introduce flat models. Smaller factors don't appear very different, though flat_1 possibly overfits more.
    • That is, flat_1 behaves worse than factor_1 on MC_RTT evaluation.
  • Only flat_1, flat_2, and flat_4 are matched to use only Indy data; they appear comparable to factor_2 and factor_4.
    • rtt_joint_f4_m5 (which includes Loco data and is a slightly bigger model to compensate) similarly matches performance.

[Run set panel: 13 runs]


To revisit a motivating point: factor_4 runs seemed to become unstable, but this doesn't actually hold up after training for longer periods (though factor_4 does appear less efficient).
Flat runs differ in that flat_1, flat_2, and flat_4 are all similarly stable.
Note that the rtt_5_factor_1 multi runs aren't particularly more efficient; the first blue curve (`jlvknwc8`) roughly matches the efficiency of the flat curves, and only sustained training reaches the new plateau.
This plateau is again similarly reached simply by adding maze data.

[Run set panel: 10 runs]




Adding Maze stabilizes learning?

Comparing rtt_5_factor_2 vs rtt_maze_5_factor_2 makes it clear that maze data can sometimes help learning significantly.
In flat models, the verdict is less clear since the trajectories start out quite good; it's plausible that all models would have reached ceiling levels in the next 10 hours of training, with or without maze.


[Run set panel: 8 runs]