[Scale] Inductive bias

Created on March 3|Last edited on March 4
This data is unsorted and comes from a stable interface, so results are not confounded by artificial channel shifting/padding.
While spacetime is worse at the 800-trial scale for cross-animal transfer, it recovers by 6400 trials. Stitch was reduced to a bottleneck of 64 dimensions.
Cross-animal transfer does appear to scale, though again by a smaller factor than standard multi-session.
For example, it is unclear (and I don't think we have the means to really test this) whether transfer from cross-animal data saturates differently from cross-session transfer, though Rizzoglio's CCA alignment suggests nothing fundamentally different about cross-animal.
That will have to be settled empirically.
﻿
﻿
[Plots: run sets 6, 22, 35]
Similarly, the spacetime model underperforms at 1600 trials across-session, but it also recovers by around 6K trials.
Stitch never actually outperforms flat. I'm going to tentatively conclude that stitching isn't a helpful mechanism, given the number of params it introduces. It is still relatively advantageous when channels are unstable across contexts (e.g. cross-subject or with sorted neurons), because nonflat does fail in this case.
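To make the parameter concern concrete, here is a rough sketch of how stitching parameters grow with the number of contexts. It assumes stitching means one linear read-in layer per context, mapping channels to the shared bottleneck; that reading is an assumption, not something these notes confirm. Only the 64-dim bottleneck comes from the notes. The function name, the 96-channel count, and the 20 contexts are hypothetical.

```python
def stitch_param_count(n_channels: int, bottleneck: int, n_contexts: int,
                       bias: bool = True) -> int:
    """Parameters added by per-context linear read-in layers (assumed design)."""
    # One weight matrix (channels x bottleneck) plus an optional bias per context.
    per_context = n_channels * bottleneck + (bottleneck if bias else 0)
    return per_context * n_contexts


if __name__ == "__main__":
    # bottleneck=64 is from the notes; 96 channels and 20 contexts are made up.
    print(stitch_param_count(n_channels=96, bottleneck=64, n_contexts=20))
```

Under this assumed design the count grows linearly with the number of contexts, whereas a flat model adds nothing per context.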
﻿
So there are several separate throughlines:
Spacetime is the best architecture, and in particular it overtakes non-spatial by around 6K trials.
Cross-context transfer does occur and does scale. In-context scales better, but it also won't scale beyond some hypothetical limit, e.g. 10K trials.
A tiny caveat against direct comparisons with the single_100 line below: the scaling we observe below is acausal, but I really don't think that's an issue.
Cross-context transfer will likely saturate sooner than in-context, but we can get a respectable amount of scaling (from 100 to 1600 trials). 
But due to power laws we likely will not see great scaling from e.g. 400 trials (I doubt we'll ever see 400 -> 3200).
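One way to read the power-law remark, as a sketch rather than a fit to these runs: if error follows error(N) = a * N^(-alpha), then the absolute gain from adding data shrinks as the starting trial count grows. The coefficient a = 1.0 and exponent alpha = 0.5 below are purely illustrative assumptions, and the function names are hypothetical.

```python
def power_law_error(n_trials: float, a: float = 1.0, alpha: float = 0.5) -> float:
    """Illustrative power-law error curve; a and alpha are assumed, not fitted."""
    return a * n_trials ** (-alpha)


def absolute_gain(n_start: float, n_end: float) -> float:
    """Drop in error when moving from n_start to n_end trials."""
    return power_law_error(n_start) - power_law_error(n_end)


if __name__ == "__main__":
    # Compare the two ranges mentioned in the notes.
    print("100 -> 1600:", absolute_gain(100, 1600))
    print("400 -> 3200:", absolute_gain(400, 3200))
```

With these assumed constants the 400 -> 3200 range yields a smaller absolute improvement than 100 -> 1600, which is the qualitative point; the real exponent would have to come from the runs.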
﻿
A weak point is that stitching isn't examined in depth, but I'm not sure what else to do (PCR is inapplicable).
﻿
[Plots: run sets 4, 23]