Reports
Created by
Created On
Last edited
Decoding Pilot
Need to figure out how to get fine-tuning to work. Testing on a single RTT session to kick off.
0
2023-02-17
[Atomicity] MC_RTT 5ms
- On RTT data alone, factor 1 is the only factor size that achieves competence.
- With Maze data added (i.e. 30K base + 3K added, which is likely not a representational quality increase), RTT factor 2 becomes feasible.
0
2023-02-10
[Atomicity] Compute-normalized factor size comps
Hm, BPS clouds the picture; let's just look at loss.
0
2023-02-10
[Throughput] Mask Ratio effects
Masking more increases throughput (and perf/flop), but at what cost?
0
2023-02-10
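A rough sketch of why the mask-ratio note above expects higher throughput, assuming an MAE-style setup where masked tokens are dropped before the encoder (an assumption, the note does not say how masking is implemented). The function names are hypothetical, and the cost model only counts tokens, not real FLOPs.

```python
def visible_tokens(num_tokens: int, mask_ratio: float) -> int:
    """Tokens the encoder sees, ASSUMING masked tokens are dropped (MAE-style)."""
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    return max(1, round(num_tokens * (1.0 - mask_ratio)))


def relative_encoder_cost(num_tokens: int, mask_ratio: float) -> tuple[float, float]:
    """Encoder cost relative to no masking, as (linear term, quadratic term).

    The linear term stands in for per-token work (projections, MLPs);
    the quadratic term stands in for pairwise attention scores.
    """
    kept_fraction = visible_tokens(num_tokens, mask_ratio) / num_tokens
    return kept_fraction, kept_fraction ** 2


if __name__ == "__main__":
    for ratio in (0.0, 0.25, 0.5, 0.75):
        linear, quadratic = relative_encoder_cost(1000, ratio)
        print(f"mask_ratio={ratio:.2f} linear={linear:.3f} attention={quadratic:.4f}")
```

Under this assumption the encoder gets cheaper as the mask ratio rises, but each sample also provides less visible context, which is the cost the note asks about.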
Flat vs factorized
Factorized is objectively much more efficient, but a flat model has more eventual potential and is more agnostic. Per Kaiming's spatiotemporal paper, the flat model might be better at scaled throughputs.
0
2023-02-10
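A minimal sketch of the efficiency gap in the flat vs factorized note, assuming "flat" means full attention over all space-time tokens and "factorized" means separate attention over space within each timestep and over time within each spatial position (both readings are assumptions). It counts pairwise attention scores only; the function names are hypothetical.

```python
def flat_attention_pairs(num_time: int, num_space: int) -> int:
    """Pairwise scores when every space-time token attends to every other."""
    num_tokens = num_time * num_space
    return num_tokens * num_tokens


def factorized_attention_pairs(num_time: int, num_space: int) -> int:
    """Pairwise scores for space-only attention plus time-only attention."""
    spatial = num_time * num_space ** 2   # one S x S block per timestep
    temporal = num_space * num_time ** 2  # one T x T block per spatial position
    return spatial + temporal


if __name__ == "__main__":
    t, s = 100, 10
    flat = flat_attention_pairs(t, s)
    fact = factorized_attention_pairs(t, s)
    print(f"flat={flat} factorized={fact} ratio={flat / fact:.1f}")
```

The gap widens as either axis grows, which is consistent with factorized being cheaper; dropping masked tokens shrinks the flat token count and would narrow it.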