Reports
Created by
Created On
Last edited
Plot: QKNorm revisited
olmoe17-8x1b-final-eps-noqk uses no QK-Norm, only RMSNorm with weights
olmoe17-8x1b-final-eps uses non-parametric QK-Norm, plus RMSNorm with weights
2024-07-06
Comparison of eval metrics for OLMoE data ablations
The goal of this analysis is to understand whether our main in-loop downstream evals (OLMES core 9 plus MMLU) are sufficiently sensitive to changes in the data mix.
2024-07-10