Self-distill first try (300M models)

Best single-model loss: 3.587
Best two-model ensemble loss: approximately 3.43
Created on July 16 | Last edited on September 8

Section 1


[Plot: x-axis run_progress (ticks 0.2 to 0.8); y-axis ticks 3.6 to 4.6. Run sets (number of runs): reference (1), sd0715 runs (1), ens2d0717 runs (2), ens4d0721 runs (1), 4 separate (1), sd0805 (1), 8 sep (1), ablation (6)]