
big scale

Created on August 26 | Last edited on August 30
Curious to see how the smoother text dataset fares. I also have a small comparison between runs that share the same effective learning rate.

[Figure: three line plots against Step (x-axis 0 to 1.5k, y-axis up to 1). Legend groups: text_scale_no_clip, text_scale_slow, text_scale_slow_no_clip, text_scale_base, text_scale_4_2, text_scale_5_3, text_scale_3_0]
Run set: 22