scale the bench
Created on August 23|Last edited on November 4
Comment
base metrics
specific questions
What are the parameters of the actual top performers?
What does plast_clip do?
Does it make things explode? If so, when? Is there a min. value for which it can use memory at all? For simplicity, we fix clip_weights. We also fix plast_proportion and n_hidden.
Computing group metrics from first 10 groups
what areas gain any accuracy?
What areas blow up?
ok, now, what's the effect of clip_weights
What's the effect of the ratio of high-lr to low-lr parameters?
Add a comment