scale the bench

Created on August 23|Last edited on November 4
Comment
﻿
base metrics
specific questions
What are the parameters of the actual top performers?
What does plast_clip do? Does it make things explode? If so, when? Is there a min. value for which it can use memory at all? For simplicity, we fix clip_weights. We also fix plast_proportion and n_hidden.
﻿
avg_accuracy
avg_accuracy
Computing group metrics from first 10 groups
05001k1.5kStep0.50.60.70.80.9
plast_clip: 100000, learning_rate: 0.0001
plast_clip: 100000, learning_rate: 0.00007
plast_clip: 300000, learning_rate: 0.00005
plast_clip: 70000, learning_rate: 0.0001
plast_clip: 30000, learning_rate: 0.0002
plast_clip: 50000, learning_rate: 0.0001
plast_clip: 10000, learning_rate: 0.0005
plast_clip: 30000, learning_rate: 0.0003
plast_clip: 30000, learning_rate: 0.0001
plast_clip: 10000, learning_rate: 0.0003
3_palindrome_dataset_vary_length179
﻿
what areas gain any accuracy?
﻿
3_palindrome_dataset_vary_length160
 
2_palindrome_dataset_vary_length179
﻿
What areas blow up?
﻿
3_palindrome_dataset_vary_length313
 
2_palindrome_dataset_vary_length252
﻿
ok, now, what's the effect of clip_weights
What's the effect of the ratio of high-lr to low-lr parameters?﻿
Run set70
﻿
﻿
﻿
Run set70
﻿
﻿
﻿
﻿
Add a comment