Skip to main content

handing two

Created on March 31|Last edited on April 30

last two instead of last one

current

big_long_sweep, long_range_memory dataset

050100150200250Step11e+20
learning_rate: 0.001, plast_clip: 100000
learning_rate: 0.001, plast_clip: 10000
learning_rate: 0.001, plast_clip: 1000
learning_rate: 0.001, plast_clip: 100
learning_rate: 0.0001, plast_clip: 100000
learning_rate: 0.0001, plast_clip: 10000
learning_rate: 0.0001, plast_clip: 1000
learning_rate: 0.0001, plast_clip: 100
learning_rate: 0.00001, plast_clip: 100000
learning_rate: 0.00001, plast_clip: 10000
learning_rate: 0.00001, plast_clip: 1000
learning_rate: 0.00001, plast_clip: 100
050100150200250Step0.60.70.80.912345678910
0.001 100000
0.001 10000
0.001 1000
0.001 100
0.0001 100000
0.0001 10000
0.0001 1000
0.0001 100
0.00001 100000
0.00001 10000
0.00001 1000
0.00001 100
last-two
48

sweep forget rates, too

last-two
83

additionally, sweep grad_clip

last-two
504