Start with the Key-Recall task. Next, perhaps the associative retrieval task from Ba & Hinton, repeated sequences, or even the baby names or short stories datasets. Also worth considering: hyperparameter sweeps, or a search for the ideal effective update magnitude.
A synthetic dataset is constructed for simple single-character retrieval. Each sequence contains a '?' character, indicating that the next input is a value to be stored. Later in the sequence, a '!' character requests the stored value. The gaps are filled with 0s. Some examples (the final character is the expected output):
0000?10!1
00000?200!2
00?,00!,
00?.0!.
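A minimal sketch of how such examples could be generated, not necessarily the code used for these runs. The function name make_key_recall_example, the value alphabet, and the max_gap parameter are assumptions made for illustration; only the '?', '!', and '0' roles come from the description above.

```python
import random
import string

STORE, RECALL, FILLER = "?", "!", "0"

# Assumed value alphabet: digits, lowercase letters, ',' and '.',
# minus the three reserved characters.
VALUES = [c for c in string.digits + string.ascii_lowercase + ",."
          if c not in (STORE, RECALL, FILLER)]


def make_key_recall_example(rng, max_gap=5):
    """Return (inputs, target) for one Key-Recall sequence.

    inputs: filler, '?', the value, filler, '!'
    target: the value the model should emit after seeing '!'
    """
    value = rng.choice(VALUES)
    lead = FILLER * rng.randint(0, max_gap)  # filler before '?'
    gap = FILLER * rng.randint(0, max_gap)   # filler between the value and '!'
    inputs = lead + STORE + value + gap + RECALL
    return inputs, value


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(4):
        inputs, target = make_key_recall_example(rng)
        # Printed in the same layout as the examples above: inputs then target.
        print(inputs + target)
```

Excluding the reserved characters from the value alphabet keeps each sequence unambiguous: there is exactly one '?' and one '!', and the stored value can never be mistaken for filler.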
[Figure: line charts of avg_loss (panels: "avg_loss", "no outliers")]
Repeated Sequences
Simple next-character prediction on a repeated sequence of four random characters.
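A rough sketch of how this task's data might be built, under stated assumptions: the function name make_repeated_sequence, the lowercase alphabet, and the number of repeats are all hypothetical; only the pattern length of four and the next-character objective come from the description above.

```python
import random
import string


def make_repeated_sequence(rng, pattern_len=4, repeats=8,
                           alphabet=string.ascii_lowercase):
    """Return (inputs, targets) for next-character prediction.

    A random pattern of pattern_len characters is tiled `repeats` times.
    targets is inputs shifted left by one character.
    """
    pattern = "".join(rng.choice(alphabet) for _ in range(pattern_len))
    text = pattern * repeats
    return text[:-1], text[1:]


if __name__ == "__main__":
    rng = random.Random(0)
    inputs, targets = make_repeated_sequence(rng)
    print(inputs)
    print(targets)
```

Once the first full pattern has been seen, every target equals the input from three steps earlier, so the task can be solved purely by recalling recent context.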