Visualizing NLP Attention-Based Models Using Custom Charts

A quick introduction to using a custom chart to visualize attention models in an NLP application: neural machine translation.
Kyle Goyette

Visualizing NLP Attention

Sometimes in NLP we can use attention to understand why a model makes a specific prediction. In neural machine translation, we often use a seq2seq model, where an encoder accepts an input sequence and the decoder attends to each of the encoder's states. In the visualization below, the encoder receives as input the words in the bottom row, while the decoder outputs the words in the top row. We can see that the decoder attends most strongly to the source word that is a direct translation of each output word; for example, to output the word "purse", the decoder attended most strongly to "Geldbörse".
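To make the mechanism concrete, here is a minimal sketch of dot-product attention, one common way a decoder scores the encoder's states. The function name and toy tensors are illustrative, not taken from any particular model in this article.

```python
import torch
import torch.nn.functional as F

def attention_weights(decoder_state, encoder_states):
    """Dot-product attention: score each encoder state against the
    current decoder state, then normalize the scores with softmax.

    decoder_state:  (hidden_dim,)         current decoder hidden state
    encoder_states: (src_len, hidden_dim) one state per source token
    returns:        (src_len,)            weights that sum to 1
    """
    scores = encoder_states @ decoder_state  # one score per source token
    return F.softmax(scores, dim=0)

# Toy example: 4 source tokens, hidden size 8 (values are random)
encoder_states = torch.randn(4, 8)
decoder_state = torch.randn(8)
weights = attention_weights(decoder_state, encoder_states)
print(weights)  # e.g. tensor([0.31, 0.12, 0.49, 0.08]); the largest
                # weight marks the source token attended to most strongly
```

At each decoding step the decoder produces one such weight vector, so a full translation yields a matrix of weights, one row per output word. That matrix is exactly what the chart below visualizes.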

In this visualization, you can hover over any word to highlight the word on the other side with which it has the strongest encoder-decoder connection. Moreover, the strength of each connection is represented by the width of the line that forms it.
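A chart like this typically consumes a flat table with one row per (target word, source word) pair and its attention weight. The sketch below shows that layout under assumed column semantics; the sentence pair and the weight values are hypothetical.

```python
# Hypothetical sentence pair (German source, English output)
src_tokens = ["die", "geldbörse", "ist", "rot"]   # encoder input
tgt_tokens = ["the", "purse", "is", "red"]        # decoder output

# attn[i][j]: weight the decoder placed on source token j
# while emitting target token i (each row sums to 1)
attn = [
    [0.80, 0.05, 0.10, 0.05],
    [0.05, 0.85, 0.05, 0.05],
    [0.10, 0.05, 0.80, 0.05],
    [0.05, 0.05, 0.05, 0.85],
]

# Flatten the matrix into one row per connection in the chart
rows = [
    (tgt, src, weight)
    for tgt, row in zip(tgt_tokens, attn)
    for src, weight in zip(src_tokens, row)
]

# Each row drives one line: the weight sets the line width, and the
# per-token maximum determines the strongest connection shown on hover.
for row in rows:
    print(row)
```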