Visualizing The Effect of Attention on Gradient Flow Using Custom Charts
In this article, we take a look at gradient propagation in attentive recurrent models using the Weights & Biases Custom Charts feature.
Attentive recurrent models can be painful to inspect. I was interested in finding out how the structure of learned attention mechanisms interacted with gradient flow in sequential models.
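To ground this, here is a minimal sketch (PyTorch, with a toy architecture I'm assuming for illustration, not the exact model behind this report) of one way to measure how much gradient reaches each time step: retain the gradient on the recurrent encoder's per-step hidden states, backpropagate a loss, and read off the per-step gradient norms.

```python
import torch
import torch.nn as nn

class AttentiveRNN(nn.Module):
    """Toy GRU encoder that self-attends over its own hidden states."""
    def __init__(self, input_dim=8, hidden_dim=32):
        super().__init__()
        self.hidden_dim = hidden_dim
        self.rnn = nn.GRU(input_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, input_dim)  # per-step denoised output

    def forward(self, x):
        hs, _ = self.rnn(x)                           # (batch, T, hidden)
        hs.retain_grad()                              # keep per-step gradients
        scores = hs @ hs.transpose(1, 2) / self.hidden_dim ** 0.5
        attn = scores.softmax(dim=-1)                 # (batch, T, T)
        attended = attn @ hs                          # (batch, T, hidden)
        return self.head(attended), attn, hs

model = AttentiveRNN()
noisy = torch.randn(4, 20, 8)                         # batch of noisy sequences
clean = torch.randn(4, 20, 8)                         # placeholder clean targets
y_hat, attn, hs = model(noisy)
nn.functional.mse_loss(y_hat, clean).backward()

# L2 norm of the gradient reaching each time step's hidden state,
# averaged over the batch: a direct readout of gradient flow per step.
grad_per_step = hs.grad.norm(dim=-1).mean(dim=0)      # shape (T,)
print(grad_per_step)
```

In a plain recurrent model, `grad_per_step` tends to decay as you move away from the time steps the loss depends on; attention adds shortcut connections that let gradient jump directly to strongly attended steps.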
To see this directly, I created the visualization below. Hovering over any point in the chart shows the strength of the connection between the selected time step and every other time step. From this, I could clearly see how my model leveraged attention to solve the denoising task, and how gradients flowed through the learned structure.
[Interactive Custom Chart panel (one run set): attention connection strengths between time steps, shown on hover]
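For a chart like this, the attention data has to be logged first. Below is a hedged sketch of how the attention matrix from the snippet above could be logged as a `wandb.Table`; the project name and column names are my own placeholders, not this report's actual setup.

```python
import wandb

run = wandb.init(project="attention-gradient-flow")   # placeholder project name

# One row per (query step, key step) pair, using the first
# sequence in the batch from the snippet above.
table = wandb.Table(columns=["query_step", "key_step", "weight"])
attn_matrix = attn[0].detach()                        # (T, T)
for q in range(attn_matrix.shape[0]):
    for k in range(attn_matrix.shape[1]):
        table.add_data(q, k, attn_matrix[q, k].item())

run.log({"attention": table})
run.finish()
```

A Custom Chart's Vega spec can then bind to this table to draw the grid of time-step pairs and wire the hover selection to highlight the selected step's connections.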