A Brief Introduction to Mixture Model Networks (MoNet)
This article provides an overview of the Mixture Model Networks (MoNet) architecture, with code examples in PyTorch Geometric and interactive visualizations using W&B.
In this article, we'll briefly go over the mixture model networks (MoNet) architecture proposed in the paper Geometric deep learning on graphs and manifolds using mixture model CNNs by Federico Monti, Davide Boscaini, Jonathan Masci, Emanuele Rodolà, Jan Svoboda, and Michael M. Bronstein.
This is one of the fundamental models in the graph attention networks paradigm. It draws on earlier work in spectral graph theory, but offers a definition of the convolution operation that differs from the spectral-domain one. If you'd like to dig into our associated Colab, you can find that link here:
There are three main classes of models of graph neural networks, namely message-passing graph neural networks, graph convolutional networks and graph attention networks. For a brief overview of the three paradigms, you can refer to the following blog posts:
An Introduction to Graph Attention Networks
This article provides a beginner-friendly introduction to Graph Attention Networks (GATs), which apply deep learning paradigms to graphical data.
An Introduction to Convolutional Graph Neural Networks
This article provides a beginner-friendly introduction to Convolutional Graph Neural Networks (GCNs), which apply deep learning paradigms to graphical data.
An Introduction to Message Passing Graph Neural Networks
This article provides a beginner-friendly introduction to Message Passing Graph Neural Networks (MPGNNs), which apply deep learning paradigms to graphical data.
Table of Contents
Definition of MoNet
MoNet Method
Implementing the MoNet Model
MoNet Model: Training Results
Summary
Recommended Reading
Definition of MoNet
Strictly speaking, MoNet comes under the banner of spatial models as opposed to spectral models. The definition of convolution that we discussed in our introductory article on graph convolutional nets is based on spectral approximations of the convolution operation as it is defined in the Euclidean domain.
Spatial models, arguably, stay truer to the original intention of convolution: they define the operation as template matching over small local "patches."
In particular, these models propose a way to build patches as a function of the local graph or manifold structure and to perform operations over them. From a wider perspective, because the method extracts local patches and then computes a weighting over each patch (template matching), it is generally grouped under the banner of graph attention networks.
MoNet Method
As we explained graph attention networks (GAT) as an extension of a general formulation, we'll do the same here. The general formula for computing intermediate representations in attention-based graph neural networks is:

$$h_i^{(l+1)} = \sigma\left( \sum_{j \in \mathcal{N}(i)} \alpha_{ij} \, \mathbf{W}^{(l)} h_j^{(l)} \right)$$

In the case of mixture model networks (MoNet), the attention mechanism (i.e. the weights $\alpha_{ij}$) and the wider update rule are as follows:

$$h_i^{(l+1)} = \frac{1}{|\mathcal{N}(i)|} \sum_{j \in \mathcal{N}(i)} \frac{1}{K} \sum_{k=1}^{K} w_k(\mathbf{e}_{ij}) \odot \mathbf{\Theta}_k \, h_j^{(l)}$$

Very complicated looking, I know! The $K$ here is the number of kernels, so let's just focus on $w_k(\mathbf{e}_{ij})$, where $\mathbf{e}_{ij}$ are the edge features (pseudo-coordinates) of the edge from node $i$ to node $j$. It's worth mentioning that mixture model networks introduced and studied a family of weighting functions represented as a mixture of Gaussian kernels.
Family of Functions of Gaussian Kernels

$$w_k(\mathbf{e}) = \exp\left( -\tfrac{1}{2} (\mathbf{e} - \boldsymbol{\mu}_k)^{\top} \boldsymbol{\Sigma}_k^{-1} (\mathbf{e} - \boldsymbol{\mu}_k) \right)$$

Here $\boldsymbol{\mu}_k$ and $\boldsymbol{\Sigma}_k$ are the learnable mean vector and (diagonal) covariance matrix of the $k$-th Gaussian kernel.
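To make the weighting function concrete, here is a small sketch that evaluates the Gaussian kernel weights for a single edge's pseudo-coordinate vector, assuming diagonal covariances as GMMConv does. The function name and the toy tensors are ours, purely for illustration:

```python
import torch

def gaussian_kernel_weights(e, mu, sigma_inv_diag):
    """Evaluate w_k(e) = exp(-0.5 * (e - mu_k)^T diag(sigma_k)^{-1} (e - mu_k)) for all k.

    e:              [dim]    pseudo-coordinates of a single edge
    mu:             [K, dim] learnable kernel means
    sigma_inv_diag: [K, dim] learnable inverse diagonal covariances
    """
    diff = e.unsqueeze(0) - mu                                      # [K, dim]
    return torch.exp(-0.5 * (diff ** 2 * sigma_inv_diag).sum(-1))   # [K]

# Toy example: K = 4 kernels over 2-dimensional pseudo-coordinates
e = torch.tensor([0.5, 0.25])
mu = torch.randn(4, 2)
sigma_inv_diag = torch.ones(4, 2)
print(gaussian_kernel_weights(e, mu, sigma_inv_diag))  # one weight per kernel
```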
Implementing the MoNet Model
As with the other models discussed in this series, we turn once again to PyTorch Geometric for its implementation of the mechanism described above and in the paper: GMMConv.
Let's walk through a minimal example implementation:
```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GMMConv


class MoNet(torch.nn.Module):
    def __init__(self, in_channels, hidden_channels, out_channels):
        super().__init__()
        # Two GMMConv layers, each with 2-dimensional pseudo-coordinates and 16 Gaussian kernels
        self.conv1 = GMMConv(in_channels, hidden_channels, dim=2, kernel_size=16)
        self.conv2 = GMMConv(hidden_channels, out_channels, dim=2, kernel_size=16)

    def forward(self, data):
        # edge_attr holds the pseudo-coordinates e_ij used by the Gaussian kernels
        x, edge_index, edge_attr = data.x, data.edge_index, data.edge_attr
        x = F.dropout(x, p=0.5, training=self.training)
        x = F.elu(self.conv1(x, edge_index, edge_attr))
        x = F.dropout(x, p=0.5, training=self.training)
        x = self.conv2(x, edge_index, edge_attr)
        return F.log_softmax(x, dim=1)
```
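GMMConv expects the pseudo-coordinates $\mathbf{e}_{ij}$ in edge_attr, but Cora ships without edge features. One common choice, following the degree-based pseudo-coordinates used in the MoNet paper, is $u(i, j) = (\deg(i)^{-1/2}, \deg(j)^{-1/2})$. Below is a minimal sketch of wiring this up; the dataset root path and the hidden size of 16 are illustrative choices on our part:

```python
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.utils import degree

dataset = Planetoid(root="data/Planetoid", name="Cora")  # root path is illustrative
data = dataset[0]

# Degree-based pseudo-coordinates: one 2-dimensional vector per directed edge
row, col = data.edge_index
deg = degree(row, data.num_nodes).clamp(min=1)
data.edge_attr = torch.stack([deg[row].pow(-0.5), deg[col].pow(-0.5)], dim=-1)

model = MoNet(dataset.num_features, hidden_channels=16,
              out_channels=dataset.num_classes)
out = model(data)  # [num_nodes, num_classes] log-probabilities
```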
MoNet Model: Training Results
We train a few models for 50 epochs to perform node classification on the Cora dataset, using the minimal model implementation shown above, and report the training loss and accuracy, comparing the effect of the hidden dimension on overall performance.
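For reference, a minimal sketch of the training loop behind those curves might look like the following; the optimizer settings and the wandb project name are assumptions on our part rather than the exact configuration of the runs shown below:

```python
import torch
import torch.nn.functional as F
import wandb

wandb.init(project="monet-cora")  # project name is an assumption

optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=5e-4)

for epoch in range(1, 51):
    # One full-batch gradient step on the training nodes
    model.train()
    optimizer.zero_grad()
    out = model(data)
    loss = F.nll_loss(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()

    # Track training accuracy alongside the loss
    model.eval()
    with torch.no_grad():
        pred = model(data).argmax(dim=-1)
        acc = (pred[data.train_mask] == data.y[data.train_mask]).float().mean()

    wandb.log({"epoch": epoch, "train/loss": loss.item(), "train/acc": acc.item()})
```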
[W&B panel: training loss and accuracy curves for the run set (3 runs)]
Summary
In this article, we learned about the Mixture Model Networks (MoNet) architecture, along with code and interactive visualizations. To see the full suite of W&B features, please check out this short five-minute guide.
If you want more reports covering graph neural networks with code implementations, let us know in the comments below or on our forum ✨!
Check out these other reports on Fully Connected covering other Graph Neural Networks-based topics and ideas.
Recommended Reading
An Introduction to GraphSAGE
This article provides an overview of the GraphSAGE neural network architecture, complete with code examples in PyTorch Geometric, and visualizations using W&B.
A Brief Introduction to Residual Gated Graph Convolutional Networks
This article provides a brief overview of the Residual Gated Graph Convolutional Network architecture, complete with code examples in PyTorch Geometric and interactive visualizations using W&B.
What are Graph Isomorphism Networks?
This article provides a brief overview of Graph Isomorphism Networks (GIN), complete with code examples in PyTorch Geometric and interactive visualizations using W&B.
A Brief Introduction to Graph Attention Networks
This article provides a brief overview of the Graph Attention Networks architecture, complete with code examples in PyTorch Geometric and interactive visualizations using W&B.