
Introduction to Graph Neural Networks

Interested in Graph Neural Networks and want a roadmap on how to get started? In this article, we'll give a brief outline of the field and share blogs and resources!
Over the past few years, there has been growing interest in the use of graph data in machine learning. Why is that? Well, graphs provide an excellent mathematical structure for representing molecules (leading to groundbreaking research like AlphaFold) and many other kinds of networks. They have also recently emerged as the key meta-structure for everything: modalities such as vision, text and speech can be seen as special cases of graphs, and so Graph Representation Learning has become increasingly important.
In this article we will provide a brief overview of the field based on the excellent survey paper Everything is Connected: Graph Neural Networks by Petar Veličković and share key ideas, important notation and links to other relevant blogs.
Source: Figure 1 of Everything is Connected: Graph Neural Networks by Petar Veličković

Important Notation


  • Let's start by defining a simple and modular description of a graph. A graph is defined as a tuple of sets $\mathcal{G} = (\mathcal{V}, \mathcal{E})$: a set of nodes $\mathcal{V}$ and a set of edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ consisting of pairs of nodes that are connected. Both nodes and edges can have properties which are of interest to us.
  • Every node $u \in \mathcal{V}$ is said to have a $k$-dimensional feature $x_u \in \mathbb{R}^k$. If we stack the features of all the nodes, we get the (node) feature matrix $X = [x_1, \ldots, x_{|\mathcal{V}|}]^{\top}$.
  • There are a number of ways in which we could store information about the edges $\mathcal{E}$; the most common is an adjacency matrix $A \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{V}|}$, where

$$a_{uv} = \begin{cases} 1 & (u,v) \in \mathcal{E} \\ 0 & (u,v) \notin \mathcal{E} \end{cases}$$

  • The graph structure also gives every node a notion of locality, usually described as the nodes surrounding, or in the neighborhood of, some node $u$:

$$\mathcal{N}_u = \{\, v \mid (u,v) \in \mathcal{E} \lor (v,u) \in \mathcal{E} \,\}$$

And the multiset of all neighborhood features, $X_{\mathcal{N}_u}$, can be defined as:

$$X_{\mathcal{N}_u} = \{\{\, x_v \mid v \in \mathcal{N}_u \,\}\}$$

  • We define a local function, i.e. the function which computes the features of a local region of the graph, as

$$h_u = \phi(x_u, X_{\mathcal{N}_u})$$

A short code sketch of this notation follows below.
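To make this notation concrete, here is a minimal NumPy sketch for a small toy graph. The graph, feature dimension, and helper names below are illustrative assumptions, not something prescribed by the survey:

```python
import numpy as np

# A toy undirected graph with |V| = 4 nodes (illustrative only).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
num_nodes, k = 4, 3  # k = feature dimension

# Node feature matrix X of shape (|V|, k).
X = np.random.randn(num_nodes, k)

# Adjacency matrix A, with a_uv = 1 if (u, v) is an edge, else 0.
A = np.zeros((num_nodes, num_nodes))
for u, v in edges:
    A[u, v] = A[v, u] = 1  # undirected: store both directions

def neighborhood(u):
    """N_u: all nodes v with an edge to or from u."""
    return np.nonzero(A[u])[0]

def neighborhood_features(u):
    """X_{N_u}: the (multi)set of neighbor features, stacked as rows."""
    return X[neighborhood(u)]

def phi(u):
    """A simple permutation-invariant local function: concatenate x_u with the sum of its neighbors' features."""
    return np.concatenate([X[u], neighborhood_features(u).sum(axis=0)])

h_u = phi(2)  # latent feature h_u for node u = 2
```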


Equivariance and Invariance

Most modern-day geometric deep learning relies on exploiting the underlying symmetries of our natural world. These symmetries are exploited through various properties; we shall discuss two such properties here, taking permutation as an example of a symmetry operation:
  • Invariance: "shuffling the input doesn't change the output." A function is said to be permutation invariant if shuffling the order of the inputs leaves the output unchanged. In the case of graphs, this is realised by using a permutation matrix $P$ to reorder the nodes, which permutes the adjacency matrix as $PAP^{\top}$. Thus, permutation invariance can be formalized as:

$$f(PX, PAP^{\top}) = f(X, A)$$

  • Equivariance: "shuffling the input also shuffles the output." A function is said to be permutation equivariant if shuffling the order of the inputs shuffles the output in the same way. Similarly to the above, we can formalise permutation equivariance as:

$$F(PX, PAP^{\top}) = P\,F(X, A)$$

A quick numerical check of both properties is sketched below.
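As a sanity check, the snippet below verifies the two identities numerically for a sum-pooling readout $f$ and a neighborhood-aggregation map $F$; both functions and the random graph are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 5, 3
X = rng.standard_normal((n, k))                 # node features
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.maximum(A, A.T)                          # symmetric adjacency

# A random permutation matrix P.
P = np.eye(n)[rng.permutation(n)]

def f(X, A):
    """Permutation-invariant graph readout: sum-pool the aggregated features."""
    return (A @ X).sum(axis=0)

def F(X, A):
    """Permutation-equivariant node-wise aggregation of neighbor features."""
    return A @ X

# f(PX, PAP^T) == f(X, A)      (invariance)
assert np.allclose(f(P @ X, P @ A @ P.T), f(X, A))

# F(PX, PAP^T) == P F(X, A)    (equivariance)
assert np.allclose(F(P @ X, P @ A @ P.T), P @ F(X, A))
```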

Other symmetries, such as shift and rotation, are also observed; they have profound implications in domains such as vision and molecular modeling.

Graph Neural Networks

Defining the aforementioned local function is of key importance in graph representation learning, and much of the field revolves around designing good permutation-invariant local functions $\phi$ that exhibit the desired symmetry and computational properties. Most methods can be grouped into three broad classes, namely Graph Convolutional Networks, Graph Attentional Networks and Message Passing Graph Neural Networks. To learn more, I'd recommend going through the introductory articles and key methods in each family below; a minimal code sketch of each flavor is also included under the corresponding sections.
  1. Convolutional
$$h_u = \phi\left(x_u, \bigoplus_{v \in \mathcal{N}_u} c_{vu}\, \psi(x_v)\right)$$

  2. Attentional
$$h_u = \phi\left(x_u, \bigoplus_{v \in \mathcal{N}_u} a(x_u, x_v)\, \psi(x_v)\right)$$

  3. Message Passing
$$h_u = \phi\left(x_u, \bigoplus_{v \in \mathcal{N}_u} \psi(x_u, x_v)\right)$$


Graph Convolutional Networks (GCN)
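As an illustration of the convolutional flavor, here is a minimal GCN-style layer in PyTorch with fixed normalization coefficients $c_{vu} = 1/\sqrt{d_u d_v}$ (roughly in the spirit of Kipf & Welling). The class name, dense adjacency input and hyperparameters are illustrative choices, not a reference implementation:

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """Convolutional flavor: neighbors are aggregated with fixed weights c_vu."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)  # shared transform psi

    def forward(self, X, A):
        # Add self-loops so x_u itself contributes to h_u.
        A_hat = A + torch.eye(A.size(0), device=A.device)
        # Symmetric normalization: c_vu = 1 / sqrt(d_u * d_v).
        deg = A_hat.sum(dim=1)
        D_inv_sqrt = torch.diag(deg.pow(-0.5))
        A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
        # Sum-aggregate transformed neighbor features, then apply the update phi (here: ReLU).
        return torch.relu(A_norm @ self.linear(X))

# Usage: X is (num_nodes, in_dim), A is a dense (num_nodes, num_nodes) adjacency.
X = torch.randn(4, 8)
A = torch.tensor([[0., 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]])
h = GCNLayer(8, 16)(X, A)  # -> shape (4, 16)
```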



Graph Attentional Networks (GAT)
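For the attentional flavor, here is a minimal single-head layer loosely in the spirit of GAT, where the aggregation weights $a(x_u, x_v)$ are computed from the features themselves. Again, this is a sketch with assumed names and a dense adjacency, not a reference implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Attentional flavor: aggregation weights a(x_u, x_v) are learned from the features."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)  # psi
        self.attn = nn.Linear(2 * out_dim, 1)     # scores a(x_u, x_v)

    def forward(self, X, A):
        H = self.linear(X)                        # (N, out_dim)
        N = H.size(0)
        # Pairwise attention logits from concatenated [h_u || h_v].
        h_u = H.unsqueeze(1).expand(N, N, -1)
        h_v = H.unsqueeze(0).expand(N, N, -1)
        e = F.leaky_relu(self.attn(torch.cat([h_u, h_v], dim=-1))).squeeze(-1)
        # Mask out non-edges (keeping self-loops) before a softmax over each neighborhood.
        mask = (A + torch.eye(N, device=A.device)) > 0
        e = e.masked_fill(~mask, float("-inf"))
        alpha = torch.softmax(e, dim=1)           # each row sums to 1 over N_u
        return torch.relu(alpha @ H)              # weighted sum, then the update phi

X = torch.randn(4, 8)
A = torch.tensor([[0., 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]])
h = GATLayer(8, 16)(X, A)  # -> shape (4, 16)
```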



Message Passing Graph Neural Networks (MPGNN)
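Finally, a minimal message-passing layer, where a message function $\psi$ acts on both the sender and receiver features before aggregation. The edge-list format and all names below are assumptions made for this sketch:

```python
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """Message-passing flavor: messages psi(x_u, x_v) are computed per edge, then aggregated."""

    def __init__(self, in_dim, msg_dim, out_dim):
        super().__init__()
        self.psi = nn.Sequential(nn.Linear(2 * in_dim, msg_dim), nn.ReLU())        # message function
        self.phi = nn.Sequential(nn.Linear(in_dim + msg_dim, out_dim), nn.ReLU())  # update function

    def forward(self, X, edge_index):
        # edge_index is a (2, num_edges) tensor of (sender v, receiver u) pairs.
        v, u = edge_index
        # Compute a message for every edge from the pair (x_u, x_v).
        messages = self.psi(torch.cat([X[u], X[v]], dim=-1))
        # Aggregate messages per receiver node with a sum (the oplus operator).
        agg = torch.zeros(X.size(0), messages.size(-1), device=X.device)
        agg.index_add_(0, u, messages)
        # Update: h_u = phi(x_u, aggregated messages).
        return self.phi(torch.cat([X, agg], dim=-1))

X = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 2, 2], [1, 0, 0, 3]])  # edges v -> u
h = MessagePassingLayer(8, 16, 16)(X, edge_index)  # -> shape (4, 16)
```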



Summary

In this article we attempted to provide a simple notation for graphs and introduce key methods in Graph Representation Learning. We also looked at some important concepts such as Permutation Invariance and Equivariance.
If you want more reports covering graph neural networks with code implementations, let us know in the comments below or on our community discord ✨!
Check out these other reports on Fully Connected covering other Graph Neural Network-based topics and ideas.
To see the full suite of W&B features, please check out this short 5-minute guide.

Further Reading

