
Graph Neural Networks (GNNs) with Learnable Structural and Positional Representations

An in-depth breakdown of "Graph Neural Networks with Learnable Structural and Positional Representations" by Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio and Xavier Bresson.
Created on January 21 | Last edited on February 24

Read the Paper | Official Implementation | W&B Implementation | W&B Dashboard




🧐 Motivation

Most GNNs are designed with a message-passing mechanism that builds node representations by aggregating local neighbourhood information. This class of GNNs is fundamentally structural, i.e. a node's representation depends only on the local structure of the graph.
As a consequence, most nodes in a graph do not carry any canonical positional information. This lack of positional information is a significant limitation: it lowers the representation power of the model, which cannot differentiate between simple graph symmetries.
Figure 1: The typical structure of a Graph Neural Network (GNN). Taking a molecule as an example, the node features $h_i, h_j$ can represent the atom type (e.g. Hydrogen or Oxygen) and the edge features $e_{ij}$ can represent the bond type (e.g. covalent or ionic).
For example, two atoms in a molecule with the same neighbourhood are expected to have similar representations. However, this can be limiting: the positions of the two atoms in the molecule are distinct, and their roles may be different. As a consequence, popular MP-GNNs (Message Passing Graph Neural Networks) fail to differentiate two nodes with the same 1-hop local structure. This is now well understood in the context of the equivalence of MP-GNNs with the Weisfeiler-Lehman (WL) test for graph isomorphism.
To mitigate these issues, the authors propose a new framework called LSPE (Learnable Structural and Positional Embeddings) that can be used with any MP-GNN to learn both structural and positional representations at the same time.
These limitations can be partially alleviated by:
  1. Stacking multiple layers: this helps propagate information from a node across multiple hops, but remains deficient for long-distance nodes because of "over-squashing".
  2. Applying higher-order GNNs: these compute higher-order node-tuple aggregations, as in WL-based GNNs, though such models are computationally expensive.
  3. Considering PE for nodes/edges: this provides global positional attributes for the nodes in a graph and can help the network identify sub-structures.

🙇‍♂️ Method


Standard Message Passing Graph Neural Networks (MP-GNNs)

A standard Message Passing Graph Neural Network layer can be written as follows:
$$
\begin{aligned}
h^{l+1}_{i} &= f_h\big(\, h^{l}_{i},\ \{ h^l_j \}_{j \in \mathcal{N}_i},\ e^l_{ij} \,\big) \\
e^{l+1}_{ij} &= f_e\big(\, h^l_i,\ h^l_j,\ e^l_{ij} \,\big)
\end{aligned}
$$

where $h^{l+1}_i, h^l_i$ are the node structural embeddings, $\mathcal{N}_i$ denotes the neighbourhood of node $i$, $e^{l+1}_{ij}, e^l_{ij}$ are the edge features, and $f_h, f_e$ are functions with learnable parameters.
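To make the abstract update concrete, here is a minimal sketch of one such layer in PyTorch. This is not the authors' implementation: $f_h$ and $f_e$ are taken to be small MLPs, and the neighbourhood is aggregated with a simple sum of edge-conditioned messages, which is just one of many valid choices.

```python
# A minimal sketch (not the authors' exact architecture) of one MP-GNN layer
# implementing h_i^{l+1} = f_h(h_i, {h_j}, e_ij) and e_ij^{l+1} = f_e(h_i, h_j, e_ij).
import torch
import torch.nn as nn


class MPGNNLayer(nn.Module):
    def __init__(self, d: int):
        super().__init__()
        # f_h and f_e are assumed to be small MLPs here; the paper leaves them
        # architecture-specific (GatedGCN, PNA, Transformer, ...).
        self.f_h = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))
        self.f_e = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, d))

    def forward(self, h, e, edge_index):
        # h: [N, d] node embeddings, e: [E, d] edge embeddings,
        # edge_index: [2, E] holding (source j, target i) for every edge
        src, dst = edge_index
        # message from neighbour j to node i, modulated by the edge feature e_ij
        msg = torch.relu(h[src] + e)
        # sum-aggregate the messages over the neighbourhood N_i
        agg = torch.zeros_like(h).index_add_(0, dst, msg)
        h_new = self.f_h(torch.cat([h, agg], dim=-1))             # h_i^{l+1}
        e_new = self.f_e(torch.cat([h[dst], h[src], e], dim=-1))  # e_ij^{l+1}
        return h_new, e_new
```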

Input Features and Initialization

The initial node and edge embeddings are produced by a linear embedding of the available input node and edge features.
$$
\begin{aligned}
h^{l=0}_i &= LL_h(h^{in}_{i}) = A^{0} h^{in}_i + a^0 \ \in \mathbb{R}^d \\
e^{l=0}_{ij} &= LL_e(e^{in}_{ij}) = B^0 e^{in}_{ij} + b^0 \ \in \mathbb{R}^d
\end{aligned}
$$

where:
  • $h^{in}_i \in \mathbb{R}^{d_v}$ and $e^{in}_{ij} \in \mathbb{R}^{d_e}$ are the raw input node and edge features,
  • $A^0 \in \mathbb{R}^{d \times d_v}$, $B^0 \in \mathbb{R}^{d \times d_e}$, and $a^0, b^0 \in \mathbb{R}^d$ are learnable.
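In code, this initialization is nothing more than two linear layers, one for nodes and one for edges. A minimal sketch follows; the feature dimensions are illustrative assumptions, not tied to a particular dataset.

```python
# A minimal sketch of the input embedding step: LL_h and LL_e are plain linear
# layers mapping raw node/edge features into R^d.
import torch
import torch.nn as nn

d_v, d_e, d = 28, 4, 64           # illustrative feature/embedding sizes
h_in = torch.randn(10, d_v)       # 10 nodes with raw features h_i^in
e_in = torch.randn(30, d_e)       # 30 edges with raw features e_ij^in

LL_h = nn.Linear(d_v, d)          # implements A^0 h_i^in + a^0
LL_e = nn.Linear(d_e, d)          # implements B^0 e_ij^in + b^0

h0, e0 = LL_h(h_in), LL_e(e_in)   # h_i^{l=0}, e_ij^{l=0} in R^d
```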

Positional Encoding



Existing MP-GNNs incorporate Positional Encodings (PE) into the input node features by concatenation.
$$
\begin{aligned}
h^{l+1}_{i} &= f_h\big(\, h^{l}_{i},\ \{ h^l_j \}_{j \in \mathcal{N}_i},\ e^l_{ij} \,\big) \\
e^{l+1}_{ij} &= f_e\big(\, h^l_i,\ h^l_j,\ e^l_{ij} \,\big)
\end{aligned}
$$

with the initialization:
$$
\begin{aligned}
h^{l=0}_{i} &= LL_h\!\left(\begin{bmatrix} h^{in}_{i} \\ p^{in}_{i} \end{bmatrix}\right) = D^0 \begin{bmatrix} h^{in}_{i} \\ p^{in}_i \end{bmatrix} + d^0 \\
e^{l=0}_{ij} &= LL_e (e^{in}_{ij}) = B^0\, e^{in}_{ij} + b^0
\end{aligned}
$$
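Below is a hedged sketch of how such an input-level PE is typically obtained and attached: we compute the first $k$ non-trivial eigenvectors of the symmetric-normalized graph Laplacian (the Laplacian PE used in the paper) and concatenate them with the raw node features before the first linear embedding. The dense-matrix eigendecomposition and the toy random graph are simplifying assumptions for illustration only.

```python
# A minimal sketch of Laplacian PE concatenated to the input node features.
import torch
import torch.nn as nn


def laplacian_pe(adj: torch.Tensor, k: int) -> torch.Tensor:
    # adj: [N, N] dense adjacency matrix; returns the k smallest non-trivial eigenvectors
    deg = adj.sum(dim=1)
    d_inv_sqrt = torch.where(deg > 0, deg.pow(-0.5), torch.zeros_like(deg))
    lap = torch.eye(adj.size(0)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    eigval, eigvec = torch.linalg.eigh(lap)   # eigenvalues in ascending order
    return eigvec[:, 1:k + 1]                 # drop the trivial constant eigenvector


N, d_v, k, d = 8, 28, 4, 64                   # toy sizes, chosen for illustration
adj = (torch.rand(N, N) > 0.7).float()
adj = ((adj + adj.T) > 0).float().fill_diagonal_(0)   # symmetric, no self-loops
h_in = torch.randn(N, d_v)
p_in = laplacian_pe(adj, k)                   # p_i^in, fixed (not learned) here

LL_h = nn.Linear(d_v + k, d)                  # D^0 [h_i^in ; p_i^in] + d^0
h0 = LL_h(torch.cat([h_in, p_in], dim=-1))    # h_i^{l=0}
```

Note that Laplacian eigenvectors are only defined up to a sign, which is why they are usually randomly sign-flipped during training when used as fixed input PE.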


⭐️ ⭐️ Decoupling Position and Structure in MP-GNNs ⭐️ ⭐️



$$
\begin{aligned}
h^{l+1}_{i} &= f_h\!\left(\begin{bmatrix} h^l_i \\ p^l_i \end{bmatrix},\ \left\{ \begin{bmatrix} h^l_j \\ p^l_j \end{bmatrix} \right\}_{j \in \mathcal{N}_i},\ e^{l}_{ij} \right) \\
e^{l+1}_{ij} &= f_e\big(\, h^l_i,\ h^l_j,\ e^{l}_{ij} \,\big) \\
p^{l+1}_{i} &= f_p\big(\, p^l_i,\ \{ p^l_{j} \}_{j \in \mathcal{N}_i},\ e^{l}_{ij} \,\big)
\end{aligned}
$$

with the initialization:
$$
p^{l=0}_{i} = LL_p(\, p^{PE}_{i} \,) = C^0 p^{PE}_{i} + c^0 \ \in \mathbb{R}^d
$$
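The sketch below (again not the authors' exact GatedGCN-LSPE or PNA-LSPE layers) shows what the decoupling looks like in code: the structural channel `h` is updated from the concatenated $[h\,;\,p]$ vectors of the node and its neighbours, the edge channel is unchanged, and the positional channel `p` gets its own message-passing function `f_p`. The edge-gated sum aggregation and the bounded `tanh` on the positional update are design assumptions on my part; inputs are assumed to be already embedded to dimension $d$ (with `p` initialized from $LL_p(p^{PE})$ as above).

```python
# A minimal sketch of one decoupled LSPE layer with separate h, p and e channels.
import torch
import torch.nn as nn


class LSPELayer(nn.Module):
    def __init__(self, d: int):
        super().__init__()
        self.f_h = nn.Sequential(nn.Linear(4 * d, d), nn.ReLU(), nn.Linear(d, d))
        self.f_e = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, d))
        self.f_p = nn.Sequential(nn.Linear(2 * d, d), nn.ReLU(), nn.Linear(d, d))

    def forward(self, h, p, e, edge_index):
        # h, p: [N, d]; e: [E, d]; edge_index: [2, E] with (source j, target i)
        src, dst = edge_index
        hp = torch.cat([h, p], dim=-1)                        # [h_i ; p_i], shape [N, 2d]

        # structural update: messages carry the neighbour's [h_j ; p_j], gated by e_ij
        msg_h = hp[src] * torch.sigmoid(e).repeat(1, 2)
        agg_h = torch.zeros_like(hp).index_add_(0, dst, msg_h)
        h_new = self.f_h(torch.cat([hp, agg_h], dim=-1))      # f_h([h_i;p_i], {[h_j;p_j]}, e_ij)

        # edge update is unchanged: f_e(h_i, h_j, e_ij)
        e_new = self.f_e(torch.cat([h[dst], h[src], e], dim=-1))

        # positional update: f_p(p_i, {p_j}, e_ij), kept separate from the h channel;
        # tanh bounds the learned PE (a design choice in this sketch)
        msg_p = p[src] * torch.sigmoid(e)
        agg_p = torch.zeros_like(p).index_add_(0, dst, msg_p)
        p_new = torch.tanh(self.f_p(torch.cat([p, agg_p], dim=-1)))
        return h_new, p_new, e_new
```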


Positional Loss



$$
\text{Loss} = \text{Loss}_{\text{Task}}\!\left( \begin{bmatrix} h^{l=L} \\ p^{l=L} \end{bmatrix} \right) + \alpha\, \text{Loss}_{\text{LapEig}}\big(p^{l=L}\big)
$$
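The report only states the combined objective above, so here is a hedged sketch of one common way to write the Laplacian-eigenvector term: a Rayleigh-quotient-style trace that pushes the learned positional vectors towards the low-frequency eigenvectors of the graph Laplacian, plus an orthonormality penalty so the $k$ learned dimensions do not collapse onto each other. The weighting (`alpha`, `lam`) and normalization follow the paper only in spirit.

```python
# A hedged sketch of the combined objective with a trace-based positional regularizer.
import torch


def positional_loss(p: torch.Tensor, lap: torch.Tensor, lam: float = 1.0) -> torch.Tensor:
    # p: [N, k] learned positional embeddings at the final layer
    # lap: [N, N] graph Laplacian of the same graph
    k = p.shape[1]
    smoothness = torch.trace(p.T @ lap @ p) / k         # small when p spans low-frequency eigenvectors
    ortho = ((p.T @ p - torch.eye(k)) ** 2).sum() / k   # keep the k learned dimensions orthonormal
    return smoothness + lam * ortho


def total_loss(task_loss, p, lap, alpha: float = 0.1):
    # Loss = Loss_Task + alpha * Loss_LapEig, as in the equation above
    return task_loss + alpha * positional_loss(p, lap)
```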

Figure 2: The proposed method, which incorporates learnable Positional Embeddings (PE) alongside the standard structural representation to learn canonical positional information.

📈 Experiments

For the purposes of this report, we use the ZINC dataset for graph regression, comparing various architectures and positional embedding types. As the plots show, the actual and learned eigenvectors for the trained models look very similar, suggesting that this methodology does lead to the learning of canonical positional information.
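If you want to reproduce this eigenvector comparison in your own W&B dashboard, a small sketch like the following works; `log_eigvec_comparison` is a hypothetical helper, and the project name is an assumption.

```python
# A small sketch of logging true-vs-learned eigenvector plots to W&B with matplotlib.
import matplotlib.pyplot as plt
import wandb

wandb.init(project="gnn-lspe")  # project name is an assumption


def log_eigvec_comparison(true_vec, learned_vec, idx: int):
    # true_vec, learned_vec: 1-D arrays for a single graph and a single PE dimension
    fig, ax = plt.subplots()
    ax.plot(true_vec, label=f"Laplacian eigenvector {idx}")
    ax.plot(learned_vec, label=f"learned positional dim {idx}")
    ax.set_xlabel("node index")
    ax.legend()
    wandb.log({f"eigvec_comparison/{idx}": wandb.Image(fig)})
    plt.close(fig)
```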



✌️ Conclusion

In this report we explored "Graph Neural Networks with Learnable Structural and Positional Representations" by Vijay Prakash Dwivedi, Anh Tuan Luu, Thomas Laurent, Yoshua Bengio and Xavier Bresson, in which the authors propose a new method to incorporate canonical positional information, initialized from Laplacian eigenvectors, into the model training paradigm.
This report will also be followed by a full Graph Neural Network series (from scratch to SOTA) in the coming months, so stay tuned to Fully Connected for that! Also check out the other reports on Fully Connected covering other hot topics in Graph Neural Networks.
If you want to cite the paper in your own research, please use the following BibTeX:
```bibtex
@article{DBLP:journals/corr/abs-2110-07875,
  author     = {Vijay Prakash Dwivedi and
                Anh Tuan Luu and
                Thomas Laurent and
                Yoshua Bengio and
                Xavier Bresson},
  title      = {Graph Neural Networks with Learnable Structural and Positional Representations},
  journal    = {CoRR},
  volume     = {abs/2110.07875},
  year       = {2021},
  url        = {https://arxiv.org/abs/2110.07875},
  eprinttype = {arXiv},
  eprint     = {2110.07875},
  timestamp  = {Fri, 22 Oct 2021 13:33:09 +0200},
  biburl     = {https://dblp.org/rec/journals/corr/abs-2110-07875.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}
```
