
GraphCL: Graph Contrastive Learning Framework with Augmentations

Graph Contrastive Learning framework as outlined in "Graph Contrastive Learning with Augmentations" by You et al.
NOTE: This report is part of a series of reports on Graph Representation Learning. For a brief overview and survey, please refer to the following articles as well.


Introduction

Self-Supervised Learning is a special form of Unsupervised Learning in which the supervision is provided by the data itself. Rather than using labels denoting classes or some other form of target information, if we simply want to learn representations, perhaps to act as a prior for later fine-tuning, we can mask out a part of the data and then train the model to recreate the missing information.
One of the key pillars of Self-Supervised Learning is the family of contrastive methods, in which we distort or perturb the initial data through semantics-preserving augmentations and then train a model on the resulting views. The contrastive bit comes in when we further push the model to group views of the same data point together and away from views of different data points. This is the crux of Contrastive Learning: this simple notion of grouping views of the same instance leads to profound results and has led to great techniques such as SimCLR.
In this article we will cover a simple contrastive framework for Graph Representation Learning called GraphCL, as outlined in the paper "Graph Contrastive Learning with Augmentations" by You et al.
NOTE: We assume a basic understanding of Graph Neural Networks; if you need a quick refresher, the following article is recommended.


👨‍🏫 Method

Figure 1: GraphCL Framework
Similar to the GRACE framework introduced in a related article, A Brief Introduction to Graph Contrastive Learning, the GraphCL framework follows best practices from self-supervised techniques explored with other modalities, viz. a shared encoder and a projection head. The framework GraphCL most closely resembles is SimCLR, which has been explored previously.

The GraphCL framework can be summarised as follows:
  • Given a graph $\mathcal{G}$, we generate two views $\hat{\mathcal{G}_i}, \hat{\mathcal{G}_j}$ by performing augmentations. The authors select these augmentations based on the graph domain.
  • These two views $\hat{\mathcal{G}_i}, \hat{\mathcal{G}_j}$ are then passed through a graph encoder $f(\cdot)$, yielding representations $h_i, h_j$. The graph encoder can be any GNN architecture.
  • These representations are then passed through a projection head $g(\cdot)$, a simple MLP network, which generates two projected views $z_i, z_j$.
  • We then apply a contrastive objective $\mathcal{L}(\cdot)$ between the two views; the objective in this case is the normalized temperature-scaled cross-entropy loss (NT-Xent).
NOTE: As is considered best practice in Self-Supervised Learning, we don't explicitly sample negative pairs; instead, the augmented views of the other graphs in a batch act as the negative pairs.
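To make the pipeline above concrete, here is a minimal sketch of a graph encoder $f(\cdot)$ and projection head $g(\cdot)$ written with PyTorch Geometric. The GIN-style layers, hidden size, and sum pooling are illustrative assumptions; GraphCL itself is agnostic to the encoder architecture.

import torch
from torch.nn import Linear, ReLU, Sequential
from torch_geometric.nn import GINConv, global_add_pool


class Encoder(torch.nn.Module):
    """A small GIN-style graph encoder f(.) producing graph-level representations h."""

    def __init__(self, in_dim: int, hidden_dim: int):
        super().__init__()
        self.conv1 = GINConv(Sequential(Linear(in_dim, hidden_dim), ReLU(), Linear(hidden_dim, hidden_dim)))
        self.conv2 = GINConv(Sequential(Linear(hidden_dim, hidden_dim), ReLU(), Linear(hidden_dim, hidden_dim)))

    def forward(self, x, edge_index, batch):
        x = self.conv1(x, edge_index).relu()
        x = self.conv2(x, edge_index).relu()
        return global_add_pool(x, batch)  # graph-level representation h


class ProjectionHead(torch.nn.Module):
    """A simple MLP g(.) mapping h to the space where the contrastive loss is applied."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        self.mlp = Sequential(Linear(hidden_dim, hidden_dim), ReLU(), Linear(hidden_dim, hidden_dim))

    def forward(self, h):
        return self.mlp(h)  # projected view z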
The four graph augmentations studied in this paper are:
  • Node Dropping: From any given graph $\mathcal{G}$, we randomly "drop" some nodes along with their edges.
  • Edge Perturbation: This involves perturbing the edges in $\mathcal{G}$ by randomly adding or dropping a certain ratio of edges.
  • Attribute Masking: Attribute masking prompts the model to recover masked node attributes using their context information, i.e., the remaining attributes.
  • Subgraph Generation: This involves creating a subgraph from the original graph by performing random walks.
The authors stress the importance of data augmentations and deem them crucial for graph contrastive learning.
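For illustration, here is a minimal sketch of two of these augmentations (node dropping and attribute masking) operating on a PyTorch Geometric Data object. The drop and mask ratios, and masking attributes with zeros, are illustrative assumptions rather than the authors' exact implementation.

import torch
from torch_geometric.data import Data
from torch_geometric.utils import subgraph


def drop_nodes(data: Data, drop_ratio: float = 0.2) -> Data:
    """Randomly drop a fraction of nodes along with their incident edges."""
    keep_mask = torch.rand(data.num_nodes) >= drop_ratio
    edge_index, _ = subgraph(keep_mask, data.edge_index, relabel_nodes=True, num_nodes=data.num_nodes)
    return Data(x=data.x[keep_mask], edge_index=edge_index)


def mask_attributes(data: Data, mask_ratio: float = 0.2) -> Data:
    """Randomly mask the attributes of a fraction of nodes (here, by zeroing them out)."""
    x = data.x.clone()
    mask = torch.rand(x.size(0)) < mask_ratio
    x[mask] = 0.0
    return Data(x=x, edge_index=data.edge_index)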


The overall framework can be generalised as follows:
$$l = \mathbb{E}_{\mathbb{P}_{\hat{\mathcal{G}_i}}} \left\{ -\, \mathbb{E}_{\mathbb{P}_{(\hat{\mathcal{G}_j} \mid \hat{\mathcal{G}_i})}} T\big(f_1(\hat{\mathcal{G}_i}), f_2(\hat{\mathcal{G}_j})\big) \, + \, \log\Big(\mathbb{E}_{\mathbb{P}_{\hat{\mathcal{G}_j}}} e^{T(f_1(\hat{\mathcal{G}_i}), f_2(\hat{\mathcal{G}_j}))}\Big) \right\}$$

where $T(\cdot)$ is an arbitrary learnable score function, usually parameterized with the similarity function $\text{sim}(\cdot, \cdot)$.
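In GraphCL, the score function is instantiated as the cosine similarity between the projected views, scaled by a temperature $\tau$, which yields the NT-Xent objective mentioned above. Below is a minimal batch-wise sketch of this loss; treating only the other view's graphs in the batch as negatives is a simplification of the full formulation.

import torch
import torch.nn.functional as F


def nt_xent_loss(z_i: torch.Tensor, z_j: torch.Tensor, temperature: float = 0.5) -> torch.Tensor:
    """NT-Xent loss between two batches of projected views of shape [batch_size, dim]."""
    z_i = F.normalize(z_i, dim=1)
    z_j = F.normalize(z_j, dim=1)
    # Pairwise cosine similarities between every view in z_i and every view in z_j
    sim = torch.mm(z_i, z_j.t()) / temperature  # [batch_size, batch_size]
    # Diagonal entries are the positive pairs; the other graphs in the batch act as negatives
    labels = torch.arange(z_i.size(0), device=z_i.device)
    loss_i = F.cross_entropy(sim, labels)      # anchor on the first view
    loss_j = F.cross_entropy(sim.t(), labels)  # anchor on the second view
    return (loss_i + loss_j) / 2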

Code

Let's look at the code in an abstract manner, implemented using PyTorch + PyTorch Geometric. The authors have made the original code available.
from typing import List

import torch


class GraphCL(torch.nn.Module):
    ...

    def train_step(
        self,
        augmented_views: List[torch.Tensor],
    ) -> torch.Tensor:
        """
        Perform a single training step.

        Args:
            augmented_views (List[torch.Tensor]): Views generated by performing augmentations.

        Returns:
            torch.Tensor: Contrastive loss for this step.
        """
        # Generate intermediate representations for each augmented view
        intermediate_reps = self.encode(augmented_views)

        # Project the representations into the space where the contrastive loss is applied
        reps = self.projection_head(intermediate_reps)

        # Calculate the contrastive (NT-Xent) loss between the projected views
        loss = self.contrastive_loss(reps)

        return loss
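To show how this abstract class might be driven, here is a hypothetical training loop. The names dataset, augment, and model are illustrative placeholders (they do not come from the authors' code), and the optimizer settings are arbitrary.

import torch
import wandb
from torch_geometric.loader import DataLoader

# Hypothetical setup: `dataset` is a PyTorch Geometric graph dataset, `augment`
# returns an augmented view of a batch, and `model` is a GraphCL instance as above.
loader = DataLoader(dataset, batch_size=128, shuffle=True)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for epoch in range(100):
    for batch in loader:
        optimizer.zero_grad()
        # Two stochastic augmentations of the same batch form the positive pair
        views = [augment(batch), augment(batch)]
        loss = model.train_step(views)
        loss.backward()
        optimizer.step()
        wandb.log({"train/loss": loss.item()})  # track the loss with W&B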

Summary

In this article we briefly went over the paper titled "Graph Contrastive Learning with Augmentations" by Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen, and contrasted (pun not intended) it with a related paper discussed earlier in A Brief Introduction to Graph Contrastive Learning. We also went over the implementation in an abstract manner and observed performance metrics with the power of Weights & Biases Logging.
To see the full suite of W&B features, please check out this short 5-minute guide. If you want more reports covering the math and "from-scratch" code implementations, let us know in the comments down below or on our forum ✨!
Check out these other reports on Fully Connected covering other Geometric Deep Learning topics such as Graph Attention Networks.