Machine Learning With Graphs
This wandb report contains my class notes for the Stanford CS224W Machine Learning with Graphs course.
I came across an interesting Kaggle problem that introduced me to the term Graph Neural Networks, and I went looking for resources on how deep learning works with graph-structured datasets. I found several resources online, but they were too hard for me to fully understand; I kept having more questions after reading them. Some introductions to GNNs jumped straight into spectral methods on graphs. I gave up for a while, then decided to dive into PyTorch Geometric's excellent documentation and played with some toy problems on Colab. I understood what sort of problems we can solve using GNNs, but I still wasn't comfortable with the core intuitions. I couldn't figure out how the various convolution layers in a PyTorch model change the tensor dimensions. I had doubts like: "OK, so they say they are doing a convolution on a graph. Does that mean the graph becomes smaller and smaller as we add more layers? If it does, how does it even decide how to shrink?"
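(It turns out the answer is simple: a graph convolution doesn't shrink the graph. The number of nodes stays fixed from layer to layer; only the per-node feature dimension changes, and that is set by each layer's weight matrix. Here's a minimal NumPy sketch of the idea using mean aggregation over neighbors, a simplified variant of what GCN does; the toy graph and feature sizes are made up for illustration.)

```python
import numpy as np

# Toy graph: 4 nodes in a cycle (0-1, 1-2, 2-3, 3-0)
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
A_hat = A + np.eye(4)                       # add self-loops
D_inv = np.diag(1.0 / A_hat.sum(axis=1))    # mean-aggregation normalization

X = np.random.randn(4, 16)                  # 4 nodes, 16 input features
W1 = np.random.randn(16, 8)                 # layer 1 maps features 16 -> 8
W2 = np.random.randn(8, 2)                  # layer 2 maps features 8 -> 2

H1 = np.maximum(D_inv @ A_hat @ X @ W1, 0)  # one "graph convolution" + ReLU
H2 = D_inv @ A_hat @ H1 @ W2                # a second graph convolution

print(X.shape, H1.shape, H2.shape)          # (4, 16) (4, 8) (4, 2)
```

Notice the first dimension is always 4: stacking more layers only changes the feature dimension. The graph itself never gets smaller unless you add an explicit pooling/coarsening layer.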
Finally, I came across this excellent lecture series by Jure Leskovec. It introduced me not just to Graph Neural Networks, but to how the whole field of machine learning applies to graphs, and it gave me the intuitions I needed to understand GNNs and other advanced concepts. The lectures cover many concepts very specific to graphs: beyond the obvious features and properties we might naively engineer for networks, there are so many cool ideas like motifs, graphlets, null models for feature engineering, and spectral methods. Jure Leskovec does a great job of making all these concepts approachable and intuitive. Most of them are not hard; in some cases the math behind them might be, but what they describe and the results are easy to understand.
I wanted to write these notes partly as a comprehensive reference for myself, and partly to make these concepts approachable to anyone who is new to Graph Neural Networks. I tried my best to make sure my past, GNN-ignorant self would understand these notes and build a strong intuition from them.
How to use these notes:
If you are only interested in Graph Neural Networks, you can just read Chapters 8, 9, 17, 18, and 19. The course introduces GCN, GraphSAGE, GAT, GIN, PinSAGE, GCPN, and other neural networks. Chapter 9 introduces PyTorch Geometric; for those notes I don't follow the actual lecture. Instead, I trained a few GNNs myself and included the Colab notebook. If you are interested in graph generation algorithms, check out Chapter 10, where I wrote about GraphRNN.
But I recommend at least glancing through the first 10 chapters, where they introduce ML basics in terms of graphs. Skipping them would be like jumping straight to training an image classifier with CNNs without ever learning how to engineer features for a simple 2-dimensional linear regression in normal ML. You would also miss things like the two kinds of clustering possible in graphs: one for role detection and one for community detection. The Spectral Clustering notes might look math-heavy, but there is only one core intuition you need to grasp to understand what it does. Chapters 11 to 16 deal with other aspects of graphs, like PageRank, cascades, epidemics, and influence maximization.
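For the curious, that one spectral clustering intuition is: the eigenvector of the graph Laplacian for the second-smallest eigenvalue (the Fiedler vector) assigns a value to every node, and the signs of those values split the graph along a sparse cut. A tiny NumPy sketch on a toy graph of my own making (not from the lectures):

```python
import numpy as np

# Two triangles {0,1,2} and {3,4,5} joined by the single edge (2, 3)
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

L = np.diag(A.sum(axis=1)) - A   # graph Laplacian L = D - A

vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
fiedler = vecs[:, 1]             # eigenvector for lambda_2
labels = (fiedler > 0).astype(int)  # sign of each entry = community

print(labels)  # one community per triangle, e.g. [0 0 0 1 1 1] or [1 1 1 0 0 0]
```

The cut falls exactly on the single bridging edge, which is the sparsest place to split this graph. Everything else in the spectral clustering chapter is machinery around why this works and how to do it for k > 2 clusters.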
One caveat: in Chapter 16, where they discuss network evolution, most of the lecture flew over my head. I didn't really understand some of its concepts, like Temporal PageRank, so my notes on it are not great.
Even though I only trained some GNNs on classic graph datasets in this report, I'm planning to work on a small project to finish off this course, and I'll post those details here as well.
If you find any mistakes, please let me know by commenting on the relevant section or at the end of the report.
Here are the Video Lectures.
I found the official class notes, written by the actual class students, here. They aren't linked on the class website, only in one of the class slides; I found them by chance while reviewing the slides. They also seem to be missing some topics, which might be why they aren't linked on the class website. Still, writing my own notes was fun: I had to watch each lecture twice, and now I have a much better understanding. I hope you find these notes useful too.
Chapters
- Introduction, Structure of Graphs
- Properties of Networks and Random Graph Models
- Motifs and Structural Roles in Networks
- Community Structure in Networks
- Spectral Clustering
- Message Passing and Node Classification
- Graph Representation Learning
- Graph Neural Networks
- Graph Neural Networks - Pytorch Geometric
- Deep Generative Models for Graphs
- Link Analysis: PageRank
- Network Effects and Cascading Behaviour
- Probabilistic Contagion and Models of Influence
- Influence Maximization in Networks
- Outbreak Detection in Networks
- Network Evolution
- Reasoning over Knowledge Graphs
- Limitations of Graph Neural Networks
- Applications of Graph Neural Networks
Topics
Node Degree, Complete Graph, Bipartite Graph, Adjacency Matrix, Connected Graph, Disconnected Graph, Directed Graph, Degree Distribution, Path, Geodesic Distance, Diameter, Average Path Length, Clustering Coefficient, MSN Messenger Network, Erdos-Renyi Random Graph Model, G_{np} Random Model, Small-World Model, Kronecker Graphs, Kronecker Product, Stochastic Kronecker Graphs, Subgraphs, Motifs, Graphlets, Automorphism Orbits, Exact Subgraph Enumeration, Roles, RolX, Communities, Granovetter, Zachary's Karate Club, Modularity, Louvain Algorithm, BigCLAM, Affiliation Graph Model (AGM), Spectral Clustering, Graph Partitioning Criterion, Conductance, Spectral Graph Theory, d-regular Graph, Degree Matrix, Laplacian Matrix, λ_2 as an Optimization Problem, Optimal Cut, Rayleigh Theorem, Spectral Partitioning Algorithm, k-Way Spectral Clustering, Motif-Based Spectral Clustering, Node Classification, Homophily, Collective Classifiers, Local Classifier, Relational Classifier, Collective Inference, Probabilistic Relational Classifier, Iterative Classifiers, REV2, Loopy Belief Propagation, Graph Representation Learning, Embedding Nodes, Shallow Encoding, Random Walk Embeddings, Node2Vec, Biased Walks, TransE, Embedding Entire Graphs, Anonymous Walk Embeddings, Graph Neural Networks, Computational Graph, Graph Convolutional Network (GCN), GraphSAGE, Graph Attention Networks (GAT), Tips to Train GNNs, GCN on Zachary's Karate Club, GNN Comparisons on the Cora Citation Network, Graph Generation, GraphRNN, PageRank, Random Walk with Restarts, Personalized PageRank, Cascades, Decision-Based Diffusion Models, Reproductive Number R_0, Epidemic Models, SIR Model, SIS Model, SEIZ Model, Independent Cascade Model, Exposure Curve, Influence Maximization, Linear Threshold Model, Greedy Hill Climbing, Submodular Functions, Sketch-Based Algorithms, Outbreak Detection, CELF, Lazy Hill Climbing, Data-Dependent Bound, Macroscopic Evolution, Densification Power Law,
Forest Fire Model, Temporal Networks, Microscopic Evolution, Temporal PageRank, Knowledge Graphs, Freebase Knowledge Graph, TransE in Knowledge Graphs, TransR in Knowledge Graphs, Path Queries, Conjunctive Queries, Box Embeddings, Query2Box, EPFO Queries, GNN Limitations, Graph Structure Vulnerability, Injectivity, Graph Isomorphism Network (GIN), Weisfeiler-Lehman Graph Isomorphism Test, Noise Vulnerability in Graphs, Nettack, PinSAGE, Decagon, Goal-Directed Graph Generation (GCPN)