5. Spectral Clustering
Introduction and Problem Definition
Here we study the important class of spectral methods for understanding networks. Spectral clustering means we use the eigenvalues and eigenvectors of matrix representations of a graph.
The spectral clustering algorithms we explore generally consist of three basic stages:
Pre-processing
Construct a matrix representation of the graph
Decomposition
Compute the eigenvalues and eigenvectors of the matrix representation, and map each point to a lower-dimensional representation based on one or more eigenvectors.
Grouping
Assign points to two or more clusters, based on the new representation.
Problem Definition
Given an undirected graph G(V, E), we want to divide its vertices into two disjoint groups A and B.
How can we define a good partition of G? And how can we efficiently identify such a partition?
Until now we used criteria like modularity to identify the best partition within a graph. Now we approach the same problem using a different criterion. Null models and modularity are a physics view of the partitioning problem; this will be a computer science view, where we have approximation guarantees on how well our methods work.
Graph Partitioning Criterion
Maximize the number of within-group connections and minimize the number of between-group connections.
Let's define the graph partitioning objective as a function of the edge cut of the partition. The cut is the set of edges with one endpoint in each group:
\displaystyle cut(A, B) = \sum_{i\in A, j \in B}w_{ij}
If we use the minimum cut \argmin_{A, B} cut(A, B) as the criterion, there are degenerate cases where a dangling node ends up as its own group even though much better partitions of the graph exist. The minimum cut only considers external cluster connections and ignores internal cluster connectivity.
Conductance [Shi-Malik '97]
Conductance is defined as the connectivity between the groups relative to the density of each group:
\displaystyle \phi(A, B) = \frac{cut(A, B)}{min(vol(A), vol(B))}
The volume vol(A) is the total weighted degree of the nodes in group A. We want the conductance to be as small as possible, and it is small when the volumes of both groups are as large as possible; taking the minimum of the two volumes forces both groups to be reasonably big. Intuitively, we want to split the graph into two similarly sized groups such that the cut is small. Conductance produces much more balanced partitions than using the cut alone, but computing the best conductance cut is NP-hard.
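As a concrete illustration, here is a minimal sketch (my own toy example, not from the lecture) of computing the cut and the conductance of a partition with networkx; the barbell graph and the choice of group A are assumptions made purely for illustration.

```python
# Toy example: cut and conductance of a partition of a barbell graph.
import networkx as nx

G = nx.barbell_graph(5, 0)                 # two 5-cliques joined by a single edge
A = set(range(5))                          # hypothetical partition: first clique
B = set(G.nodes()) - A                     # versus everything else

cut = sum(1 for u, v in G.edges() if (u in A) != (v in A))
vol = lambda S: sum(G.degree(n) for n in S)     # volume = total degree of a group
phi = cut / min(vol(A), vol(B))                 # conductance

print(cut, vol(A), vol(B), round(phi, 3))       # 1 21 21 0.048
```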
Spectral Graph Theory
Spectral graph partitioning gives us an approximation guarantee for finding the cut that optimizes conductance.
Let A be the adjacency matrix of an undirected graph G and x a vector in \mathcal{R}^n with components (x_1, \cdots, x_n). We can think of x as assigning a label or value to each node of G: every node i has its own corresponding x_i.
Now consider y = A.x:
\displaystyle \begin{pmatrix} a_{11} & \cdots & a_{1n}\\ \vdots & \ddots & \vdots\\ a_{n1} & \cdots & a_{nn}\end{pmatrix}. \begin{pmatrix} x_1\\ \vdots\\ x_n\end{pmatrix} = \begin{pmatrix} y_1\\ \vdots\\ y_n\end{pmatrix}
\displaystyle y_i = \sum_{j=1}^{n} A_{ij}x_j = \sum_{(i,j) \in E} x_j
When we multiply the adjacency matrix with x, we get another vector y whose element at position i is the sum of the x values of all nodes that have an edge to node i. Entry y_i is the sum of the labels x_j of the neighbors of i.
Eigenvectors and eigenvalues of a matrix A are defined as the solutions of the equation A.x = \lambda.x, where x is an eigenvector and \lambda the corresponding eigenvalue. Multiplying a matrix with a generic vector both scales and rotates it; multiplying a matrix with one of its eigenvectors only scales the vector, by the eigenvalue, and does not rotate it. For any matrix there are some favoured vectors/directions, and when the matrix acts upon these favoured vectors it just scales them with no rotation. If we take the eigenvector corresponding to the maximum eigenvalue of A, no other vector gets stretched by A as much as this eigenvector.
Now consider an eigenvector of our adjacency matrix and its corresponding eigenvalue, with each entry of the eigenvector labeling a node. When we sum the x values of the neighbours of node i we get a scaled version of x_i: y_i is the eigenvalue times x_i, even though it is just the sum of the neighbouring x values of node i.
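A small numpy sketch (my own triangle-graph example, not from the notes) makes both points concrete: multiplying A by a label vector sums the neighbours' labels, and multiplying A by one of its eigenvectors only rescales it.

```python
import numpy as np

A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)      # triangle graph: every node has degree 2

x = np.array([1.0, 2.0, 3.0])               # arbitrary node labels
print(A @ x)                                # [5. 4. 3.]: each entry sums the neighbours' labels

vals, vecs = np.linalg.eigh(A)              # symmetric matrix -> real spectrum, ascending order
v = vecs[:, -1]                             # eigenvector of the largest eigenvalue (here 2)
print(np.allclose(A @ v, vals[-1] * v))     # True: A only scales v by its eigenvalue
```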
Spectral graph theory is a branch of applied mathematics that analyses the spectrum of a matrix representing G. The spectrum is the set of eigenvectors x^{(i)} of the graph, ordered by the magnitude of their corresponding eigenvalues \Lambda = \lbrace\lambda_1,\lambda_2,\cdots\lambda_n\rbrace where \lambda_1 \leq \lambda_2 \leq \cdots \leq \lambda_n.
Example: d-Regular Graph.
A connected graph G in which every node has degree d is called d-regular.
Consider the vector x = [1,1,\cdots1]. If we multiply the adjacency matrix of a d-regular graph with the vector x we get d.x: since the graph is d-regular, every node has d neighbours, and (A.x)_i is just the sum of the x values of node i's neighbours. Because multiplying A with x gives back x scaled by d, x is an eigenvector of A and d the corresponding eigenvalue.
x = [1,1,\cdots1]\\ A.x = [d,d,\cdots d] = \lambda .x\\ \text{with } \lambda = d
It turns out that for a d-regular graph, d is the largest eigenvalue and [1,1,\cdots1] is the corresponding eigenvector.
TODO: proof of the above statement in pdf
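A quick numerical check on my own toy example (a ring of six nodes, which is 2-regular):

```python
import networkx as nx
import numpy as np

A = nx.to_numpy_array(nx.cycle_graph(6))    # ring graph: every node has degree d = 2
vals, _ = np.linalg.eigh(A)

print(vals[-1])                             # largest eigenvalue: 2.0, i.e. d
print(A @ np.ones(6))                       # the all-ones vector is just scaled by d
```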
Example: Graph on 2 Components
Consider a graph G with two disconnected d-regular components. What are some of its eigenvectors?
Suppose A and B are the components, and consider vectors x that are all 1s on one component and all 0s on the other.
x' = [\underbrace{1,\cdots,1}_{|A|},\underbrace{0,\cdots,0}_{|B|}]^T \text{ then } A.x' = [d,\cdots,d,0,\cdots,0]^T\\ x'' = [0,\cdots,0,1,\cdots,1]^T \text{ then } A.x'' = [0,\cdots,0,d,\cdots,d]^T
Both vectors are eigenvectors with eigenvalue \lambda = d.
Hence for a graph with two disconnected d-regular components \lambda_n = \lambda_{n-1}: the two largest eigenvalues are the same. We found an eigenvalue of multiplicity 2, meaning two eigenvectors correspond to the same eigenvalue. Intuitively, if a graph has multiple connected components, the multiplicity of the largest eigenvalue tells us how many connected components it has.
From this intuition, consider a graph that almost has two separate components: the largest eigenvalue and the second-largest eigenvalue will be very close to each other, \lambda_n - \lambda_{n-1} \approx 0. The eigenvector of the second-largest eigenvalue will tell us which nodes should be in component A and which in component B.
Eigenvectors of a symmetric matrix are orthogonal, hence if the d-regular graph is connected we already know that x_n = [1,\cdots,1] is the eigenvector corresponding to the largest eigenvalue. This implies the entries of x_{n-1} must sum to zero, because x_n.x_{n-1} = 0 \rarr \sum_i x_n[i].x_{n-1}[i] = \sum_i x_{n-1}[i] = 0.
x_{n-1} "splits" the nodes into two groups: nodes with x_{n-1} > 0 and nodes with x_{n-1} < 0. So in principle we can look at the eigenvector of the 2nd-largest eigenvalue and declare nodes with a positive label to be in A and nodes with a negative label to be in B.
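A hedged sketch (toy example of mine): two cliques joined by a single edge are "almost" two components, and the sign of the adjacency eigenvector belonging to the second-largest eigenvalue separates them.

```python
import networkx as nx
import numpy as np

G = nx.barbell_graph(6, 0)                  # two 6-cliques joined by one edge
A = nx.to_numpy_array(G)
vals, vecs = np.linalg.eigh(A)              # eigenvalues in ascending order

print(vals[-1], vals[-2])                   # two largest eigenvalues: close together, far from the rest
print(vecs[:, -2] > 0)                      # sign split: one clique True, the other False
```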
Matrix Representations
Adjacency matrix (A)
A has size n\times n with a_{ij}=1 if there is an edge between i and j. For an undirected graph A is symmetric, has n real eigenvalues, and its eigenvectors are real-valued and orthogonal.
A = \begin{pmatrix} 0 & 1 & 1 & 0 & 1 & 0\\ 1 & 0 & 1 & 0 & 0 & 0\\ 1 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & 1\\ 1 & 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 1 & 1 & 0\end{pmatrix}
Degree matrix (D)
A diagonal matrix of size n\times n, D = [d_{ii}], where d_{ii} = degree of node i.
D = \begin{pmatrix} 3 & 0 & 0 & 0 & 0 & 0\\ 0 & 2 & 0 & 0 & 0 & 0\\ 0 & 0 & 3 & 0 & 0 & 0\\ 0 & 0 & 0 & 3 & 0 & 0\\ 0 & 0 & 0 & 0 & 3 & 0\\ 0 & 0 & 0 & 0 & 0 & 2\end{pmatrix}
Laplacian matrix (L)
The Laplacian matrix is defined as L = D - A, the degree matrix minus the adjacency matrix: -1 wherever there is an edge and the node degrees on the diagonal. It is an n\times n symmetric matrix.
L = \begin{pmatrix} \phantom{-}3 & -1 & -1 & \phantom{-}0 & -1 & \phantom{-}0\\ -1 & \phantom{-}2 & -1 & \phantom{-}0 & \phantom{-}0 & \phantom{-}0\\ -1 & -1 & \phantom{-}3 & -1 & \phantom{-}0 & \phantom{-}0\\ \phantom{-}0 & \phantom{-}0 & -1 & \phantom{-}3 & -1 & -1\\ -1 & \phantom{-}0 & \phantom{-}0 & -1 & \phantom{-}3 & -1\\ \phantom{-}0 & \phantom{-}0 & \phantom{-}0 & -1 & -1 & \phantom{-}2\end{pmatrix}
The trivial eigenpair of a Laplacian matrix is x = [1,\cdots,1]: then L.x = 0 and so \lambda = \lambda_1 = 0.
So the smallest eigenvalue of L is 0, unlike for the adjacency matrix, where the eigenvalue corresponding to [1,\cdots,1] is the largest.
Also, the eigenvalues of L are non-negative real numbers, and its eigenvectors are real and always orthogonal.
TODO: Proof for the laplacian matrix properties above mentioned is in the pdf.
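A numerical sanity check of these properties on a toy graph (my own example; any connected undirected graph would do):

```python
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
A = nx.to_numpy_array(G)
L = np.diag(A.sum(axis=1)) - A                     # L = D - A

vals, vecs = np.linalg.eigh(L)                     # real eigenvalues, ascending order
n = len(G)
print(np.allclose(L @ np.ones(n), 0))              # True: L.x = 0 for x = [1,...,1]
print(round(vals[0], 10), (vals >= -1e-9).all())   # smallest eigenvalue is 0; all non-negative
print(np.allclose(vecs.T @ vecs, np.eye(n)))       # True: eigenvectors are orthonormal
```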
\lambda_2 as an Optimization Problem
For any symmetric matrix M:
\displaystyle \lambda_2 = \min_{x: x^Tw_1 = 0} \frac{x^TMx}{x^Tx} where w_1 is the eigenvector corresponding to \lambda_1
TODO: Proof of this is in the pdf
In our case M is L. To figure out the meaning of \min x^TLx on G:
x^TLx = \sum_{i,j=1}^n L_{ij}x_ix_j \\= \sum_{i,j=1}^n(D_{ij} - A_{ij})x_ix_j \\= \sum_iD_{ii}x_i^2 - \sum_{(i,j) \in E}2x_ix_j \\= \sum_{(i, j) \in E}(x_i^2 + x_j^2 - 2x_ix_j) \\= \sum_{(i,j)\in E}(x_i - x_j)^2
Since eigenvectors are unit vectors, we have \sum_ix_i^2 = 1,
and x is orthogonal to the 1st eigenvector [1,\cdots,1], thus: \sum_ix_i.1 = \sum_ix_i = 0.
From the above 3 equations we have
\displaystyle \lambda_2 = \min_{x: \sum x_i=0} \frac{\sum_{(i, j) \in E}(x_i - x_j)^2}{\sum_ix_i^2} = \min_{x: \sum x_i=0} \sum_{(i, j) \in E}(x_i - x_j)^2
We want to assign values x_i to nodes i such that few edges cross 0, i.e. such that x_i and x_j for connected nodes cancel each other. From the equation for \lambda_2 above, the node labels must sum to zero, so intuitively roughly half of the nodes will be in one cluster and the other half in the other cluster. To minimize the term \sum_{(i, j) \in E}(x_i - x_j)^2, the edges crossing between the clusters have to be as few as possible; edges within a cluster contribute only a small value to \sum_{(i, j) \in E}(x_i - x_j)^2 anyway.
The image below is for intuition: node labels from the second eigenvector are on the x-axis, with the corresponding nodes in red. The edges between the two clusters have to be as few as possible to minimize the term \sum_{(i, j) \in E}(x_i - x_j)^2.
The eigenvector that corresponds to \lambda_2 is our magical vector that puts each node in one cluster or the other.
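The key identity in the derivation, x^TLx = \sum_{(i,j)\in E}(x_i - x_j)^2, is easy to verify numerically (toy example of mine):

```python
import numpy as np
import networkx as nx

G = nx.erdos_renyi_graph(8, 0.4, seed=1)
A = nx.to_numpy_array(G)
L = np.diag(A.sum(axis=1)) - A

x = np.random.default_rng(0).normal(size=len(G))            # arbitrary node labels
quad_form = x @ L @ x
edge_sum = sum((x[i] - x[j]) ** 2 for i, j in G.edges())
print(np.isclose(quad_form, edge_sum))                      # True
```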
Find Optimal Cut [Fiedler '73]
This is an optimal-cut formulation from 1973, where we express the partition of the graph into two components A and B as a vector y as below:
\displaystyle y_i = \begin{cases} +1 & \text{ if } i \in A\\ -1 & \text{ if } i \in B \end{cases}
We also enforce that |A| = |B| \rarr \sum_iy_i = 0, which is equivalent to y being orthogonal to the trivial eigenvector [1,\cdots,1].
Fiedler showed that we can minimize the cut of the partition by finding a vector y that minimizes:
\displaystyle \argmin_{y\in\lbrace -1, +1\rbrace^n}f(y) = \sum_{(i,j) \in E}(y_i - y_j)^2
This is almost the same objective we got when using \lambda_2 for clustering into components. The only difference is that y here can only take the values +1 or -1, unlike in the eigenvalue case, where it can be any real vector satisfying the constraints mentioned above.
Rayleigh Theorem
\displaystyle \min_{y\in \Re^n; \sum_iy_i = 0; \sum_iy_i^2=1}f(y) = \sum_{(i, j)\in E}(y_i - y_j)^2 = y^TLy
Rayleigh Theorem states that
- \lambda_2 = \min_yf(y): the minimum value of f(y) is given by the second-smallest eigenvalue \lambda_2 of the Laplacian matrix L.
- x = \argmin_yf(y): the optimal solution for y is given by the eigenvector x corresponding to \lambda_2, referred to as the Fiedler vector.
- We can use the sign of x_i to determine the cluster assignment of node i.
One can prove that this type of spectral clustering on the graph Laplacian gives a near-optimal solution. There is an approximation guarantee: spectral clustering finds a cut that has at most twice the conductance of the optimal cut of conductance \beta.
Suppose there is a partition of G into A and B where |A| \leq |B|, such that the conductance of the cut (A, B) is \beta; then \lambda_2 \leq \beta.
TODO: The proof for this is in the PDF
Solving for \lambda_2 is essentially minimizing the sum of squared differences between the coordinates (eigenvector labels) of adjacent nodes, under the conditions that the labels sum to zero and their squares sum to one.
Spectral Partitioning Algorithm
Preprocessing
Build the Laplacian matrix L of the graph from its adjacency matrix.
Eigenvalue decomposition
Find the eigenvalues \lambda and eigenvectors x of L. Map each vertex to its corresponding component of x_2.
Grouping
Sort the components of the reduced 1-dimensional vector of node labels and identify clusters by splitting the sorted vector in two. You can choose the splitting point at the median or simply at 0, or you can minimize the normalized cut by sweeping over the ordering of nodes induced by the eigenvector (a small end-to-end sketch follows the figure notes below).
In the above image, after you build the Laplacian matrix of a graph with two obvious clusters, compute x_2, assign its entries to the respective nodes, sort them, and plot the rank of each node on the x-axis against its x_2 value on the y-axis, you can see the clusters emerging.
In the above image, if you do spectral clustering on a graph with 4 obvious clusters and make the same plot of the second eigenvector x_2, you see a 4-step pattern emerge.
In the above image we show the corresponding plots for the 1st and 3rd eigenvector components of the same graph.
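Putting the three steps together, here is a hedged end-to-end sketch on a toy graph with two planted clusters (the graph, parameters and sweep below are my own illustration, not the lecture's code):

```python
import numpy as np
import networkx as nx

G = nx.planted_partition_graph(2, 20, p_in=0.5, p_out=0.05, seed=7)
A = nx.to_numpy_array(G)
L = np.diag(A.sum(axis=1)) - A                     # 1) preprocessing: Laplacian

vals, vecs = np.linalg.eigh(L)                     # 2) decomposition
x2 = vecs[:, 1]                                    # Fiedler vector (second-smallest eigenvalue)

labels_by_sign = x2 > 0                            # 3a) grouping: split at 0

# 3b) grouping by sweep: order nodes by x2 and keep the prefix of lowest conductance.
order = np.argsort(x2)

def conductance(prefix):
    S = set(prefix.tolist())
    cut = sum(1 for u, v in G.edges() if (u in S) != (v in S))
    vol_S = sum(G.degree(n) for n in S)
    vol_rest = sum(d for _, d in G.degree()) - vol_S
    return cut / min(vol_S, vol_rest)

best_r = min(range(1, len(G)), key=lambda r: conductance(order[:r]))
print(sorted(order[:best_r].tolist()))             # should roughly recover one planted block
```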
k-Way Spectral Clustering
Given a graph, how do we make use of the eigen-components of the graph Laplacian to produce k clusters?
There are two approaches:
- Recursive bi-partitioning [Hagen '92]
- Cluster multiple eigenvectors [Shi-Malik '00]
Recursive Bi-partitioning
Recursive bi-partitioning partitions the graph into two components using the second eigenvector x_2, then treats each cluster as its own graph, finds its second eigenvector, and repeats. This is somewhat inefficient and unstable, and the cluster count has to be a power of 2.
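A rough sketch of recursive bi-partitioning under my own assumptions (split by the sign of the Fiedler vector and recurse to a fixed depth, so the cluster count is a power of 2):

```python
import numpy as np
import networkx as nx

def fiedler_split(G):
    nodes = list(G.nodes())
    A = nx.to_numpy_array(G, nodelist=nodes)
    L = np.diag(A.sum(axis=1)) - A
    x2 = np.linalg.eigh(L)[1][:, 1]                       # Fiedler vector of this subgraph
    left = [n for n, v in zip(nodes, x2) if v < 0]
    right = [n for n, v in zip(nodes, x2) if v >= 0]
    return left, right

def recursive_bipartition(G, depth):
    if depth == 0 or G.number_of_nodes() < 2:
        return [list(G.nodes())]
    left, right = fiedler_split(G)
    return (recursive_bipartition(G.subgraph(left), depth - 1) +
            recursive_bipartition(G.subgraph(right), depth - 1))

G = nx.planted_partition_graph(4, 10, p_in=0.6, p_out=0.02, seed=3)
print(recursive_bipartition(G, depth=2))                  # 2**2 = 4 clusters
```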
Cluster multiple eigenvectors
Clustering multiple eigenvectors involves building a reduced space from several eigenvectors: each node is represented by its components in the first k eigenvectors, and we then cluster the nodes based on this k-dimensional representation, e.g. with k-means. This is the recommended approach and more common in recent papers, and it can be used to approximate the optimal k-way normalized cut. It emphasizes cohesive clusters and increases the unevenness in the distribution of the data: associations between similar points are amplified, associations between dissimilar points are attenuated, and the data begins to "approximate a clustering". Multiple eigenvectors also prevent instability due to information loss (a short sketch combining this with the eigengap heuristic follows below).
We make use of the eigengap, the difference between two consecutive eigenvalues, to choose k. The most stable clustering is generally given by the value k that maximizes the eigengap \Delta_k = |\lambda_k - \lambda_{k-1}|: plot the eigenvalues and pick the k at which the plot shows a clear jump.
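A hedged sketch combining the two ideas: embed each node with the first k Laplacian eigenvectors, pick k by the largest eigengap, and cluster with k-means (scikit-learn's KMeans and the toy planted-partition graph are my own assumptions here):

```python
import numpy as np
import networkx as nx
from sklearn.cluster import KMeans

G = nx.planted_partition_graph(4, 15, p_in=0.6, p_out=0.02, seed=11)
A = nx.to_numpy_array(G)
L = np.diag(A.sum(axis=1)) - A

vals, vecs = np.linalg.eigh(L)                     # ascending eigenvalues

gaps = np.diff(vals[:10])                          # eigengaps among the smallest eigenvalues
k = int(np.argmax(gaps)) + 1                       # number of small eigenvalues before the big jump
print("chosen k:", k)                              # expect 4 for this toy graph

embedding = vecs[:, :k]                            # each node -> a k-dimensional point
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(embedding)
print(labels)
```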
Motif-Based Spectral Clustering
Can we cluster the network based on motifs? We want to detect communities based on network motifs: given a motif M and a graph G, we identify clusters of the network in which M occurs frequently. Different motifs reveal different clustering structures.
We need analogues of cut, volume and conductance for motifs; they are defined as follows:
Motif cut: the number of motif instances that get cut, i.e. that have endpoints on both sides of the partition.
Motif volume vol_M(S): the number of motif endpoints that fall inside the set S.
Motif conductance \phi(S) = \frac{\text{motifs cut}}{vol_M(S)}
In the image above the motif cut is 1 and the volume is 10: counting motif endpoints inside S, the three uncut triangles contribute nine endpoints and the cut motif has one endpoint in S.
Just as finding the optimal cut that minimizes edge conductance is NP-hard, finding a cut that optimizes motif conductance is also NP-hard. But we can use similar spectral clustering techniques with slight changes to get an approximately optimal motif clustering.
Motif Spectral Clustering
Given a motif M and a graph G, find a set of nodes S that minimizes the motif conductance.
Preprocessing:
From the graph G and the motif M we build a new weighted graph W^{(M)}, where W_{ij}^{(M)} is the number of times edge (i, j) participates in an instance of the motif M.
In the above picture, the edge weight between nodes 8 and 9 in the weighted graph is 2, because in the graph G the edge (8, 9) participates in two instances of the motif. The edge weight between nodes 7 and 9 is zero, as there are no instances of the edge (9, 7) participating in the motif M. The weighted graph we build is also an undirected graph.
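A hedged sketch for the triangle motif specifically (my own choice of motif and graph): for triangles, (A^2)_{ij} counts the common neighbours of i and j, so restricting it to existing edges gives W^{(M)}.

```python
import numpy as np
import networkx as nx

G = nx.les_miserables_graph()
nodes = list(G.nodes())
A = (nx.to_numpy_array(G, nodelist=nodes) > 0).astype(float)   # binary adjacency

W = (A @ A) * A              # W_ij = number of triangles that edge (i, j) participates in

# Sanity check: every triangle is counted on each of its three (undirected) edges,
# and W is symmetric, so W.sum() = 6 * (number of triangles).
print(int(W.sum() / 6), sum(nx.triangles(G).values()) // 3)    # the two counts agree
```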
Decomposition
We apply standard spectral clustering to the weighted graph W^{(M)}; this finds clusters of low motif conductance.
Compute the Fiedler vector x associated with \lambda_2 of the Laplacian L^{(M)}, where
L^{(M)} = D^{(M)} - W^{(M)}
Compute: L^{(M)}x = \lambda_2x
Use x to identify communities.
Grouping
We sort the nodes by their values in x, let S_r = \lbrace x_1, \cdots, x_r \rbrace, and compute the motif conductance of each S_r. We select the optimal cut by this sweeping procedure. In the above image we see that S_5 had the optimal conductance, so we finalize the cut to be the nodes \lbrace 1\cdots 5\rbrace.
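A hedged sketch of the whole motif pipeline for the triangle motif (my own toy setup), using the weighted cut and volume of W^{(M)} as a stand-in for motif cut and volume during the sweep:

```python
import numpy as np
import networkx as nx

# Toy graph: two 6-cliques joined by edges (5, 6) and (5, 7), so that even the
# bridging edges participate in a triangle and W^(M) stays connected.
G = nx.barbell_graph(6, 0)
G.add_edge(5, 7)

A = (nx.to_numpy_array(G) > 0).astype(float)
W = (A @ A) * A                                    # triangle-motif edge weights
L_M = np.diag(W.sum(axis=1)) - W                   # motif Laplacian L^(M) = D^(M) - W^(M)

x = np.linalg.eigh(L_M)[1][:, 1]                   # Fiedler vector of L^(M)
order = np.argsort(x)                              # sweep over this ordering

def motif_conductance(prefix):                     # weighted conductance on W^(M)
    S = set(prefix.tolist())
    cut = sum(W[i, j] for i in S for j in range(len(W)) if j not in S)
    vol_S = W[list(S)].sum()
    vol_min = min(vol_S, W.sum() - vol_S)
    return cut / vol_min if vol_min > 0 else np.inf

best_r = min(range(1, len(W)), key=lambda r: motif_conductance(order[:r]))
print(sorted(order[:best_r].tolist()))             # essentially one of the two cliques
```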
We have thus generalized community detection to higher-order structures. Motif conductance admits a motif Cheeger inequality, and the method is simple, fast and scalable.
Other partitioning methods
METIS
Heuristic but works really well in practice.
Graclus
Based on kernel k-means
Louvain
Based on Modularity Optimization
Clique percolation method
For finding overlapping clusters