Vivek Gopalakrishnan

Graph Cuts, Graph Laplacians, and λ2

2022-01-31 | Another interesting eigenvalue application!

Defining the Graph Cut Problem

Let $G = (V, E)$ be a graph with $|V| = n$. The goal of the graph cut problem is to find a subset of the vertex set $S \subset V$ that minimizes the cut ratio:[1]

$$ \phi(S) = \frac{e(S)}{\min\{|S|, |S^c|\}} \,, $$

where $e(S) = e(S^c)$ is the number of edges connecting $S$ and $S^c$.

[1] The cut ratio can be thought of as quantifying the amount of information that is lost when a cut is performed. Note that there are many different cost functions one can use to define the graph cut problem. For example, if $|S|$ is replaced with $d(S)$ (the total degree of the nodes in $S$), the resulting quantity is called the conductance of the cut.
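For concreteness, here is a small worked example. Take the path graph on four vertices with edges $\{1,2\}, \{2,3\}, \{3,4\}$. The unbalanced cut $S = \{1\}$ severs one edge and has $\phi(S) = \frac{1}{\min\{1, 3\}} = 1$, while the balanced cut $S = \{1, 2\}$ also severs one edge but has $\phi(S) = \frac{1}{\min\{2, 2\}} = \frac{1}{2}$. The denominator is what rewards cuts that split the graph into pieces of comparable size.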
We can encode which subset each vertex belongs to with a binary vector $x \in \{-1, +1\}^n$. Then $e(S) = \frac{1}{4} \sum_{(i,j) \in E} (x_i - x_j)^2$, since each edge crossing the cut contributes $(\pm 2)^2 = 4$ to the sum and every other edge contributes $0$ (see the short check below). With this binary representation, the graph cut problem essentially becomes an integer programming problem! However, this problem is NP-hard,[2] so solving it requires a few relaxations... Before doing so, it is useful to first define the graph Laplacian and prove a few of its properties.

[2] See here for more details.
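Here is that check, as a minimal numpy sketch (the example graph and the cut below are arbitrary illustrative choices):

```python
import numpy as np

# A small example graph: a 4-cycle with one chord, given as an edge list.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# Encode the cut S = {0, 1} as x_i = +1 for i in S and x_i = -1 otherwise.
x = np.array([+1, +1, -1, -1])

# Count the edges crossing the cut directly...
e_S = sum(1 for i, j in edges if x[i] != x[j])

# ...and via the squared-difference formula e(S) = (1/4) * sum over edges of (x_i - x_j)^2.
e_S_formula = sum((x[i] - x[j]) ** 2 for i, j in edges) / 4

print(e_S, e_S_formula)  # both equal 3
```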

Spectral Properties of the Graph Laplacian

In this context, spectral refers to the spectrum of eigenvalues. To get eigenvalues, we need a matrix. We can represent a graph as a matrix using the adjacency matrix $A \in M_n(\{0, 1\})$ and the degree matrix $D \in M_n(\{0, 1, \dots, n-1\})$.

Definition 1 (The Graph Laplacian). The graph Laplacian is a matrix representation of the graph $G$. It is written as $L_G = D - A$, where $D$ is the degree matrix and $A$ is the adjacency matrix of the graph. Since we assume $G$ is undirected, $A$ is a symmetric matrix, and therefore, so is $L_G$.
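To make Definition 1 concrete, here is a minimal numpy sketch that builds $L_G = D - A$ from an edge list (graph_laplacian is a helper defined here purely for illustration):

```python
import numpy as np

def graph_laplacian(n, edges):
    """Build L = D - A for an undirected graph on n vertices."""
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1        # adjacency matrix (symmetric)
    D = np.diag(A.sum(axis=1))       # degree matrix
    return D - A

# The same 4-cycle-plus-chord example graph as above.
L_G = graph_laplacian(4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])
print(L_G)
print(np.allclose(L_G, L_G.T))       # symmetric, as expected
```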

Theorem 1. A graph Laplacian has eigenvector $\vec 1$ with eigenvalue 0.
Proof: $L_G \vec 1 = (D - A) \vec 1 = D \vec 1 - A \vec 1 = \vec 0$, since both $D \vec 1$ and $A \vec 1$ equal the vector of vertex degrees.

Corollary 1. For the complete graph $K$ on $n$ vertices, any vector perpendicular to $\vec 1$ is also an eigenvector of the graph Laplacian, with eigenvalue $n$.
Proof: Since the graph is complete, the graph Laplacian can be written as $L_K = (n-1)I - (\mathbf 1 - I) = nI - \mathbf 1$, where $\mathbf 1$ is the all-ones matrix. If $v \perp \vec 1$, then $L_K v = nIv - \cancel{\mathbf 1 v} = nv$. This also proves that the eigenvalue $n$ has algebraic multiplicity $n-1$.
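Both facts are easy to check numerically. A quick sketch for $K_5$ (the choice of $n = 5$ is arbitrary):

```python
import numpy as np

n = 5
# Laplacian of the complete graph K_5: L_K = nI - 1, where 1 is the all-ones matrix.
L_K = n * np.eye(n) - np.ones((n, n))

# Theorem 1: the all-ones vector is an eigenvector with eigenvalue 0.
print(np.allclose(L_K @ np.ones(n), 0))   # True

# Corollary 1: the remaining eigenvalues all equal n, so n has multiplicity n - 1.
print(np.linalg.eigvalsh(L_K).round(6))   # eigenvalue 0 once, eigenvalue 5 four times
```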

Relaxing the Graph Cut Problem

First, note that

$$\begin{aligned} |S| |S^c| &= \left(\sum_{i \in V} \mathbb{1}(x_i = +1)\right) \left(\sum_{j \in V} \mathbb{1}(x_j = -1)\right) \\ &= \sum_{i, j \in V} \mathbb{1}(x_i = +1)\,\mathbb{1}(x_j = -1) \\ &= \frac{1}{2} \sum_{i, j \in V} \mathbb{1}(x_i \neq x_j) \\ &= \frac{1}{4} \sum_{i < j} (x_i - x_j)^2 \,, \end{aligned}$$

allowing us to replace the cut ratio with the closely related quantity[3]

$$ \phi'(S) = \frac{e(S)}{|S||S^c|} = \frac{\sum_{(i,j) \in E} (x_i - x_j)^2}{\sum_{i < j} (x_i - x_j)^2} \,. $$

[3] $\sum_{i < j} = \sum_{j=1}^n \sum_{i=1}^{j-1} \,.$
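Both the counting identity and the resulting expression for $\phi'$ can be verified numerically. A minimal sketch, reusing the same illustrative graph and cut as before:

```python
import itertools
import numpy as np

edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # 4-cycle plus a chord
x = np.array([+1, +1, -1, -1])                     # the cut S = {0, 1}
n = 4

S_size = int(np.sum(x == +1))
Sc_size = int(np.sum(x == -1))

# |S||S^c| equals one quarter of the sum of squared differences over all pairs.
sum_all_pairs = sum((x[i] - x[j]) ** 2 for i, j in itertools.combinations(range(n), 2))
print(S_size * Sc_size, sum_all_pairs / 4)          # 4 and 4.0

# phi'(S) computed directly and from the squared-difference form.
e_S = sum(1 for i, j in edges if x[i] != x[j])
sum_edges = sum((x[i] - x[j]) ** 2 for i, j in edges)
print(e_S / (S_size * Sc_size), sum_edges / sum_all_pairs)   # both 0.75
```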
One very interesting and surprising fact is that the sums of squared pairwise differences in the numerator and denominator of $\phi'$ can be written in terms of matrix products with the graph Laplacian.[4] Specifically, we can write

$$ \sum_{(i,j) \in E} (x_i - x_j)^2 = x^T L_G x \quad\text{and}\quad \sum_{i < j} (x_i - x_j)^2 = x^T L_K x \,, $$

where $L_G$ is the graph Laplacian of $G$ and $L_K$ is the graph Laplacian of the complete graph on $V$.[5]

[4] See Definition 1 in this section.
[5] To prove this, we can use the fact that every Laplacian can be written as a sum of edge Laplacians, $L_G = \sum_{(i, j) \in E} L_{ij} = \sum_{(i, j) \in E} (e_i - e_j)(e_i - e_j)^T$. Then $$x^T L_G x = \sum_{(i, j) \in E} x^T(e_i - e_j)(e_i - e_j)^T x = \sum_{(i, j) \in E} (x_i - x_j)^2 \,.$$ Note that this also proves that the Laplacian is positive semi-definite, since the quadratic form is a sum of squares.
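These two identities are also easy to confirm numerically. A short sketch (graph_laplacian simply rebuilds $L = D - A$ as in the earlier snippet, and the graph is again an arbitrary example):

```python
import itertools
import numpy as np

def graph_laplacian(n, edges):
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return np.diag(A.sum(axis=1)) - A

n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
x = np.array([+1.0, +1.0, -1.0, -1.0])

L_G = graph_laplacian(n, edges)                                       # Laplacian of G
L_K = graph_laplacian(n, list(itertools.combinations(range(n), 2)))   # Laplacian of K_4

print(x @ L_G @ x, sum((x[i] - x[j]) ** 2 for i, j in edges))         # both 12
print(x @ L_K @ x, sum((x[i] - x[j]) ** 2
                       for i, j in itertools.combinations(range(n), 2)))  # both 16
```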
This re-expression, along with the fact that $\min\{|S|, |S^c|\} \geq \frac{|S| |S^c|}{n} \geq \frac{1}{2} \min\{|S|, |S^c|\}$, lets us sandwich the re-expressed cut ratio between multiples of the original one:

$$ \phi(S) \leq n \, \phi'(S) \leq 2 \phi(S) \,. $$
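This sandwich bound can be checked by brute force on a small graph. A sketch that enumerates every nontrivial cut of the same example graph and asserts the inequality:

```python
import itertools

n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

for size in range(1, n):                          # every nontrivial subset S of V
    for S in itertools.combinations(range(n), size):
        S = set(S)
        e_S = sum(1 for i, j in edges if (i in S) != (j in S))
        phi = e_S / min(len(S), n - len(S))
        phi_prime = e_S / (len(S) * (n - len(S)))
        assert phi <= n * phi_prime <= 2 * phi    # phi(S) <= n*phi'(S) <= 2*phi(S)

print("bound holds for every cut of this graph")
```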

While replacing the $\min$ in the denominator gives $\phi'$ a much nicer algebraic form, the minimization problem is still NP-hard because $x$ is restricted to the discrete set $\{-1, +1\}^n$. We can further relax the problem by letting $x \in \mathbb{R}^n$, since we can always threshold the entries of $x$ to recover a binary vector. Note that, WLOG, we can also assume $\sum_{i = 1}^n x_i = 0$, since shifting every entry of $x$ by the same constant leaves the differences $x_i - x_j$, and hence the objective, unchanged.[6] Therefore, our final optimization problem is

$$ \min_{x \in \mathbb R^n,\, x \perp \vec 1} \frac{x^T L_G x}{x^T L_K x} \,. $$

To solve this elegantly, we will appeal to classical properties of the graph Laplacian.

[6] This innocuous constraint also buys us $x \perp \vec 1$: writing the sum as a dot product, $\sum_{i=1}^n x_i = \vec 1^T x = 0$.
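Both claims in this relaxation step (that the objective only depends on the differences $x_i - x_j$, so shifting $x$ changes nothing, and that centering $x$ makes it orthogonal to $\vec 1$) can be seen in a few lines of numpy. A minimal sketch with an arbitrary real-valued $x$:

```python
import numpy as np

def graph_laplacian(n, edges):
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return np.diag(A.sum(axis=1)) - A

L_G = graph_laplacian(4, [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)])

x = np.array([3.0, -1.0, 2.0, 5.0])      # an arbitrary real-valued labeling
shifted = x + 10.0                       # shift every entry by the same constant

# The quadratic form only involves differences x_i - x_j, so it is shift-invariant.
print(x @ L_G @ x, shifted @ L_G @ shifted)        # equal

# Centering x enforces sum(x) = 0, i.e., x is perpendicular to the all-ones vector.
centered = x - x.mean()
print(np.isclose(centered @ np.ones(4), 0.0))      # True
```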

Solving the Graph Cut Relaxation

Since we can assume that $x \perp \vec 1$, Corollary 1 allows us to rewrite our optimization problem as

$$ \min_{x \in \mathbb R^n,\, x \perp \vec 1} \frac{x^T L_G x}{x^T L_K x} = \frac{1}{n} \min_{x \in \mathbb R^n,\, x \perp \vec 1} \frac{x^T L_G x}{x^T x} \,. $$

This constrained ratio of quadratic forms (a Rayleigh quotient) appears in many optimization problems, and since the graph Laplacian is symmetric, we can minimize it using the Courant–Fischer theorem.[7] Specifically, appealing to Corollary 1 in the PCA derivation, we find $\min_{x \in \mathbb R^n,\ x \perp \vec 1} \frac{x^T L_G x}{x^T x} = \lambda_2$, where $\lambda_1 = 0 \leq \lambda_2 \leq \cdots \leq \lambda_n$ are the eigenvalues of $L_G$. Note that we know the eigenvalues of $L_G$ are non-negative because we proved that $L_G$ is positive semi-definite in deriving the approximate objective function.

Therefore, the final solution to the graph cut relaxation is

$$ \min_{x \in \mathbb R^n,\, x \perp \vec 1} \frac{x^T L_G x}{x^T L_K x} = \frac{\lambda_2}{n} \,, $$
which lower-bounds the true discrete optimum $\min_{S \subset V} \phi'(S)$.
[7] For background on the Courant–Fischer theorem, see this derivation of PCA. In this proof, we use the variant of Courant–Fischer where the eigenvalues are sorted in ascending order. The statement is mostly the same, except the min-max becomes a max-min.
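To close, here is a minimal end-to-end sketch of the spectral partitioning recipe suggested by this derivation, using numpy on an illustrative graph with two obvious clusters: build $L_G$, take the eigenvector for $\lambda_2$, and threshold it at zero to recover a cut.

```python
import numpy as np

def graph_laplacian(n, edges):
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = A[j, i] = 1
    return np.diag(A.sum(axis=1)) - A

# Two triangles joined by a single bridge edge: the natural cut is obvious.
n = 6
edges = [(0, 1), (1, 2), (0, 2),        # triangle 1
         (3, 4), (4, 5), (3, 5),        # triangle 2
         (2, 3)]                        # bridge
L_G = graph_laplacian(n, edges)

# Eigendecomposition of the symmetric Laplacian (eigenvalues in ascending order).
eigvals, eigvecs = np.linalg.eigh(L_G)
lam2 = eigvals[1]                        # lambda_2
fiedler = eigvecs[:, 1]                  # its eigenvector, orthogonal to the all-ones vector

# Threshold the relaxed solution at zero to recover a binary cut.
S = {i for i in range(n) if fiedler[i] > 0}
e_S = sum(1 for i, j in edges if (i in S) != (j in S))
phi_prime = e_S / (len(S) * (n - len(S)))

print("lambda_2 / n =", lam2 / n)                  # the relaxed optimum (a lower bound)
print("recovered cut:", S, "with phi' =", phi_prime)
```

On this example the thresholded eigenvector recovers the bridge cut, whose ratio is $\phi' = \frac{1}{3 \cdot 3} = \frac{1}{9}$, and $\lambda_2 / n$ lower-bounds this value, exactly as the relaxation argument guarantees.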