Expander graphs III: graphs and their adjacency matrices

Note: This post is one in a series introducing one of the deepest ideas in modern computer science, expander graphs. Expanders are one of those powerful ideas that crop up in many apparently unrelated contexts, and that have a phenomenally wide range of uses. The goal of the posts is to explain what an expander is, and to learn just enough about them that we can start to understand some of their amazing uses. The posts require a little background in graph theory, computer science, linear algebra and Markov chains (all at about the level of a first course) to be comprehensible. I am not an expert on expanders, and the posts are just an introduction. They are mostly based on some very nice 2003 lecture notes by Nati Linial and Avi Wigderson, available on the web at http://www.math.ias.edu/~boaz/ExpanderCourse/.

Today’s post digresses to explain one of the most interesting ideas in graph theory, namely, the idea of describing a graph through its adjacency matrix. By describing a graph in terms of matrices we enable all the powerful tools of linear algebra to be brought to bear on graph theory, and we’ll see that there are all sorts of interesting connections between graphs and linear algebraic concepts such as eigenvalues. The connection to expanders will be explained in later posts, where we’ll see that the expansion parameter is connected to the second largest eigenvalue of the adjacency matrix. I believe (but have not checked) that it is this result which is used to establish that the graphs described in the previous post are actually expanders.

Graphs and their adjacency matrices

How can we prove that a family of graphs is an expander? Stated another way, how does the expansion parameter [tex]h(G)[/tex] vary as the graph [tex]G[/tex] is varied over all graphs in the family?

One way of tackling the problem of computing [tex]h(G)[/tex] is to do a brute force calculation of the ratio [tex]|\partial S|/|S|[/tex] for every subset [tex]S[/tex] of vertices containing no more than half the vertices in the graph. Doing this is a time-consuming task, since if there are [tex]n[/tex] vertices in the graph, then there are exponentially many such subsets [tex]S[/tex].
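To make this concrete, here is a minimal brute-force sketch in Python. It is a sketch only: it assumes [tex]G[/tex] is given by its adjacency matrix as a numpy array, it assumes [tex]\partial S[/tex] denotes the set of edges with exactly one endpoint in [tex]S[/tex], and the function names are made up for illustration:

    from itertools import combinations
    import numpy as np

    def boundary_size(A, S):
        # |dS|: the number of edges with exactly one endpoint in S.
        S = set(S)
        return sum(A[v, w] for v in S for w in range(A.shape[0]) if w not in S)

    def expansion(A):
        # Brute-force h(G): minimize |dS|/|S| over nonempty subsets S
        # containing at most half the vertices.  Exponential time in n.
        n = A.shape[0]
        return min(boundary_size(A, S) / len(S)
                   for k in range(1, n // 2 + 1)
                   for S in combinations(range(n), k))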

Problem: In general, how hard is it to find the subset [tex]S[/tex] minimizing [tex]|\partial S|/|S|[/tex]? Can we construct an NP-Complete variant of this problem? I don’t know the answer to this question, and I don’t know if anyone else does, either.

Fortunately, there is an extraordinarily beautiful approach to the problem of determining [tex]h(G)[/tex] which is far less computationally intensive. It involves the adjacency matrix [tex]A(G)[/tex] of the graph [tex]G[/tex]. By definition, the rows and columns of the adjacency matrix are labelled by the vertices in the vertex set [tex]V[/tex] of [tex]G[/tex]. For vertices [tex]v[/tex] and [tex]w[/tex], the entry [tex]A(G)_{vw}[/tex] is defined to be [tex]1[/tex] if [tex](v,w)[/tex] is an edge, and [tex]0[/tex] if it is not.

It is a marvellous fact that properties of the eigenvalue spectrum of the adjacency matrix [tex]A(G)[/tex] can be used to understand properties of the graph [tex]G[/tex]. This occurs so frequently that we refer to the spectrum of [tex]A(G)[/tex] as the spectrum of the graph [tex]G[/tex]. The connection is useful because the eigenvalue spectrum can be computed quickly, and certain properties, such as the largest and smallest eigenvalues, the determinant, and the trace, can be computed extremely quickly.
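For instance, in Python with numpy this takes only a few lines (a sketch; adjacency_matrix is just a helper name made up for illustration). Here is the spectrum of the triangle:

    import numpy as np

    def adjacency_matrix(n, edges):
        # Adjacency matrix of an undirected graph on vertices 0, ..., n-1.
        A = np.zeros((n, n), dtype=int)
        for v, w in edges:
            A[v, w] = A[w, v] = 1
        return A

    # eigvalsh handles real symmetric matrices and returns the
    # eigenvalues in ascending order.
    A = adjacency_matrix(3, [(0, 1), (1, 2), (2, 0)])
    print(np.linalg.eigvalsh(A))  # [-1. -1.  2.]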

More generally, by recasting graphs in terms of adjacency matrices, we open up the possibility of using tools from linear algebra to study the properties of graphs. Although we’re most interested in expanders, for the rest of this post I’m going to set expanders aside, and look at how the linear algebraic point of view can help us understand graphs in their own right. This digression is partly because this is beautiful material (at least in my opinion), and partly because our later discussion of expanders will be built on the linear algebraic point of view, so it’s good to get comfortable with it now.

The following exercise provides a good example of how graph properties can be related to the eigenvalues of the graph.

Exercise: Prove that if two graphs are isomorphic, then they have the same spectrum.

This result is often useful in proving that two graphs are not isomorphic: simply compute their eigenvalues, and show that they are different. A useful extension of the exercise is to find an example of two graphs which have the same spectra, but which are not isomorphic.
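One standard example lives on just five vertices: the star [tex]K_{1,4}[/tex] and the disjoint union of a 4-cycle with an isolated vertex. A quick numerical check (a sketch, using the same made-up adjacency_matrix helper as above):

    import numpy as np

    def adjacency_matrix(n, edges):
        A = np.zeros((n, n), dtype=int)
        for v, w in edges:
            A[v, w] = A[w, v] = 1
        return A

    star = adjacency_matrix(5, [(0, 1), (0, 2), (0, 3), (0, 4)])  # K_{1,4}
    cycle_plus_point = adjacency_matrix(5, [(0, 1), (1, 2), (2, 3), (3, 0)])

    print(np.linalg.eigvalsh(star))              # [-2.  0.  0.  0.  2.]
    print(np.linalg.eigvalsh(cycle_plus_point))  # [-2.  0.  0.  0.  2.]
    # Same spectrum, but not isomorphic: only the star has a degree-4 vertex.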

Note that the adjacency matrix may be considered as a matrix over any field, and the result of the exercise is true over any field. (I’ve often wondered if the converse is true, but don’t know the answer.) Nonetheless, by and large, we’ll consider the adjacency matrix as a matrix over the field [tex]R[/tex] of real numbers. Assuming that [tex]G[/tex] is an undirected graph, we see that [tex]A(G)[/tex] is a real symmetric matrix, and thus can be diagonalized. We will find it convenient to write the eigenvalues of a graph [tex]G[/tex] in non-increasing order, as [tex]\lambda_1(G) \geq \lambda_2(G) \geq \ldots \geq \lambda_n(G)[/tex].

A fact we’ll make a lot of use of is that when [tex]G[/tex] is [tex]d[/tex]-regular the largest eigenvalue of [tex]G[/tex] is just [tex]d[/tex]. To see this, note that the vector [tex]\vec 1 \equiv (1,1,\ldots,1)[/tex] is an eigenvector of [tex]A(G)[/tex] with eigenvalue [tex]d[/tex]. Proving that [tex]d[/tex] is the largest eigenvalue is a little harder; we’ll just sketch a proof. It is sufficient to show that [tex]v^T A(G) v \leq d[/tex] for all normalized vectors [tex]v[/tex]. From the [tex]d[/tex]-regularity of [tex]G[/tex] it follows that [tex]A(G)/d[/tex] is a doubly stochastic matrix, i.e., has non-negative entries, and all rows and columns sum to one. A theorem of Birkhoff ensures that [tex]A(G)/d[/tex] can be written as a convex combination of permutation matrices, so [tex]A(G) = d \sum_j p_j P_j[/tex], where the [tex]p_j[/tex] are probabilities, and the [tex]P_j[/tex] are permutation matrices. This gives [tex]v^T A(G) v = d \sum_j p_j v^T P_j v[/tex]. But [tex]v^T P_j v \leq 1[/tex] for any permutation [tex]P_j[/tex] (by the Cauchy-Schwarz inequality, since [tex]P_j v[/tex] is also a normalized vector), which gives the desired result.
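This is easy to check numerically; for example, the 7-cycle is 2-regular, and a quick sketch confirms that its largest eigenvalue is 2, with [tex]\vec 1[/tex] as an eigenvector:

    import numpy as np

    n = 7
    A = np.zeros((n, n), dtype=int)
    for v in range(n):                    # the 7-cycle, a 2-regular graph
        A[v, (v + 1) % n] = A[(v + 1) % n, v] = 1

    print(np.linalg.eigvalsh(A)[-1])  # 2.0, up to floating-point error
    print(A @ np.ones(n))             # A applied to the all-ones vector: [2. 2. ... 2.]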

The following proposition gives another example of the relationships one can find between a graph and its spectrum.

Proposition: A [tex]d[/tex]-regular graph [tex]G[/tex] is connected if and only if [tex]\lambda_1(G) > \lambda_2(G)[/tex].

Proof: The easy direction is the reverse implication, for which we prove the contrapositive, namely, that a [tex]d[/tex]-regular disconnected graph has [tex]\lambda_1(G) = \lambda_2(G)[/tex]. This follows by breaking [tex]G[/tex] up into disconnected components [tex]G_1[/tex] and [tex]G_2[/tex], and observing that [tex]A(G) = A(G_1) \oplus A(G_2)[/tex], where [tex]\oplus[/tex] is the matrix direct sum. Since both [tex]G_1[/tex] and [tex]G_2[/tex] are [tex]d[/tex]-regular it follows that they both have maximal eigenvalue [tex]d[/tex], and so [tex]d[/tex] appears at least twice in the spectrum of [tex]A(G)[/tex].

At the moment, I don’t see an easy way of proving the forward implication. One not very satisfying proof is to observe that [tex]A(G)/d[/tex] is the Markov transition matrix for a random walk on the graph, and that since the graph is connected, the random walk has a unique stationary distribution (and, provided the walk is aperiodic, converges to it). This implies that in the limit of large [tex]m[/tex] there can be only one vector [tex]v[/tex] such that [tex](A(G)^m/d^m) v = v[/tex]. This means that the largest eigenvalue of [tex]A(G)^m[/tex] is non-degenerate, from which it follows that the largest eigenvalue of [tex]A(G)[/tex] is non-degenerate. This is a sketch, but it can all be established rigorously with a little work and the aid of well-known theorems on Markov chains.

The proof sketched in the previous paragraph is not really satisfactory, since it involves an appeal to theorems which are in some sense less elementary than the result under discussion. Another possibility which I’ve explored but haven’t made work with complete rigour is to investigate [tex]A(G)^m/d^m[/tex] more explicitly. With a little thought one can prove that the entry [tex](A(G)^m)_{vw}[/tex] is just the number of walks of length [tex]m[/tex] between [tex]v[/tex] and [tex]w[/tex]. Since [tex]G[/tex] is connected, we’d expect that in the limit of large [tex]m[/tex] this number would be dominated by a term which does not depend on [tex]w[/tex], and would just scale like the total number of walks of length [tex]m[/tex] starting at [tex]v[/tex] (which is [tex]d^m[/tex]), divided by the total number of possible destinations [tex]w[/tex], which is [tex]n[/tex], giving [tex](A(G)^m)_{vw}/d^m \rightarrow 1/n[/tex]. (This can only be true if [tex]G[/tex] is aperiodic, for example if it has self-loops [tex](v,v)[/tex].) Of course, the [tex]n \times n[/tex] matrix whose entries are all [tex]1/n[/tex] has a single eigenvalue [tex]1[/tex], with all the rest [tex]0[/tex], which would suffice to establish the theorem.

QED
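Both the proposition and the walk-counting fact used above are easy to check numerically. A small sketch, comparing a single 10-cycle with the disjoint union of two 5-cycles (both graphs are 2-regular, so d = 2):

    import numpy as np

    def cycle(n):
        A = np.zeros((n, n), dtype=int)
        for v in range(n):
            A[v, (v + 1) % n] = A[(v + 1) % n, v] = 1
        return A

    def direct_sum(A, B):
        n, m = A.shape[0], B.shape[0]
        C = np.zeros((n + m, n + m), dtype=int)
        C[:n, :n] = A
        C[n:, n:] = B
        return C

    connected = cycle(10)
    disconnected = direct_sum(cycle(5), cycle(5))

    print(np.linalg.eigvalsh(connected)[-2:])     # [1.618..., 2.]: a strict gap
    print(np.linalg.eigvalsh(disconnected)[-2:])  # [2., 2.]: eigenvalue d is repeated

    # Entry (v, w) of the mth power counts the walks of length m from v to w.
    print(np.linalg.matrix_power(connected, 3)[0])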

Problem: How should we interpret the determinant of a graph? What about the trace?

Problem: If we consider [tex]A(G)[/tex] as a matrix over the field [tex]Z_2 = \{0,1\}[/tex], then it is possible to define a graph sum [tex]G_1+G_2[/tex], whose adjacency matrix is just [tex]A(G_1)+A(G_2)[/tex], and a graph product [tex]G_1 \times G_2[/tex], whose adjacency matrix is just [tex]A(G_1) A(G_2)[/tex]. Many questions naturally suggest themselves: (1) when is there an edge between [tex]v[/tex] and [tex]w[/tex] in [tex]G_1+G_2[/tex]; (2) when is there an edge between [tex]v[/tex] and [tex]w[/tex] in [tex]G_1 \times G_2[/tex] (these first two questions are easy to answer); (3) for which graphs is [tex]A(G)[/tex] invertible, so that a natural inverse graph [tex]G^{-1}[/tex] exists; (4) how can we interpret the inverse graph; (5) when do two graphs commute?
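Here is a small numerical sketch bearing on questions (1) and (2), with all arithmetic done mod 2 (adjacency_matrix is the same made-up helper as in the earlier snippets):

    import numpy as np

    def adjacency_matrix(n, edges):
        A = np.zeros((n, n), dtype=int)
        for v, w in edges:
            A[v, w] = A[w, v] = 1
        return A

    A1 = adjacency_matrix(4, [(0, 1), (1, 2)])
    A2 = adjacency_matrix(4, [(1, 2), (2, 3)])

    # (1) Sum: (v,w) is an edge of G_1+G_2 iff it is an edge of exactly one
    # of G_1, G_2; the shared edge (1,2) cancels mod 2.
    print((A1 + A2) % 2)

    # (2) Product: (v,w) is an edge of G_1 x G_2 iff the number of two-step
    # walks v -> u -> w, with (v,u) an edge of G_1 and (u,w) an edge of G_2,
    # is odd.  Note the product need not be symmetric unless A1 and A2
    # commute, which is question (5) again.
    print((A1 @ A2) % 2)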

Problem: Along similar lines to the previous problem, it’s possible to define a tensor product of graphs. What are the properties of the graph tensor product?
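One natural definition (just one choice, since there are several graph products in the literature) is to take the adjacency matrix of the tensor product to be the Kronecker product [tex]A(G_1) \otimes A(G_2)[/tex]. The spectrum of the resulting graph is then the set of all pairwise products of eigenvalues of the factors, as a quick sketch confirms:

    import numpy as np

    triangle = np.array([[0, 1, 1],
                         [1, 0, 1],
                         [1, 1, 0]])   # spectrum: 2, -1, -1
    edge = np.array([[0, 1],
                     [1, 0]])          # spectrum: 1, -1

    T = np.kron(triangle, edge)        # a graph on pairs of vertices
    print(np.linalg.eigvalsh(T))       # [-2. -1. -1.  1.  1.  2.]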

The ideas I’ve described in this post are examples of the important general principle that once you’ve defined a mathematical object, you should seek out alternate representations (or even just partial representations) of that object in terms of mathematical objects that you already understand. By recasting graphs as matrices, we open up the possibility of using all the tools of linear algebra to answer questions about graphs. This can work in one of two ways: we can ask a question about graphs and try to see if it’s possible to give a linear algebraic answer, or we can ask what implications known results of linear algebra have for graphs – what does the Gaussian elimination procedure correspond to, or the spectral decomposition, or two matrices commuting, or the wedge product, or whatever? Exploring such connections has the potential to greatly enrich both subjects.

In this post we’ve started to get a feel for how the properties of graphs can be studied using linear algebra. In the next post we’ll turn our attention back to expanders, and understand how the expansion coefficient can be related to the gap [tex]\lambda_1(G)-\lambda_2(G)[/tex] between the largest and second largest eigenvalues of [tex]G[/tex].

9 comments

  1. here are some more elementary proofs of some of those claims. they’re not necessarily simpler, but they don’t rely on any powerful theorems.

    Claim: d is the largest e-value of a d-regular graph
    Proof: The random walk on G corresponds to multiplying a probability distribution P by A/d. If P had a component which was an e-vector of A/d with e-value >1 then continuing the random walk would mean that eventually some vertex would have probability >1 (or less than -1!).

    Claim: A d-regular graph is connected iff lambda_2 < d.
    Proof: If lambda_2 = d then there exists a vector v s.t. <v,Av> = d, <v,v> = 1 and <1,v> = 0, where 1 denotes the all ones vector, meaning that v has some positive entries and some negative entries. Let w_i = |v_i|. Since <w,w> = <v,v> = 1, we get <w,Aw> \leq d. Now d = <v,Av> = sum_ij v_i v_j A_ij \leq sum_ij |v_i v_j| A_ij = <w,Aw> \leq d, so each \leq must be an equality, and v_i and v_j have the same sign whenever A_ij is nonzero. Thus {i : v_i > 0} and {i : v_i < 0} are nonempty, disconnected components.

  2. that second proof got messed up b/c i tried to use \leq signs.
    instead i’ll denote inner product by (v,w).

    Claim: A d-regular graph is connected iff lambda_2 = d.
    Proof: If lambda_2=d then there exists a vector v s.t. (v,Av)=d, (v,v)=1 and (1,v)=0, where 1 denotes the all ones vector. This last point means that v has some positive entries and some negative entries. Let w be a vector given by w_i = |v_i|. Then (w,w) = (v,v) = 1, so (w,Aw) \leq d.

    Now
    d = (v,Av) = sum_ij v_i v_j A_ij \leq sum_ij |v_i v_j| A_ij
    = (w,Aw) \leq d.
    Each \leq must be an equality, so v_i and v_j have the same sign whenever A_ij is nonzero. Thus {i : v_i > 0} and {i : v_i < 0} are nonempty, disconnected components.

  3. oops
    the statement of that last claim is backwards. it should be:
    G is disconnected iff lambda_2 = d.

    and i should be working on my thesis…

  4. Hi Aram — A problem with your first proof is that there may not be any probability distribution with a component in the direction of the hypothetical eigenvector.

    I really like your second proof.

  5. another proof of the first claim (about d being the largest eigenvalue of a d-regular graph): let x be an eigenvector corresponding to the largest eigenvalue \lambda_1, and let x_i be the largest component of x in absolute value.

    Consider the matrix equation Ax = \lambda_1 x, and divide the ith row of this equation by x_i to obtain
    A_i1 (x_1/x_i) + … + A_in (x_n/x_i) = \lambda_1.
    The left hand side has at most d nonzero terms, each of absolute value less than or equal to 1, and the claim follows.

  6. “Problem: In general, how hard is it to find the subset S minimizing |\partial S|/|S|? Can we construct an NP-Complete variant of this problem? I don’t know the answer to this question, and I don’t know if anyone else does, either.”

    The coNP-hardness of testing if a given graph G is an (n,k,c)-expander is proved by a reduction to testing whether a graph is a superconcentrator. The superconcentrator problem was proved to be coNP-hard by Blum et al in 1981 (The complexity of testing whether a graph is a superconcentrator, IPL, 1981)

    Or we can do the following simple reduction. It is not the answer to your problem, but it gives a bound on the expansion factor.
    The following problem is NP-hard.

    MINIMUM COVER.
    Given a collection C of subsets of a finite set S and a positive integer K, does C contain a cover for S of size K or less?

    Encode an instance of MINIMUM COVER as a bipartite graph (A \cup B, E):
    every element of S is encoded as a vertex in B,
    every set in C is encoded as a vertex in A,
    and there is an edge between a \in A and b \in B if the set C_a contains the element S_b.

    So, if the expansion factor is c, then there exists a set of size |B|/c in A which covers B. Thus deciding whether the expansion factor of a bipartite graph (a very restricted case) is greater than a given constant is as hard as solving MINIMUM COVER.
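    For concreteness, here is a small sketch of this encoding (cover_to_bipartite is just an illustrative name, and returning the |A| x |B| biadjacency matrix is one possible representation):

        import numpy as np

        def cover_to_bipartite(subsets, universe_size):
            # One A-vertex per subset, one B-vertex per element of S; an
            # edge (a, b) whenever the subset C_a contains the element b.
            M = np.zeros((len(subsets), universe_size), dtype=int)
            for a, C_a in enumerate(subsets):
                for b in C_a:
                    M[a, b] = 1
            return M

        # S = {0,1,2,3}, C = {{0,1}, {1,2}, {2,3}}
        print(cover_to_bipartite([{0, 1}, {1, 2}, {2, 3}], 4))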

  7. “when G is d-regular the largest eigenvalue of G is just d.”
    It can be proved in another (easier) way.

    Let Ay = \lambda y, and let y_j be an entry of y which is largest in absolute value (of course, y is not 0).
    Since Ay = \lambda y, we have \sum y_i = \lambda y_j, where the sum is taken over all vertices i adjacent to j. Hence
    |\lambda y_j| = |\sum y_i| \leq \sum_{all adjacent} |y_i| \leq d |y_j| (by the maximality of |y_j|, since j has exactly d neighbours);

    it follows that |\lambda| \leq d

  8. Problem: How should we interpret the determinant of a graph? What about the trace?
    For any graph without self-loops, Tr A is equal to zero, so it carries no information (it is the same for every such graph).
    Tr A^2 is 2e, where e is the number of edges.
    Tr A^3 is 6 times the number of triangles.
    For higher n the dependency is not as simple, since there are closed n-walks of different shapes.

    I do not know any meaning for the determinant of the adjacency matrix. But besides the adjacency matrix, graphs are also characterized by the Laplacian, and by matrices for the cut and flow spaces. The determinant of M M^T, where M is a basis matrix for the cut space, is equal to the number of spanning trees of the graph, when the graph is connected.
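    A quick numeric check of the trace identities above (a sketch, using K_4, which has 6 edges and 4 triangles):

        import numpy as np

        A = np.ones((4, 4), dtype=int) - np.eye(4, dtype=int)  # K_4
        print(np.trace(A @ A))      # 12 = 2 * 6 edges
        print(np.trace(A @ A @ A))  # 24 = 6 * 4 triangles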

  7. I don’t know about determinants, but the Pfaffian can be used to count perfect matchings in planar graphs. [Kasteleyn, Physica, 27, 1209-1225 (1961)]

    This is used by Barahona to calculate in polynomial time the partition function of 2-D Ising models with arbitrary couplings between neighbors but no applied magnetic field. [F Barahona 1982 J. Phys. A: Math. Gen. 15 3241-3253] I think this result is pretty awesome, when you compare it with how hard it can be to even find a ground state of a translationally-invariant 2-D system.

    Also, my proof of ||A|| \leq d can be patched up, b/c 1 + \epsilon v is a valid distribution (after normalization) for sufficiently small \epsilon. But then you need to use the real spectral theorem to show that v is real, and the other proof in the comments is way simpler.
