Note: This post is one in a series introducing one of the deepest ideas in modern computer science, expander graphs. Expanders are one of those powerful ideas that crop up in many apparently unrelated contexts, and that have a phenomenally wide range of uses. The goal of the posts is to explain what an expander is, and to learn just enough about them that we can start to understand some of their amazing uses. The posts require a little background in graph theory, computer science, linear algebra and Markov chains (all at about the level of a first course) to be comprehensible. I am not an expert on expanders, and the posts are just an introduction. They are are mostly based on some very nice 2003 lecture notes by Nati Linial and Avi Wigderson, available on the web at http://www.math.ias.edu/~boaz/ExpanderCourse/.
Today’s post digresses to explain one of the most interesting ideas in graph theory, namely, to describe a graph through its adjacency matrix. By describing a graph in terms of matrices we enable all the powerful tools of linear algebra to be brought to bear on graph theory, and we’ll see that there are all sorts of interesting connections between graphs and linear algebraic concepts such as eigenvalues. The connection to expanders will be explained in later posts, where we’ll see that the expansion parameter is connected to the second largest eigenvalue of the adjacency matrix. I believe (but have not checked) that it is this result which is used to establish that the graphs described in the previous post are actually expanders.
Graphs and their adjacency matrices
How can we prove that a family of graphs is an expander? Stated another way, how does the expansion parameter
vary as the graph
is varied over all graphs in the family?
One way of tackling the problem of computing
is to do a brute force calculation of the ratio
for every subset
of vertices containing no more than half the vertices in the graph. Doing this is a time-consuming task, since if there are
vertices in the graph, then there are exponentially many such subsets
.
Problem: In general, how hard is it to find the subset
minimizing
? Can we construct an NP-Complete variant of this problem? I don’t know the answer to this question, and I don’t know if anyone else does, either.
Fortunately, there is an extraordinarily beautiful approach to the problem of determining
which is far less computationally intensive. It involves the adjacency matrix
of the graph
. By definition, the rows and columns of the adjacency matrix are labelled by the vertices of
. For vertices
and
the entry
is defined to be
if
is an edge, and
if it is not an edge.
It is a marvellous fact that properties of the eigenvalue spectrum of the adjacency matrix
can be used to understand properties of the graph
. This occurs so frequently that we refer to the spectrum of
as the spectrum of the graph
. It is useful because the eigenvalue spectrum can be computed quickly, and certain properties, such as the largest and smallest eigenvalue, the determinant and trace, can be computed extremely quickly.
More generally, by recasting graphs in terms of adjacency matrices, we open up the possibility of using tools from linear algebra to study the properties of graphs. Although we’re most interested in studying expanders, for the rest of this post I’m going to digress from the study of expanders, studying how the linear algebraic point of view can help us understand graphs, without worrying about how this connects to expanders. This digression is partially motivated by the fact that this is beautiful stuff (at least in my opinion), and is partially because our later discussion of expanders will be based on this linear algebraic point of view, and so it’s good to get comfortable with this point of view.
The following exercise provides a good example of how graph properties can be related to the eigenvalues of the graph.
Exercise: Prove that if two graphs are isomorphic, then they have the same spectrum.
This result is often useful in proving that two graphs are not isomorphic: simply compute their eigenvalues, and show that they are different. A useful extension of the exercise is to find an example of two graphs which have the same spectra, but which are not isomorphic.
Note that the adjacency matrix may be considered as a matrix over any field, and the result of the exercise is true over any field. (I’ve often wondered if the converse is true, but don’t know the answer.) Nonetheless, by and large, we’ll consider the adjacency matrix as a matrix over the field
of real numbers. Assuming that
is an undirected graph, we see that
is a real symmetric matrix, and thus can be diagonalized. We will find it convenient to write the eigenvalues of a graph
in non-increasing order, as
.
A fact we’ll make a lot of use of is that when
is
-regular the largest eigenvalue of
is just
. To see this, note that the vector
is an eigenvector of
with eigenvalue
. To prove that
is the largest eigenvalue seems to be a little bit harder. We’ll just sketch a proof. To prove this it is sufficient to show that
for all normalized vectors
. From the
-regularity of
it follows that
is a doubly stochastic matrix, i.e., has non-negative entries, and all rows and columns sum to one. A theorem of Birkhoff ensures that
can be written as a convex combination of permutation matrices, so
, where
are probabilities, and the
are permutation matrices. This gives
. But
for any permutation
, which gives the desired result.
The following proposition gives another example of the relationships one can find between a graph and its spectrum.
Proposition: A
-regular graph
is connected if and only if
.
Proof: The easy direction is the reverse implication, for which we prove the contrapositive, namely, that a
-regular disconnected graph has
. This follows by breaking
up into disconnected components
and
, and observing that
, where
is the matrix direct sum. Since both
and
are
-regular it follows that they both have maximal eigenvalue
, and so
appears at least twice in the spectrum of
.
At the moment, I don’t see an easy way of proving the forward implication. One not very satisfying proof is to observe that
is the Markov transition matrix for a random walk on the graph, and that since the graph is connected, the random walk must converge to a unique distribution, which implies that in the limit of large
there can only be one vector
such that
. This means that
‘s largest eigenvalue is non-degenerate, from which it follows that
‘s largest eigenvalue is non-degenerate. This is a sketch, but it can all be established rigorously with a little work and the aid of well-known theorems on Markov chains.
The proof sketched in the previous paragraph is not really satisfactory, since it involves an appeal to theorems which are in some sense less elementary than the result under discussion. Another possibility which I’ve explored but haven’t made work with complete rigour is to investigate
more explicitly. With a little thought one can prove that the entry
is just the number of paths between
and
of length
. Since
is connected, we’d expect in the limit of large
this number would be dominated by a term which does not depend on
, and would just scale like the total number of paths of length
starting at
(which is
), divided by the total number of possible destinations
, which is
, giving
. (This would only be true if
has self-loops
.) Of course, the matrix whose entries are all
has a single eigenvalue
, with all the rest
, which would suffice to establish the theorem.
QED
Problem: How should we interpret the determinant of a graph? What about the trace?
Problem: If we consider
as a matrix over the field
, then it is possible to define a matrix sum
, whose adjacency matrix is just
, and a matrix product
whose adjacency matrix is just
. Many questions naturally suggest themseves: (1) when is there an edge between
and
in
; (2) when is there an edge between
and
in
(these first two questions are easy to answer); (3) for which graphs is
invertible, and thus a natural inverse graph
exists; (4) how can we interpret the inverse graph; (5) when do two graphs commute?
Problem: Along similar lines to the previous problem, it’s possible to define a tensor product of graphs. What are the properties of the graph tensor product?
The ideas I’ve described in this post are examples of the important general principle that once you’ve defined a mathematical object, you should seek out alternate representations (or even just partial representations) of that object in terms of mathematical objects that you already understand. By recasting graphs as matrices, we open up the possibility of using all the tools of linear algebra to answer questions about graphs. This can work in one of two ways: we can ask a question about graphs, and try to see if it’s possible to give a linear algebraic answer, or we can ask what implication known results of linear algebra have for graphs – what does the Gaussian elimination procedure correspond to, or the spectral decomposition, or two matrices commuting, or the wedge product, or whatever. Exploring such connections has the potential to greatly enrich both subjects.
In this post we’ve started to get a feel for how the properties of graphs can be studied using linear algebra. In the next post we’ll turn our attention back to expanders, and understand how the expansion coefficient can be related to the gap
between the largest and second largest eigenvalues of
.