Fermions and the Jordan-Wigner transform II: the Canonical Commutation Relations

Back to fermions today, with a post introducing the canonical commutation relations for fermions, and discussing something of their mathematical and physical significance. The next post will get more into the meat, describing the consequences these commutation relations have.

Note: This post is one in a series describing fermi algebras, and a powerful tool known as the Jordan-Wigner transform, which allows one to move back and forth between describing a system as a collection of qubits, and as a collection of fermions. The posts assume familiarity with elementary quantum mechanics, comfort with elementary linear algebra (but not advanced techniques), and a little familiarity with the basic nomenclature of quantum information science (qubits, the Pauli matrices).


The canonical commutation relations for Fermions

Suppose we have a set of operators [tex]a_1,\ldots,a_n[/tex] acting on some Hilbert space [tex]V[/tex]. Then we say that these operators satisfy the canonical commutation relations (CCRs) for Fermions if they satisfy the equations

[tex] \{ a_j, a_k^\dagger \} = \delta_{jk} I; \,\,\,\, \{ a_j,a_k \} = 0, [/tex]

where [tex]\{ A, B \} \equiv AB+BA[/tex] is the anticommutator. Note that when we take the Hermitian conjugate of the second of these relations we obtain [tex]\{ a_j^\dagger, a_k^\dagger \} = 0[/tex], which is sometimes also referred to as one of the CCRs. It is also frequently useful to set [tex]j = k[/tex] in these last two relations, giving [tex]a_j^2 = (a_j^\dagger)^2 = 0[/tex].

How should one understand the CCRs? One way of thinking about the CCRs is in an axiomatic mathematical sense. In this way of thinking they are purely mathematical conditions that can be imposed on a set of matrices: for a given set of matrices, we can simply check whether those matrices satisfy or do not satisfy the CCRs. For example, when the state space [tex]V[/tex] is that of a single qubit, we can easily verify that the operator [tex]a = |0\rangle \langle 1|[/tex] satisfies the Fermionic CCRs. From this axiomatic point of view the question to ask is what consequences about the structure of [tex]V[/tex] and the operators [tex]a_j[/tex] can be deduced from the fact that the CCRs hold.
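
If you like checking such claims by computer, here’s a minimal numpy sketch verifying that the matrix of [tex]a = |0\rangle \langle 1|[/tex] satisfies the single-mode CCRs:

```python
import numpy as np

# a = |0><1| in the basis {|0>, |1>}: a maps |1> to |0>, and annihilates |0>.
a = np.array([[0, 1],
              [0, 0]], dtype=complex)

def anticomm(A, B):
    """The anticommutator {A, B} = AB + BA."""
    return A @ B + B @ A

# {a, a^dagger} = I, and {a, a} = 0 (hence a^2 = 0).
assert np.allclose(anticomm(a, a.conj().T), np.eye(2))
assert np.allclose(anticomm(a, a), 0)
print("a = |0><1| satisfies the single-mode fermionic CCRs")
```

Finding matrices that do the same job for [tex]n > 1[/tex] modes is less obvious; providing such a construction is, in essence, what the Jordan-Wigner transform does.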

There’s also a more sophisticated (but still entirely mathematical) way of understanding the CCRs, as an instance of the relationship between abstract algebraic objects (such as groups, Lie algebras, or Hopf algebras) and their representations as linear maps on vector spaces. My own knowledge of representation theory is limited to a little representation theory of finite groups and of Lie algebras, and I certainly do not see the full context in the way an expert on representation theory would. However, even with that limited background, one can see that there are common themes and techniques: what may appear to be an isolated technique or trick is often really an instance of a much deeper idea or set of ideas that becomes obvious once one has enough broad familiarity with representation theory. I’m not going to pursue this point of view in these notes, but thought it worth mentioning for the sake of giving context and motivation to the study of other topics.

Finally, there’s a physical way in which we can understand the CCRs. When we want to describe a system containing Fermions, one way to begin is to write down a set of operators satisfying the CCRs, and then to try to guess what sort of Hamiltonian involving those operators could describe the interactions observed in the system, often motivated by classical considerations, or other rules of thumb. This is, for example, the sort of point of view pursued in the BCS theory of superconductivity, and which people are trying to use to understand high-temperature superconductors.

Of course, one can ask why physicists want to use operators satisfying the Fermionic CCRs to describe a system of Fermions, or why anyone, mathematician or physicist, would ever write down the CCRs in the first place. These are good questions, which I’m not going to try to answer here, although one or both questions might make a good subject for some future notes. (It is, of course, a lot easier to answer these questions once you understand the material I present here.)

Instead, I’m going to approach the Fermionic CCRs from a purely mathematical point of view, asking the question “What can we deduce from the fact that a set of operators satisfying the CCRs exists?” The surprising answer is that we can deduce quite a lot about the structure of [tex]V[/tex] and the operators [tex]a_j[/tex] simply from the fact that the [tex]a_j[/tex] satisfy the canonical commutation relations!

We’ll take this subject up in the next post, where we’ll show that the CCRs essentially uniquely determine the action of the [tex]a_j[/tex]s, up to a choice of basis.


Expander graphs IV: Expansion and the eigenvalue gap

I return to expanders today, with a discussion of expansion and eigenvalue gaps. Thanks to all the commenters on earlier posts for all the proof ideas and simplifications!

Note: This post is one in a series introducing one of the deepest ideas in modern computer science, expander graphs. Expanders are one of those powerful ideas that crop up in many apparently unrelated contexts, and that have a phenomenally wide range of uses. The goal of the posts is to explain what an expander is, and to learn just enough about them that we can start to understand some of their amazing uses. The posts require a little background in graph theory, computer science, linear algebra and Markov chains (all at about the level of a first course) to be comprehensible. I am not an expert on expanders, and the posts are just an introduction. They are mostly based on some very nice 2003 lecture notes by Nati Linial and Avi Wigderson, available on the web at http://www.math.ias.edu/~boaz/ExpanderCourse/.

In today’s post we really reap the benefits of recasting graph theory in linear algebraic terms, showing that the expansion parameter can be related to the gap between the largest and second-largest eigenvalues of the graph. This ought to surprise you, perhaps even shock you. It’s a highly non-trivial result, with the big payoff of relating something we didn’t really understand all that well (the expansion parameter) to eigenvalue gaps, about which an enormous amount is known.

Expansion and the eigenvalue gap

Let’s return our attention to expander graphs, and see what the eigenvalues of a graph have to do with its expansion parameter. We define the gap for the graph [tex]G[/tex] to be the difference [tex]\Delta(G) \equiv \lambda_1(G)-\lambda_2(G)[/tex] between the largest and second-largest eigenvalues. The expansion parameter and the gap are connected by the following theorem:

Theorem: The expansion parameter [tex]h(G)[/tex] for a [tex]d[/tex]-regular graph [tex]G[/tex] is related to the gap [tex]\Delta(G)[/tex] by:

[tex] \frac{\Delta(G)}{2} \leq h(G) \leq \sqrt{2d \Delta(G)}. [/tex]

Thus, properties of the eigenvalue gap can be used to deduce properties of the expansion parameter. For example, if the eigenvalue gap for a family of [tex]d[/tex]-regular graphs is bounded below by a positive constant, then the expansion parameter must also be bounded below by a positive constant, and so the family is an expander.

One reason for finding the connection between the gap and the expansion parameter interesting is that it is far easier to estimate the gap of an [tex]n[/tex] by [tex]n[/tex] matrix than it is to enumerate the exponentially many subsets [tex]S[/tex] of the vertex set [tex]V[/tex], and compute [tex]|\partial S|/|S|[/tex] for each one.
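
To see the contrast concretely, here’s a small numpy sketch that computes the gap by a single eigenvalue computation, computes [tex]h(G)[/tex] by the brute-force enumeration over subsets, and checks the theorem’s inequalities. The choice of graph (the complete bipartite graph [tex]K_{3,3}[/tex], which is [tex]3[/tex]-regular) is arbitrary, purely for illustration:

```python
import itertools
import numpy as np

# Adjacency matrix of the complete bipartite graph K_{3,3} (3-regular):
# vertices 0, 1, 2 on the left, 3, 4, 5 on the right.
n, d = 6, 3
A = np.array([[1.0 if (i < 3) != (j < 3) else 0.0 for j in range(n)]
              for i in range(n)])

# The gap: one cheap eigenvalue computation.
eigs = np.sort(np.linalg.eigvalsh(A))[::-1]
gap = eigs[0] - eigs[1]

# h(G): enumerate all (exponentially many) subsets S with |S| <= n/2,
# minimizing |E(S, complement of S)| / |S|.
h = min(
    sum(A[i, j] for i in S for j in range(n) if j not in S) / len(S)
    for r in range(1, n // 2 + 1)
    for S in itertools.combinations(range(n), r)
)

# The theorem: gap/2 <= h <= sqrt(2 * d * gap).
assert gap / 2 <= h <= np.sqrt(2 * d * gap)
print(f"gap = {gap:.3f}, h = {h:.3f}")   # gap = 3.000, h = 1.667
```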

Proof discussion: We already understand that [tex]\lambda_1(G) = d[/tex] for a [tex]d[/tex]-regular graph [tex]G[/tex], with eigenvector [tex]\vec 1 = (1,1,\ldots,1)[/tex]. So we’ll concentrate on trying to understand the behaviour of the second largest eigenvalue, [tex]\lambda_2(G)[/tex]. The theorem tells us that the difference between [tex]d[/tex] and [tex]\lambda_2(G)[/tex] is controlled both above and below by the expansion parameter [tex]h(G)[/tex].

How can we get control over the second largest eigenvalue of [tex]G[/tex]? One way is to observe that [tex]\lambda_2(G)[/tex] is just the maximum of the expression [tex]v^T A v / v^T v[/tex], where [tex]A[/tex] is the adjacency matrix of [tex]G[/tex], and we maximize over all vectors [tex]v[/tex] orthogonal to the eigenvector [tex]\vec 1[/tex]. An encouraging fact is that this expression is quite easy to deal with, because the condition that [tex]v[/tex] be orthogonal to [tex]\vec 1[/tex] is actually equivalent to the sum of [tex]v[/tex]’s entries being equal to [tex]0[/tex], so we have

[tex] \lambda_2(G) = \max_{v: {\rm tr}(v) = 0} \frac{v^T A v}{v^T v}, [/tex]

where [tex]{\rm tr}(v)[/tex] is just the sum of the entries of the vector [tex]v[/tex].
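
Here’s a small numpy sketch illustrating this characterization on the [tex]6[/tex]-cycle (an arbitrary choice of graph): randomly sampled vectors with entries summing to zero never exceed [tex]\lambda_2(G)[/tex], and come close to achieving it:

```python
import numpy as np

# Adjacency matrix of the 6-cycle (2-regular).
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

lam2 = np.sort(np.linalg.eigvalsh(A))[::-1][1]   # second-largest eigenvalue

# Track the largest Rayleigh quotient v^T A v / v^T v over random
# vectors whose entries sum to zero.
rng = np.random.default_rng(0)
best = -np.inf
for _ in range(10_000):
    v = rng.standard_normal(n)
    v -= v.mean()                                # enforce tr(v) = 0
    best = max(best, v @ A @ v / (v @ v))

assert best <= lam2 + 1e-9
print(f"lambda_2 = {lam2:.4f}, best sampled Rayleigh quotient = {best:.4f}")
```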

We’re going to provide a lower bound on [tex]\lambda_2(G)[/tex] by simply guessing a good choice of [tex]v[/tex] satisfying [tex]\mbox{tr}(v) = 0[/tex], and using the fact that

[tex] \lambda_2(G) \geq \frac{v^T A v}{v^T v}. [/tex]

To make a good guess, it helps to have a way of thinking about expressions like [tex]v^T A v[/tex], where [tex]\mbox{tr}(v) = 0[/tex]. A convenient way of thinking is to rewrite [tex]v[/tex] as the difference of two probability distributions, [tex]p[/tex] and [tex]q[/tex], i.e., [tex]v = p-q[/tex], where [tex]p[/tex] and [tex]q[/tex] are non-negative vectors each summing to [tex]1[/tex], and with disjoint support. (We can always rescale [tex]v[/tex] so this is possible, since the ratio [tex]v^T A v / v^T v[/tex] is unchanged by rescaling.) This results in terms like [tex]p^T A q[/tex], which we can think of in terms of transition probabilities between [tex]q[/tex] and [tex]p[/tex]. This will allow us to apply the expansion properties of the graph.

Let’s make these ideas a little more concrete. The key is to define [tex]\vec 1_S[/tex] to be the vector whose entries are [tex]1[/tex] on [tex]S[/tex], and [tex]0[/tex] elsewhere, and to observe that

[tex] \vec 1_S^T A \vec 1_T = |E(S,T)|, [/tex]

where [tex]|E(S,T)|[/tex] is the number of edges between the vertex sets [tex]S[/tex] and [tex]T[/tex]. This suggests that we should choose [tex]p[/tex] and [tex]q[/tex] in terms of vectors like [tex]\vec 1_S[/tex], since it will enable us to relate expressions like [tex]v^T A v[/tex] to the sizes of various edge sets, which, in turn, can be related to the expansion parameter.
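
The identity is easy to check on a small example. Here’s a numpy sketch doing so on the [tex]6[/tex]-cycle, with disjoint [tex]S[/tex] and [tex]T[/tex] chosen arbitrarily (for overlapping sets the left-hand side counts the edges inside [tex]S \cap T[/tex] twice):

```python
import numpy as np

# Adjacency matrix of the 6-cycle.
n = 6
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

def indicator(S, n):
    """The vector 1_S: entries 1 on S, 0 elsewhere."""
    v = np.zeros(n)
    v[list(S)] = 1.0
    return v

# The only edges between S = {0,1,2} and T = {3,4,5} are (2,3) and (5,0),
# so 1_S^T A 1_T should equal 2.
S, T = [0, 1, 2], [3, 4, 5]
assert indicator(S, n) @ A @ indicator(T, n) == 2.0
```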

Suppose in particular that we choose

[tex] v = \frac{\vec 1_S}{|S|}-\frac{\vec 1_{\overline S}}{|\overline S|}. [/tex]

This satisfies the condition [tex]\mbox{tr}(v) = 0[/tex], and gives

[tex] v^T v = \frac{1}{|S|}+\frac{1}{|\overline S|} [/tex]

and

[tex] v^T A v = \frac{1}{|S|^2} E(S,S) + \frac{1}{|\overline S|^2} E(\overline S,\overline S) - \frac{2}{|S||\overline S|} E(S,\overline S). [/tex]

The definition of an expander graph gives us control over [tex]E(S,\overline S)[/tex], so it is convenient to rewrite [tex]E(S,S)[/tex] and [tex]E(\overline S,\overline S)[/tex] in terms of [tex]E(S,\overline S)[/tex], using the [tex]d[/tex]-regularity of the graph:

[tex] E(S,S)+E(S,\overline S) = d |S|; \,\,\, E(\overline S,\overline S)+ E(S,\overline S) = d |\overline S|. [/tex]

Substituting [tex]E(S,S) = d|S|-E(S,\overline S)[/tex] and [tex]E(\overline S,\overline S) = d|\overline S|-E(S,\overline S)[/tex] into the expression for [tex]v^T A v[/tex], and collecting the [tex]E(S,\overline S)[/tex] terms into a perfect square, gives:

[tex] v^T A v = d \left( \frac{1}{|S|}+\frac{1}{|\overline S|} \right) - \left(\frac{1}{|S|}+\frac{1}{|\overline S|}\right)^2 E(S,\overline S). [/tex]

Dividing by the earlier expression [tex]v^T v = 1/|S|+1/|\overline S|[/tex] for the denominator, we obtain

[tex] \lambda_2(G) \geq d-\left(\frac{1}{|S|}+\frac{1}{|\overline S|}\right) E(S,\overline S). [/tex]
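
Nothing in the argument so far depended on how [tex]S[/tex] was chosen, so this bound holds for every nonempty proper subset [tex]S[/tex]. Here’s a quick numpy check, once more on the [tex]6[/tex]-cycle (where [tex]d = 2[/tex] and [tex]\lambda_2(G) = 1[/tex]):

```python
import itertools
import numpy as np

# Adjacency matrix of the 6-cycle (d = 2).
n, d = 6, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
lam2 = np.sort(np.linalg.eigvalsh(A))[::-1][1]

# Check lambda_2 >= d - (1/|S| + 1/|Sbar|) * E(S, Sbar) for every S.
for r in range(1, n):
    for S in itertools.combinations(range(n), r):
        E = sum(A[i, j] for i in S for j in range(n) if j not in S)
        assert lam2 >= d - (1 / r + 1 / (n - r)) * E - 1e-9
print("bound verified for every subset S")
```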

Now choose [tex]S[/tex] to achieve the minimum in the definition of [tex]h(G)[/tex], so that [tex]E(S,\overline S) = h(G) |S|[/tex] and [tex]|S| \leq n/2[/tex]. Since [tex]|S| \leq |\overline S|[/tex], we have [tex]\left(1/|S|+1/|\overline S|\right)|S| = 1 + |S|/|\overline S| \leq 2[/tex], giving

[tex] \lambda_2(G) \geq d-2 h(G), [/tex]

and thus

[tex] \frac{\Delta(G)}{2} \leq h(G), [/tex]

which was the first of the two desired inequalities in the theorem.

The proof of the second inequality is a little more complicated. Unfortunately, I haven’t managed to boil the proof down to a form that I’m really happy with, and for this reason I won’t describe the details. If you’re interested, you should try to prove it yourself, or refer to the notes of Linial and Wigderson.

QED

Problem: Can we generalize this result so that it applies to a general undirected graph [tex]G[/tex], not just to [tex]d[/tex]-regular graphs? Can we prove an analogous statement for directed graphs, perhaps in terms of singular values? Can we define a generalized notion of “expansion” which can be applied to any symmetric matrix [tex]A[/tex] with non-negative entries, and connect that notion of expansion to the eigenvalue gap? Can we generalize even further? What happens if we change the field over which the matrix is considered?
