Ten semi-grand challenges for quantum computing theory

Here’s Scott Aaronson’s list. The problems are certainly all very good ones. Here’s a quick list of five questions I like, off the top of my head:

  • How can we build a quantum computer?
  • How quickly can we simulate a desired physical operation on a quantum computer? An exact formula would be nice, or at least good bounds, preferably completely general, and both upper and lower bounds. I’d guess most computer scientists think this is a hopelessly optimistic goal; I’m nothing if not optimistic.
  • What makes quantum computers powerful?
  • How powerful are they, relative to classical computers?
  • What physical resources are necessary and sufficient to do quantum computation? This is interesting in both the noise-free and noisy cases. Even in the noise-free case we currently know very little, for example, about when measurement-based quantum computation is possible. The problem is much harder in the noisy case (and includes determining the threshold).

Update: After posting that, I’ve decided I have fairly serious reservations about how I expressed the second hope. Can we really hope for a formula for the number of quantum gates required to synthesize a quantum operation? Although I haven’t thought it through in detail, it seems highly unlikely that there is any reasonably general computable formula for the time complexity of a specified (but arbitrary) family of operations. However, this doesn’t mean that we can’t obtain a lot more general insight into this question than we currently have.

Published
Categorized as General

Fermions and the Jordan-Wigner transform VI: the Jordan-Wigner post

The last post! A single pdf will follow in a day or so.

Today’s post discusses a surprising connection between the Fermionic ideas we’ve been discussing up to now, and one-dimensional spin systems. In particular, a tool known as the Jordan-Wigner transform can be used to establish an equivalence between a large class of one-dimensional spin systems and the type of Fermi systems we have been considering. This is interesting because included in this class of one-dimensional spin systems are important models such as the transverse Ising model, which serve as a general prototype for quantum magnetism, are a good basis for understanding some naturally occurring physical systems, and which also serve as prototypes for the understanding of quantum phase transitions.

Note: This post is one in a series describing fermi algebras, and a powerful tool known as the Jordan-Wigner transform, which allows one to move back and forth between describing a system as a collection of qubits, and as a collection of fermions. The posts assume familiarity with elementary quantum mechanics, comfort with elementary linear algebra (but not advanced techniques), and a little familiarity with the basic nomenclature of quantum information science (qubits, the Pauli matrices).

The Jordan-Wigner transform

In this post we describe the Jordan-Wigner transform, explaining how it can be used to map a system of qubits (i.e., spin-[tex]\frac 12[/tex] systems) to a system of Fermions, and vice versa. We also explain a nice application of these ideas, to solving one-dimensional quantum spin systems.

Suppose we have an [tex]n[/tex]-qubit system, with the usual state space [tex]C^{2^n}[/tex], and Pauli operators [tex]X_j, Y_j,Z_j[/tex] acting on qubit [tex]j[/tex]. We are going to use these operators to define a set of [tex]a_j[/tex] operators acting on [tex]C^{2^n}[/tex], and satisfying the Fermionic CCRs.

To begin, suppose for the sake of argument that we have found such a set of operators. Then from our earlier discussion the action of the [tex]a_j[/tex] operators in the occupation number representation [tex]|\alpha\rangle = |\alpha_1,\ldots,\alpha_n\rangle[/tex] must be as follows:

  • Suppose [tex]\alpha_j = 0[/tex]. Then [tex]a_j|\alpha\rangle = 0[/tex].
  • Suppose [tex]\alpha_j = 1[/tex]. Let [tex]\alpha'[/tex] be that vector which results when the [tex]j[/tex]th entry of [tex]\alpha[/tex] is changed to [tex]0[/tex]. Then [tex]a_j|\alpha\rangle = -(-1)^{s_\alpha^j} |\alpha'\rangle[/tex], where [tex]s_\alpha^j \equiv \sum_{k=1}^{j-1} \alpha_k[/tex].

If we identify the occupation number state [tex]|\alpha\rangle[/tex] with the corresponding computational basis state [tex]|\alpha\rangle[/tex], then this suggests taking as our definition

[tex] a_j \equiv -\left( \otimes_{k=1}^{j-1} Z_k \right) \otimes \sigma_j, [/tex]

where [tex]\sigma_j[/tex] is used to denote the matrix [tex]\sigma \equiv |0\rangle \langle 1|[/tex] acting on the [tex]j[/tex]th qubit. It is easily verified that these operators satisfy the Fermionic CCRs. This definition of the [tex]a_j[/tex] is known as the Jordan-Wigner transform. It allows us to define a set of operators [tex]a_j[/tex] satisfying the Fermionic CCRs in terms of the usual operators we use to describe qubits, or spin-[tex]\frac 12[/tex] systems.
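Since the verification is purely mechanical, it can also be done numerically for small [tex]n[/tex]. Here’s a minimal Python/numpy sketch (the helper names are my own, for illustration):

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
sigma = np.array([[0.0, 1.0], [0.0, 0.0]])   # sigma = |0><1|

def jw_ops(n):
    """Jordan-Wigner operators a_j = -(Z x ... x Z) x sigma_j on n qubits."""
    return [-reduce(np.kron, [Z] * j + [sigma] + [I2] * (n - j - 1))
            for j in range(n)]

def anti(A, B):                               # anticommutator {A, B}
    return A @ B + B @ A

n = 3
a = jw_ops(n)
Id = np.eye(2 ** n)
for j in range(n):
    for k in range(n):
        assert np.allclose(anti(a[j], a[k]), 0 * Id)
        assert np.allclose(anti(a[j], a[k].conj().T), (j == k) * Id)
print("CCRs hold for n =", n)
```

Note that the string of [tex]Z[/tex]s on the earlier qubits is exactly what makes operators on different sites anticommute rather than commute.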

The Jordan-Wigner transform can be inverted, allowing us to express the Pauli operators in terms of the Fermionic operators [tex]a_1,\ldots,a_n[/tex]. In particular, we have

[tex] Z_j = a_ja_j^\dagger-a_j^\dagger a_j. [/tex]

This observation may also be used to obtain an expression for [tex]X_j[/tex] by noting that [tex]X_j = \sigma_j +\sigma_j^\dagger[/tex], and thus:

[tex] X_j = -(Z_1 \ldots Z_{j-1}) (a_j+a_j^\dagger). [/tex]

Substituting in the expressions for [tex]Z_1,\ldots,Z_{j-1}[/tex] in terms of the Fermionic operators gives the desired expression for [tex]X_j[/tex] in terms of the Fermionic operators. Similarly, we have

[tex] Y_j = i (Z_1 \ldots Z_{j-1}) (a_{j}-a_{j}^\dagger), [/tex]

which, together with the expression for the [tex]Z_j[/tex] operators, enables us to express [tex]Y_j[/tex] solely in terms of the Fermionic operators.

These expressions for [tex]X_j[/tex] and [tex]Y_j[/tex] are rather inconvenient, involving as they do products of large numbers of Fermi operators. Remarkably, however, for certain simple products of Pauli operators it is possible to obtain quite simple expressions in terms of the Fermi operators. In particular, with a little algebra we see that:

[tex] Z_j = a_ja_j^\dagger - a_j^\dagger a_j [/tex]

[tex] X_jX_{j+1} = (a_j^\dagger-a_j)(a_{j+1}+a_{j+1}^\dagger ) [/tex]

[tex] Y_jY_{j+1} = -(a_j^\dagger+a_j)(a_{j+1}^\dagger-a_{j+1}) [/tex]

[tex] X_jY_{j+1} = i(a_j^\dagger-a_j) (a_{j+1}^\dagger-a_{j+1}) [/tex]

[tex] Y_jX_{j+1} = i(a_j^\dagger+a_j) (a_{j+1}^\dagger+a_{j+1}). [/tex]
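All five identities can be checked mechanically against the Jordan-Wigner definition. A small numpy sketch (helper names are mine, not from the text):

```python
import numpy as np
from functools import reduce

n = 3
I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Y = np.array([[0., -1j], [1j, 0.]])
Z = np.diag([1., -1.])
sigma = np.array([[0., 1.], [0., 0.]])        # |0><1|

def op_at(op, j):      # single-qubit operator acting on qubit j (0-indexed)
    return reduce(np.kron, [op if k == j else I2 for k in range(n)])

def a_op(j):           # Jordan-Wigner a_j
    return -reduce(np.kron, [Z] * j + [sigma] + [I2] * (n - j - 1))

a = [a_op(j) for j in range(n)]
ad = [m.conj().T for m in a]

for j in range(n):
    assert np.allclose(op_at(Z, j), a[j] @ ad[j] - ad[j] @ a[j])
for j in range(n - 1):
    assert np.allclose(op_at(X, j) @ op_at(X, j + 1),
                       (ad[j] - a[j]) @ (a[j + 1] + ad[j + 1]))
    assert np.allclose(op_at(Y, j) @ op_at(Y, j + 1),
                       -(ad[j] + a[j]) @ (ad[j + 1] - a[j + 1]))
    assert np.allclose(op_at(X, j) @ op_at(Y, j + 1),
                       1j * (ad[j] - a[j]) @ (ad[j + 1] - a[j + 1]))
    assert np.allclose(op_at(Y, j) @ op_at(X, j + 1),
                       1j * (ad[j] + a[j]) @ (ad[j + 1] + a[j + 1]))
```

The key point the check makes vivid is that the [tex]Z[/tex] strings on the intermediate qubits cancel in nearest-neighbour products, which is why these particular products come out quadratic.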

Suppose now that we have an [tex]n[/tex]-qubit Hamiltonian [tex]H[/tex] that can be expressed as a sum over operators from the set [tex]Z_j, X_jX_{j+1},Y_jY_{j+1},X_jY_{j+1}[/tex] and [tex]Y_{j}X_{j+1}[/tex]. An example of such a Hamiltonian is the transverse Ising model,

[tex] H = \alpha \sum_j Z_j + \beta \sum_j X_j X_{j+1}, [/tex]

which describes a system of magnetic spins with nearest neighbour couplings of strength [tex]\beta[/tex] along their [tex]\hat x[/tex] axes, and in an external magnetic field of strength [tex]\alpha[/tex] along the [tex]\hat z[/tex] axis.

For any such Hamiltonian, we see that it is possible to re-express the Hamiltonian as a Fermi quadratic Hamiltonian. As we saw in an earlier post, determining the energy levels is then a simple matter of finding the eigenvalues of a [tex]2n \times 2n[/tex] matrix, which can be done very quickly. In particular, finding the ground state energy is simply a matter of finding the smallest eigenvalue of that matrix, which is often particularly easy. In the case of models like the transverse Ising model, it is even possible to do this diagonalization analytically, giving rise to exact expressions for the energy spectrum. Details can be found in the paper by Lieb, Schultz and Mattis mentioned earlier, or books such as the well-known book by Sachdev on quantum phase transitions.
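As an illustration, this claim can be tested numerically on a small open chain. The coefficient matrices [tex]A[/tex] and [tex]B[/tex] below are my own working-out of the quadratic form for the open-chain transverse Ising model (not quoted from the text), and the check is that the [tex]2^n[/tex] eigenvalues of the qubit Hamiltonian coincide with all signed sums [tex]\pm d_1 \pm \cdots \pm d_n[/tex] of the positive eigenvalues of a [tex]2n \times 2n[/tex] matrix:

```python
import numpy as np
from itertools import product
from functools import reduce

n, h, J = 4, 1.0, 0.7       # field strength and coupling (illustrative values)
I2 = np.eye(2)
X = np.array([[0., 1.], [1., 0.]])
Z = np.diag([1., -1.])

def op_at(op, j):
    return reduce(np.kron, [op if k == j else I2 for k in range(n)])

# Qubit Hamiltonian H = h sum_j Z_j + J sum_j X_j X_{j+1} (open boundaries)
H = sum(h * op_at(Z, j) for j in range(n)) \
  + sum(J * op_at(X, j) @ op_at(X, j + 1) for j in range(n - 1))

# Fermionic 2n x 2n matrix M = [[A, -B], [B, -A]], with (my derivation)
# A_{jj} = -h, A_{j,j+1} = A_{j+1,j} = J/2, B_{j,j+1} = -B_{j+1,j} = -J/2
A = -h * np.eye(n)
B = np.zeros((n, n))
for j in range(n - 1):
    A[j, j + 1] = A[j + 1, j] = J / 2
    B[j, j + 1] = -J / 2
    B[j + 1, j] = J / 2
M = np.block([[A, -B], [B, -A]])

d = np.sort(np.linalg.eigvalsh(M))[n:]        # the +d half of the +/-d pairs
signed_sums = sorted(sum(s * dj for s, dj in zip(signs, d))
                     for signs in product([1, -1], repeat=n))
assert np.allclose(sorted(np.linalg.eigvalsh(H)), signed_sums)
```

So a [tex]16 \times 16[/tex] diagonalization is reproduced by an [tex]8 \times 8[/tex] one; for large [tex]n[/tex] the saving is exponential.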

Exercise: What other products of Pauli operators can be expressed as quadratics in Fermi operators?

Problem: I’ve made some pretty vague statements about finding the spectrum of a matrix being “easy”. However, I must admit that I’m speaking empirically here, in the sense that in practice I know this is easily done on a computer, but I don’t know a whole lot about the computational complexity of the problem. One obvious observation is that finding the spectrum is equivalent to finding the roots of the characteristic equation, which is easily computed, so the problem may be viewed as being about the computational complexity of root-finding.

Finis


Fermions and the Jordan-Wigner transform V: Diagonalization Continued

In today’s post we continue with the train of thought begun in the last post, learning how to find the energy spectrum of any Hamiltonian quadratic in Fermi operators. Although we don’t dwell in this post on the connection to specific physical models, this class of Hamiltonians covers an enormous number of models of relevance in condensed matter physics. In later posts we’ll apply these results (together with the Jordan-Wigner transform) to understand the energy spectrum of some models of interacting spins in one dimension, which are prototypes for an understanding of quantum magnets and many, many other phenomena, including many systems which undergo quantum phase transitions.

Update: Thanks to Steve Mayer for fixing a bug in the way matrices are displayed.

Note: This post is one in a series describing fermi algebras, and a powerful tool known as the Jordan-Wigner transform, which allows one to move back and forth between describing a system as a collection of qubits, and as a collection of fermions. The posts assume familiarity with elementary quantum mechanics, comfort with elementary linear algebra (but not advanced techniques), and a little familiarity with the basic nomenclature of quantum information science (qubits, the Pauli matrices).

The Hamiltonian [tex]H = \sum_{jk} \alpha_{jk} a_j^\dagger a_k[/tex] we diagonalized in the last post can be generalized to any Hamiltonian which is quadratic in Fermi operators, by which we mean it may contain terms of the form [tex]a^\dagger_j a_k, a_j a_k^\dagger, a_j a_k[/tex] and [tex]a_j^\dagger a_k^\dagger[/tex]. We will not allow linear terms like [tex]a_j+a_j^\dagger[/tex]. Additive constant terms [tex]\gamma I[/tex] are easily incorporated, since they simply displace all elements of the spectrum by an amount [tex]\gamma[/tex]. There are several ways one can write such a Hamiltonian, but the following form turns out to be especially convenient for our purposes:

[tex] H = \sum_{jk} \left( \alpha_{jk} a_j^\dagger a_k -\alpha^*_{jk} a_j a_k^\dagger + \beta_{jk} a_j a_k - \beta^*_{jk} a_j^\dagger a_k^\dagger \right). [/tex]

The reader should spend a little time convincing themselves that for the class of Hamiltonians we have described, it is always possible to write the Hamiltonian in this form, up to an additive constant [tex]\gamma I[/tex], and with [tex]\alpha[/tex] hermitian and [tex]\beta[/tex] antisymmetric.

This class of Hamiltonian appears to have first been diagonalized in an appendix to a famous Annals of Physics paper by Lieb, Schultz and Mattis, dating to 1961 (volume 16, pages 407-466), and the procedure we follow is inspired by theirs. We begin by defining operators [tex]b_1,\ldots,b_n[/tex]:

[tex] b_j \equiv \sum_k \left( \gamma_{jk} a_k + \mu_{jk} a_k^\dagger \right). [/tex]

We will try to choose the complex numbers [tex]\gamma_{jk}[/tex] and [tex]\mu_{jk}[/tex] to ensure that: (1) the operators [tex]b_j[/tex] satisfy Fermionic CCRs; and (2) when expressed in terms of the [tex]b_j[/tex], [tex]H[/tex] has the same form as [tex]H_{\rm free}[/tex], and so can be diagonalized.

A calculation shows that the condition [tex]\{ b_j, b_k^\dagger \} = \delta_{jk} I[/tex] is equivalent to the condition

[tex] \gamma \gamma^\dagger + \mu \mu^\dagger = I, [/tex]

while the condition [tex]\{ b_j, b_k \} = 0[/tex] is equivalent to the condition

[tex] \gamma \mu^T+\mu \gamma^T = 0. [/tex]

These are straightforward enough equations, but their meaning is perhaps a little mysterious. More insight into their structure is obtained by rewriting the connection between the [tex]a_j[/tex]s and the [tex]b_j[/tex]s in an equivalent form using vectors whose individual entries are not numbers, but rather are operators such as [tex]a_j[/tex] and [tex]b_j[/tex], and using a block matrix with blocks [tex]\gamma, \mu, \mu^*[/tex] and [tex]\gamma^*[/tex]:

[tex] \left[ \begin{array}{c} b_1 \\ \vdots \\ b_n \\ b_1^\dagger \\ \vdots \\ b_n^\dagger \end{array} \right] = \left[ \begin{array}{cc} \gamma & \mu \\ \mu^* & \gamma^* \end{array} \right] \left[ \begin{array}{c} a_1 \\ \vdots \\ a_n \\ a_1^\dagger \\ \vdots \\ a_n^\dagger \end{array} \right]. [/tex]

The conditions derived above for the [tex]b_j[/tex]s to satisfy the CCRs are equivalent to the condition that the transformation matrix

[tex] T \equiv \left[ \begin{array}{cc} \gamma & \mu \\ \mu^* & \gamma^* \end{array} \right] [/tex]

is unitary, which is perhaps a somewhat less mysterious condition than the earlier equations involving [tex]\gamma[/tex] and [tex]\mu[/tex]. One advantage of this representation is that it makes it easy to find an expression for the [tex]a_j[/tex] in terms of the [tex]b_j[/tex], simply by inverting this unitary transformation, to obtain:

[tex] \left[ \begin{array}{c} a_1 \\ \vdots \\ a_n \\ a_1^\dagger \\ \vdots \\ a_n^\dagger \end{array} \right] = T^\dagger \left[ \begin{array}{c} b_1 \\ \vdots \\ b_n \\ b_1^\dagger \\ \vdots \\ b_n^\dagger \end{array} \right]. [/tex]
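A small worked example may help here. The two-mode Bogoliubov-style choice [tex]b_1 = \cos\theta\, a_1 + \sin\theta\, a_2^\dagger[/tex], [tex]b_2 = \cos\theta\, a_2 - \sin\theta\, a_1^\dagger[/tex] is my own illustrative example, not from the text. The sketch checks that it satisfies both CCR conditions, that the corresponding [tex]T[/tex] is unitary, and that the resulting operators really do obey the CCRs:

```python
import numpy as np

# Hypothetical two-mode Bogoliubov rotation: gamma = c I, mu = s [[0,1],[-1,0]]
theta = 0.3
c, s = np.cos(theta), np.sin(theta)
gamma = c * np.eye(2)
mu = s * np.array([[0., 1.], [-1., 0.]])

# The two CCR conditions on gamma and mu...
assert np.allclose(gamma @ gamma.conj().T + mu @ mu.conj().T, np.eye(2))
assert np.allclose(gamma @ mu.T + mu @ gamma.T, 0 * mu)

# ...are equivalent to unitarity of the block transformation matrix T
T = np.block([[gamma, mu], [mu.conj(), gamma.conj()]])
assert np.allclose(T @ T.conj().T, np.eye(4))

# Direct check on the operators themselves, using Jordan-Wigner a_1, a_2
Z = np.diag([1., -1.]); I2 = np.eye(2)
sigma = np.array([[0., 1.], [0., 0.]])
a1 = -np.kron(sigma, I2)
a2 = -np.kron(Z, sigma)
b1 = c * a1 + s * a2.conj().T
b2 = c * a2 - s * a1.conj().T
assert np.allclose(b1 @ b1, 0 * b1)
assert np.allclose(b1 @ b2 + b2 @ b1, 0 * b1)
assert np.allclose(b1 @ b1.conj().T + b1.conj().T @ b1, np.eye(4))
assert np.allclose(b1 @ b2.conj().T + b2.conj().T @ b1, 0 * b1)
```

Note that a single-mode mixing [tex]b = \cos\theta\, a + \sin\theta\, a^\dagger[/tex] would fail the second condition ([tex]\{b,b\} \neq 0[/tex]); for Fermions the mixing of annihilation and creation operators has to happen across at least two modes.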

The next step is to rewrite the Hamiltonian in terms of the [tex]b_j[/tex] operators. To do this, observe that:

[tex] H = [ a_1^\dagger \ldots a_n^\dagger a_1 \ldots a_n ] \left[ \begin{array}{cc} \alpha & -\beta^* \\ \beta & -\alpha^* \end{array} \right] \left[ \begin{array}{c} a_1 \\ \vdots \\ a_n \\ a_1^\dagger \\ \vdots \\ a_n^\dagger \end{array} \right]. [/tex]

It is actually this expression for [tex]H[/tex] which motivated the original special form which we chose for [tex]H[/tex]. The expression is convenient, for it allows us to easily transform back and forth between [tex]H[/tex] expressed in terms of the [tex]a_j[/tex] and [tex]H[/tex] in terms of the [tex]b_j[/tex]. We already have an expression in terms of the [tex]b_j[/tex] operators for the column vector containing the [tex]a[/tex] and [tex]a^\dagger[/tex] terms. With a little algebra this gives rise to a corresponding expression for the row vector containing the [tex]a^\dagger[/tex] and [tex]a[/tex] terms:

[tex] [a_1^\dagger \ldots a_n^\dagger a_1 \ldots a_n] = [b_1^\dagger \ldots b_n^\dagger b_1 \ldots b_n] T. [/tex]

This allows us to rewrite the Hamiltonian as

[tex] H = [b^\dagger b] T M T^\dagger \left[ \begin{array}{c} b \\ b^\dagger \end{array} \right], [/tex]

where we have used the shorthand [tex][b^\dagger b][/tex] to denote the vector with entries [tex]b_1^\dagger, \ldots, b_n^\dagger,b_1,\ldots,b_n[/tex], and

[tex] M = \left[ \begin{array}{cc} \alpha & -\beta^* \\ \beta & -\alpha^* \end{array} \right]. [/tex]

Supposing we can choose [tex]T[/tex] such that [tex]TMT^\dagger[/tex] is diagonal, we see that the Hamiltonian can be expressed in the form of [tex]H_{\rm free}[/tex], and the energy spectrum found, following our earlier methods.

Since [tex]\alpha[/tex] is hermitian and [tex]\beta[/tex] antisymmetric it follows that [tex]M[/tex] also is hermitian, and so can be diagonalized for some choice of unitary [tex]T[/tex]. However, the fact that the [tex]b_j[/tex]s must satisfy the CCRs constrains the class of [tex]T[/tex]s available to us. We need to show that such a [tex]T[/tex] can be used to do the diagonalization.
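The claims that [tex]M[/tex] is hermitian and that its spectrum is symmetric about zero are easy to spot-check numerically, with a random hermitian [tex]\alpha[/tex] and antisymmetric [tex]\beta[/tex] (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
# random hermitian alpha and antisymmetric beta
alpha = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
alpha = (alpha + alpha.conj().T) / 2
beta = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
beta = (beta - beta.T) / 2

M = np.block([[alpha, -beta.conj()], [beta, -alpha.conj()]])
assert np.allclose(M, M.conj().T)            # M is hermitian

evals = np.sort(np.linalg.eigvalsh(M))
assert np.allclose(evals, -evals[::-1])      # eigenvalues come in +/- pairs
```
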

We will give a heuristic and somewhat incomplete proof that this is possible, before making some brief remarks about what is required for a rigorous proof. I’ve omitted the rigorous proof, since the way I understand it uses a result from linear algebra that, while beautiful, I don’t want to explain in full detail here.

Suppose [tex]T[/tex] is any unitary such that

[tex] T M T^\dagger = \left[ \begin{array}{cc} d & 0 \\ 0 & -d \end{array} \right], [/tex]

where [tex]d[/tex] is diagonal, and we used the special form of [tex]M[/tex] to deduce that the eigenvalues are real and appear in matched pairs [tex]\pm \lambda[/tex]. We’d like to show that [tex]T[/tex] can be chosen to be of the desired special form. To see that this is plausible, consider the map [tex]X \rightarrow S X^* S^\dagger[/tex], where [tex]S[/tex] is a block matrix:

[tex] S = \left[ \begin{array}{cc} 0 & I \\ I & 0 \end{array} \right]. [/tex]

Applying this map to both sides of the earlier equation we obtain

[tex] ST^* M^* T^T S^\dagger = \left[ \begin{array}{cc} -d & 0 \\ 0 & d \end{array} \right] = -TMT^\dagger. [/tex]

But [tex]M^* = -S^\dagger M S[/tex], and so we obtain:

[tex] -ST^* S^\dagger M S T^T S^\dagger = -TMT^\dagger. [/tex]

It is at least plausible that we can choose [tex]T[/tex] such that [tex]ST^*S^\dagger = T[/tex], which would imply that [tex]T[/tex] has the required form. What this actually shows is, of course, somewhat weaker, namely that [tex]T^\dagger ST^* S^\dagger[/tex] commutes with [tex]M[/tex].

One way of obtaining a rigorous proof is to find a [tex]T[/tex] satisfying

[tex] T M T^\dagger = \left[ \begin{array}{cc} d & 0 \\ 0 & -d \end{array} \right], [/tex]

and then to apply the cosine-sine (or CS) decomposition from linear algebra, which provides a beautiful way of representing block unitary matrices, and which, in this instance, allows us to obtain a [tex]T[/tex] of the desired form with just a little more work. The CS decomposition may be found, for example, as Theorem VII.1.6 on page 196 of Bhatia’s book “Matrix Analysis” (Springer-Verlag, New York, 1997).

Problem: Can we extend these results to allow terms in the Hamiltonian which are linear in the Fermi operators?

In this post we’ve seen how to diagonalize a general Fermi quadratic Hamiltonian. We’ve treated this as a purely mathematical problem, although most physicists will probably have little trouble believing that these techniques are useful in a wide range of physical problems. In the next post we’ll explain a surprising connection between these ideas and one-dimensional spin systems: a tool known as the Jordan-Wigner transform can be used to establish an equivalence between a large class of one-dimensional spin systems and the type of Fermi systems we have been considering. This is interesting because included in this class of one-dimensional spin systems are important models such as the transverse Ising model, which serve as a general prototype for quantum magnetism, are a good basis for understanding some naturally occurring physical systems, and which also serve as prototypes for the understanding of quantum phase transitions.


Fermions and the Jordan-Wigner transform IV: Diagonalizing Fermi Quadratic Hamiltonians

I’ve finally had a chance to get back to Fermions. Today’s post explains how to diagonalize a Hamiltonian which is quadratic in operators satisfying the Fermionic CCRs. Remarkably, we’ll do this using only the CCRs: the operators could arise in many different ways physically, but, as we shall see, it is only the CCRs that matter for determining the spectrum! This class of Hamiltonians arises in a lot of realistic physical systems, and we’ll see an explicit example later on, when we show that a particular spin model (the X-Y model) is equivalent to a Fermi quadratic Hamiltonian.

(Unfortunately, there seems to be a bug in WordPress that required me to strip most of the tags denoting emphasis (e.g. bold or italics) out of this post. Weird.)

Note: This post is one in a series describing fermi algebras, and a powerful tool known as the Jordan-Wigner transform, which allows one to move back and forth between describing a system as a collection of qubits, and as a collection of fermions. The posts assume familiarity with elementary quantum mechanics, comfort with elementary linear algebra (but not advanced techniques), and a little familiarity with the basic nomenclature of quantum information science (qubits, the Pauli matrices).

Diagonalizing a Fermi quadratic Hamiltonian

Suppose [tex]a_1,\ldots,a_n[/tex] satisfy the Fermionic CCRs, and we have a system with Hamiltonian

[tex] H_{\rm free} = \sum_j \alpha_j a_j^\dagger a_j, [/tex]

where [tex]\alpha_j \geq 0[/tex] for each value of [tex]j[/tex]. In physical terms, this is the Hamiltonian used to describe a system of free, i.e., non-interacting, Fermions.

Such Hamiltonians are used, for example, in the simplest possible quantum mechanical model of a metal, the Drude-Sommerfeld model, which treats the conduction electrons as free Fermions. Such a model may appear pretty simplistic (especially after we solve it, below), but actually there’s an amazing amount of physics one can get out of such simple models. I won’t dwell on these physical consequences here, but if you’re unfamiliar with the Drude-Sommerfeld theory, you could profitably spend a couple of hours looking at the first couple of chapters in a good book on condensed matter physics, like Ashcroft and Mermin’s “Solid State Physics”, which explains the Drude-Sommerfeld model and its consequences in detail. (Why such a simplistic model does such a great job of describing metals is another long story, which I may come back to in a future post.)

Returning to the abstract Hamiltonian [tex]H_{\rm free}[/tex], the positivity of the operators [tex]a_j^\dagger a_j[/tex] implies that [tex]\langle \psi |H_{\rm free} |\psi\rangle \geq 0[/tex] for any state [tex]|\psi\rangle[/tex], and thus the ground state energy of [tex]H_{\rm free}[/tex] is non-negative. However, our earlier construction also shows that we can find at least one state [tex]|\mbox{vac}\rangle[/tex] such that [tex]a_j^\dagger a_j|\mbox{vac}\rangle = 0[/tex] for all [tex]j[/tex], and thus [tex]H_{\rm free}|\mbox{vac}\rangle = 0[/tex]. It follows that the ground state energy of [tex]H_{\rm free}[/tex] is exactly [tex]0[/tex].

This result is easily generalized to the case where the [tex]\alpha_j[/tex] have any sign, with the result that the ground state energy is [tex]\sum_j \min(0,\alpha_j)[/tex], and the ground state [tex]|\psi\rangle[/tex] is obtained from [tex]|\mbox{vac}\rangle[/tex] by applying the raising operator [tex]a_j^\dagger[/tex] for all [tex]j[/tex] with [tex]\alpha_j < 0[/tex]. More generally, the allowed energies of the excited states of this system correspond to sums over subsets of the [tex]\alpha_j[/tex].

Exercise: Express the excited states of the system in terms of [tex]|\mbox{vac}\rangle[/tex].

Just by the way, readers with an interest in computational complexity theory may find it interesting to note a connection between the spectrum of [tex]H_{\rm free}[/tex] and the Subset-Sum problem from computer science. The Subset-Sum problem is this: given a set of integers [tex]x_1,\ldots,x_n[/tex], with repetition allowed, is there a subset of those integers which adds up to a desired target, [tex]t[/tex]? Obviously, the problem of determining whether [tex]H_{\rm free}[/tex] has a particular energy is equivalent to the Subset-Sum problem, at least in the case where the [tex]\alpha_j[/tex] are integers. What is interesting is that the Subset-Sum problem is known to be NP-Complete, in the language of computational complexity theory, and thus is regarded as computationally intractable. As a consequence, we deduce that the problem of determining whether a particular value for energy is in the spectrum of [tex]H_{\rm free}[/tex] is in general NP-Hard, i.e., at least as difficult as the NP-Complete problems. Similar results hold for the more general Fermi Hamiltonians considered below. Furthermore, this observation suggests the possibility of an interesting link between the physical problem of estimating the density of states, and classes of problems in computational complexity theory, such as the counting classes (e.g., #P), and also to approximation problems.
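The subset-sum structure of the spectrum is easy to see in a small numerical example (the mode energies below are illustrative):

```python
import numpy as np
from functools import reduce
from itertools import combinations

n = 3
alphas = [1.0, 2.5, 4.0]            # mode energies (illustrative values)
Z = np.diag([1., -1.]); I2 = np.eye(2)
sigma = np.array([[0., 1.], [0., 0.]])

def a_op(j):                        # Jordan-Wigner a_j
    return -reduce(np.kron, [Z] * j + [sigma] + [I2] * (n - j - 1))

# H_free = sum_j alpha_j a_j^dag a_j
H = sum(alphas[j] * a_op(j).conj().T @ a_op(j) for j in range(n))

# the spectrum should be exactly the 2^n subset sums of the alpha_j
subset_sums = sorted(sum(c) for r in range(n + 1)
                     for c in combinations(alphas, r))
assert np.allclose(sorted(np.linalg.eigvalsh(H)), subset_sums)
```
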
Let’s generalize our results about the spectrum of [tex]H_{\rm free}[/tex]. Suppose now that we have the Hamiltonian

[tex] H = \sum_{jk} \alpha_{jk} a_j^\dagger a_k. [/tex]

Taking the adjoint of this equation we see that in order for [tex]H[/tex] to be hermitian, we must have [tex]\alpha_{jk}^* = \alpha_{kj}[/tex], i.e., the matrix [tex]\alpha[/tex] whose entries are the [tex]\alpha_{jk}[/tex] is itself hermitian.

Suppose we introduce new operators [tex]b_1,\ldots,b_n[/tex] defined by

[tex] b_j \equiv \sum_{k=1}^n \beta_{jk} a_k, [/tex]

where the [tex]\beta_{jk}[/tex] are complex numbers. We are going to try to choose the [tex]\beta_{jk}[/tex] so that (1) the operators [tex]b_j[/tex] satisfy the Fermionic CCRs, and (2) when expressed in terms of the [tex]b_j[/tex], the Hamiltonian [tex]H[/tex] takes on the same form as [tex]H_{\rm free}[/tex], and thus can be diagonalized.

We begin by looking for conditions on the complex numbers [tex]\beta_{jk}[/tex] such that the [tex]b_j[/tex] operators satisfy Fermionic CCRs. Computing anticommutators we find

[tex] \{ b_j, b_k^\dagger \} = \sum_{lm} \beta_{jl} \beta_{km}^* \{ a_l,a_m^\dagger \}. [/tex]

Substituting the CCR [tex]\{ a_l,a_m^\dagger \} = \delta_{lm} I[/tex] and writing [tex]\beta_{km}^* = \beta_{mk}^\dagger[/tex] gives

[tex] \{ b_j, b_k^\dagger \} = \sum_{lm} \beta_{jl} \delta_{lm} \beta_{mk}^\dagger I = (\beta \beta^\dagger)_{jk} I, [/tex]

where [tex]\beta \beta^\dagger[/tex] denotes the matrix product of the matrix [tex]\beta[/tex] with entries [tex]\beta_{jl}[/tex] and its adjoint [tex]\beta^\dagger[/tex]. To compute [tex]\{b_j,b_k\}[/tex] we use the linearity of the anticommutator bracket in each term to express [tex]\{b_j,b_k\}[/tex] as a sum over terms of the form [tex]\{ a_l,a_m \}[/tex], each of which is [tex]0[/tex], by the CCRs. As a result, we have:

[tex] \{ b_j,b_k \} = 0. [/tex]

It follows that provided [tex]\beta \beta^\dagger = I[/tex], i.e., provided [tex]\beta[/tex] is unitary, the operators [tex]b_j[/tex] satisfy the Fermionic CCRs. Let’s assume that [tex]\beta[/tex] is unitary, and change our notation, writing [tex]u_{jk} \equiv \beta_{jk}[/tex] in order to emphasize the unitarity of this matrix. We now have

[tex] b_j = \sum_k u_{jk} a_k. [/tex]

Using the unitarity of [tex]u[/tex] we can invert this equation to obtain

[tex] a_j = \sum_k u^\dagger_{jk} b_k. [/tex]

Substituting this expression and its adjoint into [tex]H[/tex] and doing some simplification gives us

[tex] H = \sum_{lm} (u \alpha u^\dagger)_{lm} b_l^\dagger b_m. [/tex]

Since [tex]\alpha[/tex] is hermitian, we can choose [tex]u[/tex] so that [tex]u \alpha u^\dagger[/tex] is diagonal, with entries [tex]\lambda_j[/tex], the eigenvalues of [tex]\alpha[/tex], giving us

[tex] H = \sum_j \lambda_j b_j^\dagger b_j. [/tex]

This is of the same form as [tex]H_{\rm free}[/tex], and thus the ground state energy and excitation energies may be computed in the same way as we described earlier.

What about the ground state of [tex]H[/tex]? Assuming that all the [tex]\lambda_j[/tex] are non-negative, it turns out that a state [tex]|\psi\rangle[/tex] satisfies [tex]a_j^\dagger a_j |\psi\rangle = 0[/tex] for all [tex]j[/tex] if and only if [tex]b_j^\dagger b_j|\psi\rangle = 0[/tex] for all [tex]j[/tex], and so the ground state for the two sets of Fermi operators is the same. This follows from a more general observation, namely, that [tex]a_j^\dagger a_j |\psi\rangle = 0[/tex] if and only if [tex]a_j|\psi\rangle = 0[/tex]. In one direction, this is trivial: just multiply [tex]a_j|\psi\rangle = 0[/tex] on the left by [tex]a_j^\dagger[/tex]. In the other direction, we multiply [tex]a_j^\dagger a_j |\psi\rangle = 0[/tex] on the left by [tex]a_j[/tex] to obtain [tex]a_j a_j^\dagger a_j |\psi\rangle = 0[/tex]. Substituting the CCR [tex]a_j a_j^\dagger = -a_j^\dagger a_j + I[/tex], we obtain

[tex] (-a_j^\dagger a_j^2+a_j)|\psi\rangle = 0. [/tex]

But [tex]a_j^2 = 0[/tex], so this simplifies to [tex]a_j|\psi\rangle = 0[/tex], as desired.

Returning to the question of determining the ground state, supposing [tex]a_j^\dagger a_j|\psi\rangle = 0[/tex] for all [tex]j[/tex], we immediately have [tex]a_j|\psi\rangle = 0[/tex] for all [tex]j[/tex], and thus [tex]b_j|\psi\rangle = 0[/tex] for all [tex]j[/tex], since the [tex]b_j[/tex] are linear functions of the [tex]a_j[/tex], and thus [tex]b_j^\dagger b_j|\psi\rangle = 0[/tex] for all [tex]j[/tex]. This shows that the ground state for the two sets of Fermi operators, [tex]a_j[/tex] and [tex]b_j[/tex], is in fact the same. The excitations for [tex]H[/tex] may be obtained by applying raising operators [tex]b_j^\dagger[/tex] to the ground state.

Exercise: Suppose some of the [tex]\lambda_j[/tex] are negative. Express the ground state of [tex]H[/tex] in terms of the simultaneous eigenstates of the [tex]a_j^\dagger a_j[/tex].

Okay, that’s enough for one day! We’ve learnt how to diagonalize a fairly general class of Hamiltonians quadratic in Fermi operators. In the next post we’ll go further, learning how to cope with additional terms like [tex]a_j a_k[/tex] and [tex]a_j^\dagger a_k^\dagger[/tex].
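The whole chain of reasoning can be spot-checked numerically: build [tex]H[/tex] from a random hermitian [tex]\alpha[/tex] using the Jordan-Wigner operators, and compare its spectrum with the subset sums of the eigenvalues of [tex]\alpha[/tex]. A sketch (illustrative, with my own helper names):

```python
import numpy as np
from functools import reduce
from itertools import combinations

rng = np.random.default_rng(7)
n = 3
Z = np.diag([1., -1.]); I2 = np.eye(2)
sigma = np.array([[0., 1.], [0., 0.]])

def a_op(j):                              # Jordan-Wigner a_j
    return -reduce(np.kron, [Z] * j + [sigma] + [I2] * (n - j - 1))

a = [a_op(j) for j in range(n)]

alpha = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
alpha = (alpha + alpha.conj().T) / 2      # random hermitian coefficient matrix

# H = sum_{jk} alpha_{jk} a_j^dag a_k, a 2^n x 2^n matrix
H = sum(alpha[j, k] * a[j].conj().T @ a[k]
        for j in range(n) for k in range(n))

# spectrum of H should be the subset sums of the n eigenvalues of alpha
lam = np.linalg.eigvalsh(alpha)
subset_sums = sorted(sum(c) for r in range(n + 1)
                     for c in combinations(lam, r))
assert np.allclose(sorted(np.linalg.eigvalsh(H)), subset_sums)
```

This is the content of the diagonalization in miniature: an [tex]n \times n[/tex] eigenvalue problem determines the full [tex]2^n[/tex]-dimensional spectrum.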


Expander graphs: the complete notes

The full pdf text of my series of posts about expander graphs. Thank you very much to all the people who commented on the posts; if you’re reading this text, and haven’t seen the comments on earlier posts, I recommend you look through them to see all the alternate proofs, generalizations and so on that people have offered.

Expander graphs VI: reducing randomness

Back from Boston! This is the final installment in my series about expanders. I’ll post a pdf containing the whole text in the next day or two. Thanks to everyone who’s contributed in the comments!

Today’s post explains how expander graphs can be used to reduce the number of random bits needed by a randomized algorithm in order to achieve a desired success probability. This post is the culmination of the series: we make use of the fact, proved in the last post, that random walks on an expander are exponentially unlikely to remain localized in any sufficiently large subset of vertices, a fact that relies in turn on the connection, developed in earlier posts, between the eigenvalue gap and the expansion parameter.

Note: This post is one in a series introducing one of the deepest ideas in modern computer science, expander graphs. Expanders are one of those powerful ideas that crop up in many apparently unrelated contexts, and that have a phenomenally wide range of uses. The goal of the posts is to explain what an expander is, and to learn just enough about them that we can start to understand some of their amazing uses. The posts require a little background in graph theory, computer science, linear algebra and Markov chains (all at about the level of a first course) to be comprehensible. I am not an expert on expanders, and the posts are just an introduction. They are mostly based on some very nice 2003 lecture notes by Nati Linial and Avi Wigderson, available on the web at http://www.math.ias.edu/~boaz/ExpanderCourse/.

Reducing the number of random bits required by an algorithm

One surprising application of expanders is that they can be used to reduce the number of random bits needed by a randomized algorithm in order to achieve a desired success probability.

Suppose, for example, that we are trying to compute a function [tex]f(x)[/tex] that can take the values [tex]f(x) = 0[/tex] or [tex]f(x) = 1[/tex]. Suppose we have a randomized algorithm [tex]A(x,Y)[/tex] which takes as input [tex]x[/tex] and an [tex]m[/tex]-bit uniformly distributed random variable [tex]Y[/tex], and outputs either [tex]0[/tex] or [tex]1[/tex]. We assume that:

  • [tex]f(x) = 0[/tex] implies [tex]A(x,Y) = 0[/tex] with certainty.
  • [tex]f(x) = 1[/tex] implies [tex]A(x,Y) = 1[/tex] with probability at least [tex]1-p_f[/tex].

That is, [tex]p_f[/tex] is the maximum probability that the algorithm fails, i.e., that [tex]f(x) = 1[/tex] but the algorithm outputs [tex]A(x,Y) = 0[/tex].

An algorithm of this type is called a one-sided randomized algorithm, since it can only fail when [tex]f(x) = 1[/tex], not when [tex]f(x) = 0[/tex]. I won’t give any concrete examples of one-sided randomized algorithms here, but the reader unfamiliar with them should rest assured that they are useful and important – see, e.g., the book of Motwani and Raghavan (Cambridge University Press, 1995) for examples.

As an aside, the discussion of one-sided algorithms in this post can be extended to the case of randomized algorithms which can fail when either [tex]f(x) = 0[/tex] or [tex]f(x) = 1[/tex]. The details are a little more complicated, but the basic ideas are the same. This is described in Linial and Wigderson’s lecture notes. Alternately, extending the discussion to this case is a good problem.

How can we decrease the probability of failure for a one-sided randomized algorithm? One obvious way of decreasing the failure probability is to run the algorithm [tex]k[/tex] times, computing [tex]A(x,Y_0),A(x,Y_1),\ldots,A(x,Y_{k-1})[/tex]. If we get [tex]A(x,Y_j) = 0[/tex] for all [tex]j[/tex] then we output [tex]0[/tex], while if [tex]A(x,Y_j) = 1[/tex] for at least one value of [tex]j[/tex], then we output [tex]1[/tex]. This algorithm makes use of [tex]km[/tex] random bits, and reduces the failure probability to at most [tex]p_f^k[/tex].
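As a concrete illustration, here is a minimal Python sketch of this naive amplification. The example is a toy of my own devising, not a standard algorithm: the "function" being computed is whether a bit string contains a 1, and each run of the one-sided test simply checks a single random position.

```python
import random

def one_sided_test(x, rng):
    # Toy one-sided algorithm: guess whether the bit string x contains a 1
    # by checking a single random position.  If f(x) = 0 (all zeros) it
    # always answers 0; if f(x) = 1 it answers 1 with probability
    # (number of ones)/len(x), so p_f = 1 - (number of ones)/len(x).
    return x[rng.randrange(len(x))]

def amplified(x, k, seed=0):
    # Run the test k times with fresh randomness, outputting 1 if any run
    # returns 1.  The failure probability drops from p_f to p_f**k, at a
    # cost of k independent random samples.
    rng = random.Random(seed)
    return int(any(one_sided_test(x, rng) for _ in range(k)))

# A string with a single 1 among 100 bits: a single run fails with
# probability p_f = 0.99, but 1000 independent runs fail with
# probability 0.99**1000, roughly 4e-5.
x = [0] * 99 + [1]
print(amplified(x, 1000))
```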

Expanders can be used to substantially decrease the number of random bits required to achieve such a reduction in the failure probability. We define a new algorithm [tex]\tilde A[/tex] as follows. It requires a [tex]d[/tex]-regular expander graph [tex]G[/tex] whose vertex set [tex]V[/tex] contains [tex]2^m[/tex] vertices, each of which can represent a possible [tex]m[/tex]-bit input [tex]y[/tex] to [tex]A(x,y)[/tex]. The modified algorithm [tex]\tilde A[/tex] works as follows:

  • Input [tex]x[/tex].
  • Sample uniformly at random from [tex]V[/tex] to generate [tex]Y_0[/tex].
  • Now do a [tex]k-1[/tex] step random walk on the expander, generating random variables [tex]Y_1,\ldots, Y_{k-1}[/tex].
  • Compute [tex]A(x,Y_0),\ldots,A(x,Y_{k-1})[/tex]. If any of these are [tex]1[/tex], output [tex]1[/tex], otherwise output [tex]0[/tex].

We see that the basic idea of the algorithm is similar to the earlier proposal for running [tex]A(x,Y)[/tex] repeatedly, but the sequence of independent and uniformly distributed samples [tex]Y_0,\ldots,Y_{k-1}[/tex] is replaced by a random walk on the expander. The advantage of doing this is that only [tex]m+(k-1) \log(d)[/tex] random bits are required – [tex]m[/tex] to sample from the initial uniform distribution, and then [tex]\log(d)[/tex] for each of the [tex]k-1[/tex] steps in the random walk. When [tex]d[/tex] is a small constant this is far fewer than the [tex]km[/tex] bits used when we simply repeatedly run the algorithm [tex]A(x,Y_j)[/tex] with uniform and independently generated random bits [tex]Y_j[/tex].
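To make the walk-based scheme concrete, here is a Python sketch of the mechanics and the random-bit accounting. Everything here is an illustrative stand-in of my own: in particular I use the 3-regular cube graph [tex]Q_3[/tex] on 8 vertices in place of a genuine expander family, since it is small enough to write down in one line.

```python
import math
import random

def expander_walk_samples(neighbors, k, rng):
    # Y_0: a uniform random vertex, costing log2(|V|) = m random bits.
    v = rng.randrange(len(neighbors))
    samples = [v]
    # Each of the k-1 steps picks a uniform random neighbour, costing
    # only log2(d) random bits, versus m bits for a fresh uniform sample.
    for _ in range(k - 1):
        v = neighbors[v][rng.randrange(len(neighbors[v]))]
        samples.append(v)
    return samples

def walk_amplified(A, x, neighbors, k, seed=0):
    # Output 1 if A(x, y) = 1 for any vertex y visited by the walk.
    rng = random.Random(seed)
    return int(any(A(x, y) for y in expander_walk_samples(neighbors, k, rng)))

# Toy 3-regular graph: the cube graph Q_3 on 8 vertices (a stand-in for
# a genuine expander; vertex v is adjacent to v with one bit flipped).
cube = {v: [v ^ 1, v ^ 2, v ^ 4] for v in range(8)}

# Random-bit accounting for m = 3, d = 3, k = 10:
m, d, k = 3, 3, 10
print(k * m)                       # independent runs: 30 bits
print(m + (k - 1) * math.log2(d))  # walk: roughly 17.3 bits
```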

With what probability does this algorithm fail? Define [tex]B_x[/tex] to be the set of values of [tex]y[/tex] such that [tex]A(x,y) = 0[/tex], yet [tex]f(x) = 1[/tex]. This is the “bad” set, which we hope our algorithm will avoid. The algorithm will fail only if the steps in the random walk [tex]Y_0,Y_1,\ldots,Y_{k-1}[/tex] all fall within [tex]B_x[/tex]. From our earlier theorem we see that this occurs with probability at most:

[tex] \left( \frac{|B_x|}{2^m} + \frac{\lambda_2(G)}{d} \right)^{k-1}. [/tex]

But we know that [tex]|B_x|/2^m \leq p_f[/tex], and so the failure probability is at most

[tex] \left( p_f + \frac{\lambda_2(G)}{d} \right)^{k-1}. [/tex]

Thus, provided [tex]p_f+\lambda_2(G)/d < 1[/tex], we again get an exponential decrease in the failure probability as the number of repetitions [tex]k[/tex] is increased.

Conclusion

These notes have given a pretty basic introduction to expanders, and there's much we haven't covered. More detail and more applications can be found in the online notes of Linial and Wigderson, or in the research literature. Still, I hope that these notes have given some idea of why these families of graphs are useful, and of some of the powerful connections between graph theory, linear algebra, and random walks.

Published
Categorized as General

Adios

I’m off to Boston for a few days, so blogging will be on hold until the middle of next week.

Published
Categorized as General

New paper: Operator quantum error correction

A new paper (postscript), joint with David Poulin, all about operator quantum error correction, a clever way (alas, not my idea!) of protecting quantum states against the effects of noise. The paper provides several sets of easily checkable conditions (both algebraic and information-theoretic) characterizing when operator error-correction is feasible.

Postscript: Dave Bacon has just put out a nice paper related to operator quantum error-correction, constructing some physically motivated examples of self-correcting quantum memories. This idea – that a quantum system can correct itself – is absolutely fascinating, and runs completely counter to the standard folk “wisdom” about quantum mechanics, namely that quantum states are delicate objects that are easily (and irreversibly) destroyed.

Published
Categorized as General

Appendix to my posts on Fermions and the Jordan-Wigner transform

This is a little appendix to my post about the consequences of the fermionic CCRs. The results described in the appendix are very well-known – they are taught in any undergrad quantum course – but I’m rather fond of the little proof given, and so am indulging myself by including it here. The results are used in the previous post.

Note: This post is one in a series describing fermi algebras, and a powerful tool known as the Jordan-Wigner transform, which allows one to move back and forth between describing a system as a collection of qubits, and as a collection of fermions. The posts assume familiarity with elementary quantum mechanics, comfort with elementary linear algebra (but not advanced techniques), and a little familiarity with the basic nomenclature of quantum information science (qubits, the Pauli matrices).

Appendix on mutually commuting observables

Any undergraduate quantum mechanics course covers the fact that a mutually commuting set of Hermitian operators possesses a common eigenbasis. Unfortunately, in my experience this fact is usually proved rather early on, and suffers from being presented in a slightly too elementary fashion, with inductive constructions of explicit basis sets and so on. The following proof is still elementary, but from a slightly more sophisticated perspective. It is, I like to imagine, rather more like what would be given in an advanced course in linear algebra, were linear algebraists to actually cover this kind of material. (They don’t, so far as I know, having other fish to fry.)

Suppose [tex]H_1,\ldots,H_n[/tex] are commuting Hermitian (indeed, normal suffices) operators with spectral decompositions:

[tex] H_j = \sum_{k} E_{jk} P_{jk}, [/tex]

where [tex]E_{jk}[/tex] are the eigenvalues of [tex]H_j[/tex], and [tex]P_{jk}[/tex] are the corresponding projectors. Since the [tex]H_j[/tex] commute, it is not difficult to verify that for any quadruple [tex]j,k,l,m[/tex] the operators [tex]P_{jk}[/tex] and [tex]P_{lm}[/tex] also commute. For a vector [tex]\vec k = (k_1,\ldots,k_n)[/tex] define the operator

[tex] P_{\vec k} \equiv P_{1 k_1} P_{2 k_2} \ldots P_{n k_n}. [/tex]

Note that the order of the operators on the right-hand side does not matter, since they all commute with one another. The following equations all follow easily by direct computation, the mutual commutativity of the [tex]P_{jk}[/tex] operators, and standard properties of the spectral decomposition:

[tex] P_{\vec k}^\dagger = P_{\vec k}; \,\,\,\, \sum_{\vec k} P_{\vec k} = I; \,\,\,\, P_{\vec k} P_{\vec l} = \delta_{\vec k \vec l} P_{\vec k}. [/tex]

Thus, the operators [tex]P_{\vec k}[/tex] form a complete set of orthogonal projectors. Furthermore, suppose we have [tex]P_{\vec k} |\psi\rangle = |\psi\rangle[/tex]. Then we will show that for any [tex]j[/tex] we have [tex]P_{jk_j} |\psi\rangle = |\psi\rangle[/tex], so [tex]|\psi\rangle[/tex] is an eigenstate of [tex]H_j[/tex] with eigenvalue [tex]E_{jk_j}[/tex]. This shows that the operators [tex]P_{\vec k}[/tex] project onto a complete orthonormal set of simultaneous eigenspaces for the [tex]H_j[/tex], and will complete the proof.

Our goal is to show that if [tex]P_{\vec k} |\psi\rangle = |\psi\rangle[/tex] then for any [tex]j[/tex] we have [tex]P_{jk_j} |\psi\rangle = |\psi\rangle[/tex]. To see this, simply multiply both sides of [tex]P_{\vec k} |\psi\rangle = |\psi\rangle[/tex] by [tex]P_{jk_j}[/tex], and observe that [tex]P_{jk_j} P_{\vec k} = P_{\vec k}[/tex]. This gives [tex]P_{\vec k}|\psi\rangle = P_{jk_j}|\psi\rangle[/tex]. But [tex]P_{\vec k}|\psi\rangle = |\psi\rangle[/tex], so we obtain [tex]|\psi\rangle = P_{j k_j}|\psi\rangle[/tex], which completes the proof.
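The construction is easy to check numerically. Here is a small NumPy sketch (my own illustration, assuming NumPy is available) that builds the projectors [tex]P_{\vec k}[/tex] for two commuting operators – the Pauli operators [tex]Z \otimes I[/tex] and [tex]I \otimes Z[/tex] – and verifies the three identities above:

```python
import numpy as np

def spectral_projectors(H):
    # Spectral decomposition H = sum_k E_k P_k of a Hermitian matrix:
    # returns a dict mapping each (rounded) eigenvalue E_k to its
    # eigenprojector P_k, summing outer products over degenerate spaces.
    w, V = np.linalg.eigh(H)
    projs = {}
    for val, vec in zip(w, V.T):
        key = round(float(val), 8)
        projs[key] = projs.get(key, 0) + np.outer(vec, vec.conj())
    return projs

# Two commuting Hermitian operators: Z (x) I and I (x) Z.
Z, I = np.diag([1.0, -1.0]), np.eye(2)
H1, H2 = np.kron(Z, I), np.kron(I, Z)
P1, P2 = spectral_projectors(H1), spectral_projectors(H2)

# Joint projectors P_k = P_{1 k_1} P_{2 k_2}, one per vector of eigenvalues.
joint = {(k1, k2): P1[k1] @ P2[k2] for k1 in P1 for k2 in P2}

# Verify: each P_k is Hermitian, they sum to the identity, and they are
# mutually orthogonal.
assert np.allclose(sum(joint.values()), np.eye(4))
for P in joint.values():
    assert np.allclose(P, P.conj().T)
for ka in joint:
    for kb in joint:
        expected = joint[ka] if ka == kb else np.zeros((4, 4))
        assert np.allclose(joint[ka] @ joint[kb], expected)
```

Each of the four joint projectors here is rank one, since the pair of eigenvalues [tex](\pm 1, \pm 1)[/tex] singles out a unique simultaneous eigenstate.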

Published
Categorized as General