Fermions and the Jordan-Wigner transform III: Consequences of the Canonical Commutation Relations

Back to Fermions again! In today’s post we’ll show that the CCRs uniquely determine the action of the fermionic operators [tex]a_j[/tex], up to a choice of basis. Mathematically, the argument is somewhat detailed, but it’s also the kind of argument that rewards detailed study, especially if studied in conjunction with related topics, such as the representation theory of [tex]su(2)[/tex]. You’ll need to look elsewhere for that, however!

Note: This post is one in a series describing fermi algebras, and a powerful tool known as the Jordan-Wigner transform, which allows one to move back and forth between describing a system as a collection of qubits, and as a collection of fermions. The posts assume familiarity with elementary quantum mechanics, comfort with elementary linear algebra (but not advanced techniques), and a little familiarity with the basic nomenclature of quantum information science (qubits, the Pauli matrices).

Consequences of the fermionic CCRs

We will assume that the vector space [tex]V[/tex] is finite dimensional, and that there are [tex]n[/tex] operators [tex]a_1,\ldots,a_n[/tex] acting on [tex]V[/tex] and satisfying the Fermionic CCRs. At the end of this paragraph we’re going to give a broad outline of the steps we go through. Upon a first read, some of these steps may appear a little mysterious to the reader not familiar with representation theory. In particular, please don’t worry if you get a little stuck in your understanding of the outline at some points, as the exposition is very much at the bird’s-eye level, and not all detail is visible at that level. Nonetheless, the reason for including this broad outline is the belief that repeated study will pay substantial dividends, if it is read in conjunction with the more detailed exposition to follow, or similar material on, e.g., representations of the Lie algebra [tex]su(2)[/tex]. Indeed, the advantage of operating at the bird’s-eye level is that it makes it easier to see the connections between these ideas, and the use of similar ideas in other branches of representation theory.

  • We’ll start by showing that the operators [tex]a_j^\dagger a_j[/tex] are positive Hermitian operators with eigenvalues [tex]0[/tex] and [tex]1[/tex].
  • We’ll show that [tex]a_j[/tex] acts as a lowering operator for [tex]a_j^\dagger a_j[/tex], in the sense that if [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]1[/tex], then [tex]a_j |\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]0[/tex]. If [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]0[/tex], then [tex]a_j |\psi\rangle[/tex] vanishes.
  • Similarly, [tex]a_j^\dagger[/tex] acts as a raising operator for [tex]a_j^\dagger a_j[/tex], in the sense that if [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]0[/tex], then [tex]a_j^\dagger |\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]1[/tex]. If [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]1[/tex], then [tex]a_j^\dagger |\psi\rangle[/tex] vanishes.
  • We prove that the operators [tex]a_j^\dagger a_j[/tex] form a mutually commuting set of Hermitian matrices, and thus there exists a state [tex]|\psi\rangle[/tex] which is a simultaneous eigenstate of [tex]a_j^\dagger a_j[/tex] for all values [tex]j=1,\ldots,n[/tex].
  • By raising and lowering the state [tex]|\psi\rangle[/tex] in all possible combinations, we’ll construct a set of [tex]2^n[/tex] orthonormal states which are simultaneous eigenstates of the [tex]a_j^\dagger a_j[/tex]. The corresponding vector of eigenvalues uniquely labels each state in this orthonormal basis.
  • Suppose the vector space spanned by these [tex]2^n[/tex] simultaneous eigenstates is [tex]W[/tex]. At this point, we know that [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] map [tex]W[/tex] into [tex]W[/tex], and, indeed, we know everything about the action [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] have on [tex]W[/tex].
  • Suppose we define [tex]W_\perp[/tex] to be the orthocomplement of [tex]W[/tex] in [tex]V[/tex]. Then we’ll show that the [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] map [tex]W_\perp[/tex] into itself, and their restrictions to [tex]W_\perp[/tex] satisfy the Fermionic CCRs. We can then repeat the above procedure, and identify a [tex]2^n[/tex]-dimensional subspace of [tex]W_\perp[/tex] on which we know the action of the [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] exactly.
  • We iterate this procedure until [tex]W_\perp[/tex] is the trivial vector space, at which point it is no longer possible to continue. At this point we have established an orthonormal basis for the whole of [tex]V[/tex] with respect to which we can explicitly write down the action of both [tex]a_j[/tex] and [tex]a_j^\dagger[/tex].

Let’s go through each of these steps in more detail.

The [tex]a_j^\dagger a_j[/tex] are positive Hermitian with eigenvalues [tex]0[/tex] and [tex]1[/tex]: Observe that the [tex]a_j^\dagger a_j[/tex] are positive (and thus Hermitian) matrices. We will show that [tex](a_j^\dagger a_j)^2 = a_j^\dagger a_j[/tex], and thus the eigenvalues of [tex]a_j^\dagger a_j[/tex] are all [tex]0[/tex] or [tex]1[/tex].

To see this, observe that [tex](a_j^\dagger a_j)^2 = a_j^\dagger a_j a_j^\dagger a_j = -(a_j^\dagger)^2 a_j^2 + a_j^\dagger a_j[/tex], where we used the CCR [tex]\{a_j,a_j^\dagger \} = I[/tex]. Note also that [tex]a_j^2 = 0[/tex] by the CCR [tex]\{a_j,a_j\} = 0[/tex]. It follows that [tex](a_j^\dagger a_j)^2 = a_j^\dagger a_j[/tex], as claimed.
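As a quick numerical sanity check of this step, here is a minimal sketch in Python with NumPy. The specific matrix [tex]a = |0\rangle\langle 1|[/tex] and the random change of basis are assumptions chosen for illustration; the argument above uses only the CCRs, not any particular matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed concrete example: a = |0><1| on C^2, hidden behind a random unitary U.
a0 = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
U, _ = np.linalg.qr(rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2)))
a = U @ a0 @ U.conj().T
adag = a.conj().T
N = adag @ a

# The two CCRs used in the argument above.
ccr_ok = np.allclose(a @ adag + adag @ a, np.eye(2)) and np.allclose(a @ a, 0)

# Consequences: N is a projector, so its eigenvalues can only be 0 or 1.
idempotent = np.allclose(N @ N, N)
eigenvalues = np.linalg.eigvalsh(N)  # returned in ascending order

print(ccr_ok, idempotent, np.round(eigenvalues, 12))
```

The random unitary is there only to make the point that nothing depends on the basis in which [tex]a[/tex] happens to be written.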

The [tex]a_j[/tex] are lowering operators: Suppose [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]1[/tex]. Then we claim that [tex]a_j|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]0[/tex]. To see that [tex]a_j|\psi\rangle[/tex] is normalized, note that [tex]\langle \psi|a_j^\dagger a_j |\psi\rangle = \langle \psi|\psi\rangle = 1[/tex], where we used the fact that [tex]|\psi\rangle[/tex] is an eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]1[/tex] to establish the first equality. To see that it has eigenvalue [tex]0[/tex], note that [tex]a_j^\dagger a_j a_j|\psi\rangle = 0[/tex], since [tex]\{ a_j,a_j \} = 0[/tex].

Exercise: Suppose [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]0[/tex]. Show that [tex]a_j |\psi\rangle = 0[/tex].

The [tex]a_j^\dagger[/tex] are raising operators: Suppose [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]0[/tex]. Then we claim that [tex]a_j^\dagger|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]1[/tex].

To see the normalization, we use the CCR [tex]\{ a_j,a_j^\dagger \} = I[/tex] to deduce [tex]\langle \psi|a_j a_j^\dagger |\psi\rangle = -\langle \psi|a_j^\dagger a_j|\psi\rangle + \langle \psi|\psi\rangle[/tex]. But [tex]a_j^\dagger a_j|\psi\rangle = 0[/tex], by the eigenvalue assumption, and [tex]\langle \psi|\psi\rangle = 1[/tex], whence [tex]\langle \psi|a_j a_j^\dagger |\psi\rangle = 1[/tex], which is the desired normalization condition.

To see that [tex]a_j^\dagger |\psi\rangle[/tex] is an eigenstate with eigenvalue [tex]1[/tex], use the CCR [tex]\{a_j,a_j^\dagger \} = I[/tex] to deduce that [tex]a_j^\dagger a_j a_j^\dagger |\psi\rangle = - a_j^\dagger a_j^\dagger a_j|\psi\rangle + a_j^\dagger|\psi\rangle = a_j^\dagger |\psi\rangle[/tex], where the final equality can be deduced either from the assumption that [tex]a_j^\dagger a_j |\psi\rangle = 0[/tex], or from the CCR [tex]\{ a_j^\dagger,a_j^\dagger \} = 0[/tex]. This is the desired eigenvalue equation for [tex]a_j^\dagger|\psi\rangle[/tex].

Exercise: Suppose [tex]|\psi\rangle[/tex] is a normalized eigenstate of [tex]a_j^\dagger a_j[/tex] with eigenvalue [tex]1[/tex]. Show that [tex]a_j^\dagger |\psi\rangle = 0[/tex].
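The raising and lowering statements, including the two exercises, can be checked numerically. The sketch below assumes a concrete two-mode example on [tex]C^4[/tex] (the matrix chosen for the second mode is an illustrative choice, not something derived in this post) and tests the claims on randomly chosen eigenstates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed two-mode example on C^4; we study the second mode, a = a_2.
Z = np.diag([1.0, -1.0])
low = np.array([[0.0, 1.0], [0.0, 0.0]])
a = np.kron(Z, low)
adag = a.conj().T
N = adag @ a
I = np.eye(4)

def normalized(v):
    return v / np.linalg.norm(v)

# N is a projector, so N v and (I - N) v are eigenvectors with eigenvalues 1 and 0.
v = rng.normal(size=4) + 1j * rng.normal(size=4)
psi1 = normalized(N @ v)
psi0 = normalized((I - N) @ v)

lowered = a @ psi1
raised = adag @ psi0

checks = {
    "a|psi1> is normalized": np.isclose(np.linalg.norm(lowered), 1.0),
    "a|psi1> has eigenvalue 0": np.allclose(N @ lowered, 0),
    "a|psi0> vanishes": np.allclose(a @ psi0, 0),
    "adag|psi0> is normalized": np.isclose(np.linalg.norm(raised), 1.0),
    "adag|psi0> has eigenvalue 1": np.allclose(N @ raised, raised),
    "adag|psi1> vanishes": np.allclose(adag @ psi1, 0),
}
for claim, ok in checks.items():
    print(claim, bool(ok))
```

Because the eigenstates are random vectors in degenerate eigenspaces, this also illustrates that the results hold for every eigenstate, not just for a preferred basis.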

The [tex]a_j^\dagger a_j[/tex] form a mutually commuting set of observables: To see this, let [tex]j \neq k[/tex], and apply the CCRs repeatedly to obtain [tex]a_j^\dagger a_j a_k^\dagger a_k = a_k^\dagger a_k a_j^\dagger a_j[/tex], which is the desired commutativity.
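For readers who would like the intermediate steps spelled out, here is one way to order the swaps. It assumes [tex]j \neq k[/tex] throughout, so that every pair of operators being exchanged anticommutes; the relations involving two daggered operators are the adjoints of the CCRs.

```latex
\begin{align*}
a_j^\dagger a_j \, a_k^\dagger a_k
  &= -\, a_j^\dagger a_k^\dagger \, a_j a_k
     && \text{using } \{a_j, a_k^\dagger\} = 0 \\
  &= +\, a_k^\dagger a_j^\dagger \, a_j a_k
     && \text{using } \{a_j^\dagger, a_k^\dagger\} = 0 \\
  &= -\, a_k^\dagger a_j^\dagger \, a_k a_j
     && \text{using } \{a_j, a_k\} = 0 \\
  &= +\, a_k^\dagger a_k \, a_j^\dagger a_j
     && \text{using } \{a_j^\dagger, a_k\} = 0
\end{align*}
```

Four swaps give four sign changes, which cancel, leaving the two number operators exchanged.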

Existence of a common eigenstate: It is well known that a mutually commuting set of Hermitian operators possesses a common eigenbasis. This fact is usually taught in undergraduate quantum mechanics courses; for completeness, I’ve included a simple proof in an appendix to these notes. We won’t make use of the full power of this result here, but instead simply use the fact that there exists a normalized state [tex]|\psi\rangle[/tex] which is a simultaneous eigenstate of all the [tex]a_j^\dagger a_j[/tex] operators. In particular, for all [tex]j[/tex] we have:

[tex] a_j^\dagger a_j |\psi\rangle = \alpha_j |\psi\rangle, [/tex]

where for each [tex]j[/tex] either [tex]\alpha_j = 0[/tex] or [tex]\alpha_j = 1[/tex]. It will be convenient to assume that [tex]\alpha_j = 0[/tex] for all [tex]j[/tex]. This assumption can be made without loss of generality, by applying the lowering operator [tex]a_j[/tex] to [tex]|\psi\rangle[/tex] for each [tex]j[/tex] such that [tex]\alpha_j = 1[/tex]. Since [tex]a_j[/tex] commutes with [tex]a_k^\dagger a_k[/tex] for [tex]k \neq j[/tex], lowering one eigenvalue leaves the others unchanged, and the result is a normalized state [tex]|\mbox{vac}\rangle[/tex] such that [tex]a_j^\dagger a_j |\mbox{vac}\rangle = 0[/tex] for all [tex]j[/tex].

Defining an orthonormal basis: For any vector [tex]\alpha = (\alpha_1,\ldots,\alpha_n)[/tex], where each [tex]\alpha_j = 0[/tex] or [tex]1[/tex], define a corresponding state:

[tex] |\alpha \rangle \equiv (a_1^\dagger)^{\alpha_1} \ldots (a_n^\dagger)^{\alpha_n}|\mbox{vac}\rangle. [/tex]

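There are [tex]2^n[/tex] such states [tex]|\alpha\rangle[/tex]. Each is normalized, by repeated application of the raising operator result above, and each is a simultaneous eigenstate of the [tex]a_j^\dagger a_j[/tex] with eigenvalues [tex]\alpha_j[/tex], since [tex]a_k^\dagger[/tex] commutes with [tex]a_j^\dagger a_j[/tex] when [tex]k \neq j[/tex]. Two states with different [tex]\alpha[/tex] have different eigenvalues for at least one of the Hermitian operators [tex]a_j^\dagger a_j[/tex], and so are orthogonal. The [tex]|\alpha\rangle[/tex] therefore form an orthonormal set spanning a subspace of [tex]V[/tex] that we shall call [tex]W[/tex].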
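The construction of the vacuum and of the states [tex]|\alpha\rangle[/tex] can be sketched in a few lines of Python. The helper `fermi_ops` is hypothetical: it simply supplies one concrete set of matrices obeying the CCRs (an assumption made for illustration), and [tex]n = 3[/tex] is an arbitrary choice.

```python
import itertools
from functools import reduce
import numpy as np

def fermi_ops(n):
    """Assumed concrete matrices a_1..a_n on C^(2^n) obeying the fermionic CCRs."""
    Z, I2 = np.diag([1.0, -1.0]), np.eye(2)
    low = np.array([[0.0, 1.0], [0.0, 0.0]])
    return [reduce(np.kron, [Z] * j + [low] + [I2] * (n - j - 1)) for j in range(n)]

n = 3
a = fermi_ops(n)
adag = [m.T for m in a]                      # real matrices: transpose = adjoint
number = [adag[j] @ a[j] for j in range(n)]

# Start from a common eigenstate with eigenvalues (1, 0, 1) and lower it to a vacuum.
psi = np.zeros(2**n)
psi[0b101] = 1.0
vac = psi
for j in range(n):
    if np.isclose(vac @ number[j] @ vac, 1.0):
        vac = a[j] @ vac
vac_ok = all(np.allclose(number[j] @ vac, 0) for j in range(n))

# |alpha> = (a_1^dagger)^alpha_1 ... (a_n^dagger)^alpha_n |vac>; the rightmost factor acts first.
states = {}
for alpha in itertools.product([0, 1], repeat=n):
    state = vac
    for j in reversed(range(n)):
        if alpha[j]:
            state = adag[j] @ state
    states[alpha] = state

S = np.column_stack(list(states.values()))
orthonormal = np.allclose(S.T @ S, np.eye(2**n))
labels_ok = all(
    np.allclose(number[j] @ states[alpha], alpha[j] * states[alpha])
    for alpha in states for j in range(n)
)
print(vac_ok, orthonormal, labels_ok)
```

The Gram matrix `S.T @ S` being the identity is exactly the orthonormality claim, and `labels_ok` confirms that the vector [tex]\alpha[/tex] is the list of eigenvalues of the [tex]a_j^\dagger a_j[/tex].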

The action of the [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] on [tex]W[/tex]: How do [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] act on [tex]W[/tex]? Stated another way, how do they act on the orthonormal basis we have constructed for [tex]W[/tex], the states [tex]|\alpha\rangle[/tex]? Applying the CCRs and the definition of the states [tex]|\alpha\rangle[/tex] it is easy to verify that the action of [tex]a_j[/tex] is as follows:

  • Suppose [tex]\alpha_j = 0[/tex]. Then [tex]a_j|\alpha\rangle = 0[/tex].
  • Suppose [tex]\alpha_j = 1[/tex]. Let [tex]\tilde \alpha[/tex] be that vector which results when the [tex]j[/tex]th entry of [tex]\alpha[/tex] is changed to [tex]0[/tex]. Then [tex]a_j|\alpha\rangle = (-1)^{s_\alpha^j} |\tilde \alpha\rangle[/tex], where [tex]s_\alpha^j \equiv \sum_{k=1}^{j-1} \alpha_k[/tex]. The sign arises from anticommuting [tex]a_j[/tex] past the [tex]s_\alpha^j[/tex] raising operators that precede [tex](a_j^\dagger)^{\alpha_j}[/tex] in the definition of [tex]|\alpha\rangle[/tex].

The action of [tex]a_j^\dagger[/tex] on [tex]W[/tex] is similar:

  • Suppose [tex]\alpha_j = 0[/tex]. Let [tex]\tilde \alpha[/tex] be that vector which results when the [tex]j[/tex]th entry of [tex]\alpha[/tex] is changed to [tex]1[/tex]. Then [tex]a_j^\dagger|\alpha\rangle = (-1)^{s_\alpha^j}|\tilde \alpha\rangle[/tex], where [tex]s_\alpha^j \equiv \sum_{k=1}^{j-1} \alpha_k[/tex]. For example, [tex]a_1^\dagger |\mbox{vac}\rangle = |1,0,\ldots,0\rangle[/tex] with sign [tex]+1[/tex], directly from the definition of [tex]|\alpha\rangle[/tex].
  • Suppose [tex]\alpha_j = 1[/tex]. Then [tex]a_j^\dagger |\alpha\rangle = 0[/tex].
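One way to gain confidence in rules of this kind is to turn them into matrices and confirm that the CCRs come back out. The sketch below is an illustration under stated assumptions: the function name `lowering_matrix` is hypothetical, [tex]n = 3[/tex] is arbitrary, and the sign is taken to be [tex](-1)^{s_\alpha^j}[/tex], which is what the definition of [tex]|\alpha\rangle[/tex] gives (replacing every [tex]a_j[/tex] by [tex]-a_j[/tex] would satisfy the CCRs equally well).

```python
import itertools
import numpy as np

def lowering_matrix(j, n):
    """Matrix of a_j (j = 1..n) in the basis |alpha>, with sign (-1)^s."""
    basis = list(itertools.product([0, 1], repeat=n))
    index = {alpha: i for i, alpha in enumerate(basis)}
    A = np.zeros((2**n, 2**n))
    for alpha in basis:
        if alpha[j - 1] == 1:                  # a_j kills states with alpha_j = 0
            s = sum(alpha[: j - 1])            # occupied modes before mode j
            lowered = alpha[: j - 1] + (0,) + alpha[j:]
            A[index[lowered], index[alpha]] = (-1) ** s
    return A

n = 3
a = [lowering_matrix(j, n) for j in range(1, n + 1)]
I = np.eye(2**n)

def anti(x, y):
    return x @ y + y @ x

# {a_j, a_k} = 0 and {a_j, a_k^dagger} = delta_jk I for all j, k.
ccrs_hold = all(
    np.allclose(anti(a[j], a[k]), 0)
    and np.allclose(anti(a[j], a[k].T), I if j == k else 0)
    for j in range(n) for k in range(n)
)

# Consistency with the definition: a_1^dagger |vac> is |1,0,0> with sign +1.
vac = I[:, 0]                                  # alpha = (0, 0, 0) is basis index 0
first_raised = a[0].T @ vac
matches_definition = np.allclose(first_raised, I[:, 0b100])
print(ccrs_hold, matches_definition)
```

The adjoint is taken with `.T` because the matrices are real; the action of [tex]a_j^\dagger[/tex] then follows automatically from that of [tex]a_j[/tex].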

Action of [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] on [tex]W_\perp[/tex]: We have described the action of the [tex]a_j[/tex] and the [tex]a_j^\dagger[/tex] on the subspace [tex]W[/tex]. What of the action of these operators on the remainder of [tex]V[/tex]? To answer that question, we first show that [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] map the orthocomplement [tex]W_\perp[/tex] into itself.

To see this, let [tex]|\psi\rangle \in W_\perp[/tex], and consider [tex]a_j|\psi\rangle[/tex]. We wish to show that [tex]a_j|\psi\rangle \in W_\perp[/tex] also, i.e., that for any [tex]|\phi\rangle \in W[/tex] we have [tex]\langle \phi|a_j|\psi\rangle = 0[/tex]. This follows easily by considering the complex conjugate quantity [tex]\langle \psi|a_j^\dagger |\phi\rangle[/tex], and observing that [tex]a_j^\dagger |\phi\rangle \in W[/tex], since [tex]|\phi\rangle \in W[/tex], and thus [tex]\langle \psi|a_j^\dagger |\phi\rangle = 0[/tex]. A similar argument shows that [tex]a_j^\dagger[/tex] maps [tex]W_\perp[/tex] into itself.

Consider now the operators [tex]\tilde a_j[/tex] obtained by restricting [tex]a_j[/tex] to [tex]W_\perp[/tex]. Provided [tex]W_\perp[/tex] is nontrivial it is clear that these operators satisfy the CCRs on [tex]W_\perp[/tex]. Repeating the above argument, we can therefore identify a [tex]2^n[/tex]-dimensional subspace of [tex]W_\perp[/tex] on which we can compute the action of [tex]\tilde a_j[/tex] and [tex]\tilde a_j^\dagger[/tex], and thus of [tex]a_j[/tex] and [tex]a_j^\dagger[/tex].
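The invariance of [tex]W_\perp[/tex] can also be seen numerically. The following sketch assumes a small example with [tex]n = 2[/tex] modes acting on an 8-dimensional [tex]V[/tex], written in a randomly scrambled basis; all specific matrices are illustrative choices rather than anything fixed by the argument.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)

# Assumed example: n = 2 modes on an 8-dimensional V, in a scrambled basis.
Z, I2 = np.diag([1.0, -1.0]), np.eye(2)
low = np.array([[0.0, 1.0], [0.0, 0.0]])
U, _ = np.linalg.qr(rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8)))
a = [U @ np.kron(m, I2) @ U.conj().T
     for m in (np.kron(low, I2), np.kron(Z, low))]
adag = [m.conj().T for m in a]

# Any state annihilated by every a_j^dagger a_j serves as |vac>.
total = sum(ad @ m for ad, m in zip(adag, a))
_, vecs = np.linalg.eigh(total)        # eigenvalues ascending; column 0 has eigenvalue 0
vac = vecs[:, 0]

# Columns of S are the four states |alpha> built on this vacuum; P projects onto W.
S = np.column_stack([
    np.linalg.matrix_power(adag[0], a1) @ np.linalg.matrix_power(adag[1], a2) @ vac
    for a1, a2 in itertools.product([0, 1], repeat=2)
])
P = S @ S.conj().T
Q = np.eye(8) - P                      # projector onto W_perp

# P m Q = 0 says that m sends W_perp into W_perp.
maps_W_perp_to_itself = all(np.allclose(P @ m @ Q, 0) for m in a + adag)
# On W_perp the restricted operators still satisfy {a_1, a_1^dagger} = I.
restricted_ccr = np.allclose(Q @ (a[0] @ adag[0] + adag[0] @ a[0]) @ Q, Q)
print(maps_W_perp_to_itself, restricted_ccr)
```

Here [tex]W_\perp[/tex] is four-dimensional, so one more round of the construction exhausts [tex]V[/tex].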

We may iterate this procedure many times, but the fact that [tex]V[/tex] is finite dimensional means that the process must eventually terminate. At the point of termination we will have broken up [tex]V[/tex] as a direct sum of some finite number [tex]d[/tex] of mutually orthogonal [tex]2^n[/tex]-dimensional vector spaces, [tex]W_1,W_2,\ldots,W_d[/tex], and on each vector space we will have an orthonormal basis with respect to which the action of [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] is known precisely.

Stated another way, we can introduce an orthonormal basis [tex]|\alpha,k\rangle[/tex] for [tex]V[/tex], where [tex]\alpha[/tex] runs over all [tex]n[/tex]-bit vectors, and [tex]k = 1,\ldots,d[/tex], and such that the action of the [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] is to leave [tex]k[/tex] invariant, and to act on [tex]|\alpha\rangle[/tex] as described above. In this representation it is clear that [tex]V[/tex] can be regarded as a tensor product [tex]C^{2^n} \otimes C^d[/tex], with the action of [tex]a_j[/tex] and [tex]a_j^\dagger[/tex] trivial on the [tex]C^d[/tex] component. We will call this the occupation number representation for the Fermi algebra [tex]a_j[/tex].

It’s worth pausing to appreciate what has been achieved here: starting only from the CCRs for [tex]a_1,\ldots,a_n[/tex] we have proved that [tex]V[/tex] can be broken down into a tensor product of a [tex]2^n[/tex]-dimensional vector space and a [tex]d[/tex]-dimensional vector space, with the [tex]a_j[/tex]s acting nontrivially only on the [tex]2^n[/tex]-dimensional component. Furthermore, the action of the [tex]a_j[/tex]s is completely known. I think it’s quite remarkable that we can say so much: at the outset it wasn’t even obvious that the dimension of [tex]V[/tex] should be a multiple of [tex]2^n[/tex]!

When [tex]d=1[/tex] we will call this the fundamental representation for the Fermionic CCRs. (This is the terminology I use, but I don’t know if it is standard or not.) Up to a change of basis it is clear that all other representations can be obtained by taking a tensor product of the fundamental representation with the trivial action on a [tex]d[/tex]-dimensional vector space.
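To make the role of [tex]d[/tex] concrete, here is a small sketch that builds a representation as the fundamental one tensored with the identity on [tex]C^d[/tex], hides the tensor structure behind a random unitary, and then recovers [tex]d[/tex] by counting independent vacuum states. The values [tex]n = 2[/tex], [tex]d = 3[/tex] and the concrete matrices are assumptions made purely for illustration.

```python
from functools import reduce
import numpy as np

rng = np.random.default_rng(3)
n, d = 2, 3
dim = 2**n * d

# Fundamental representation (assumed concrete matrices), tensored with I_d.
Z, I2 = np.diag([1.0, -1.0]), np.eye(2)
low = np.array([[0.0, 1.0], [0.0, 0.0]])
fundamental = [reduce(np.kron, [Z] * j + [low] + [I2] * (n - j - 1)) for j in range(n)]

# Scramble the basis with a random unitary so the tensor structure is hidden.
U, _ = np.linalg.qr(rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)))
a = [U @ np.kron(f, np.eye(d)) @ U.conj().T for f in fundamental]

# The CCR {a_j, a_j^dagger} = I survives both the tensor product and the scrambling.
ccr_ok = all(np.allclose(m @ m.conj().T + m.conj().T @ m, np.eye(dim)) for m in a)

# States annihilated by every a_j^dagger a_j are the vacua; there are d independent ones.
total = sum(m.conj().T @ m for m in a)
eigenvalues = np.linalg.eigvalsh(total)
recovered_d = int(np.sum(np.abs(eigenvalues) < 1e-9))
print(ccr_ok, recovered_d, dim // 2**n)
```

The count of zero eigenvalues of the total number operator agrees with [tex]\dim V / 2^n[/tex], as the decomposition above says it must.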

We’ve now understood the fundamental mathematical fact about fermions: the mere existence of operators satisfying Fermionic canonical commutation relations completely determines the action of those operators with respect to some suitable orthonormal occupation number basis. That’s a very strong statement! In the next post we’ll use this technology to study a problem of direct physical interest: finding the energy spectrum and eigenstates of a Hamiltonian which is quadratic in terms of Fermi operators.