Quantum computing for everyone

“Can you give me a simple, concrete explanation of how quantum computers work?”

I’ve been asked this question a lot. I worked on quantum computing full time for 12 years, wrote 60 or so papers, and co-authored the standard text. But for many years the question stumped me. I had several pat answers, but none really satisfied me or my questioners.

It turns out, though, that there is a satisfying answer to the question, which anyone can understand if they’re willing to spend some time concentrating hard.

To understand the answer, let’s back up and think first about why big media outlets like the New York Times and the Economist regularly run stories about quantum computers.

The reason is that quantum computer scientists believe quantum computers can solve problems that are intractable for conventional computers. That is, it’s not that quantum computers are like regular computers, but smaller and faster. Rather, quantum computers work according to principles entirely different from those of conventional computers, and using those principles they can solve problems whose solution will never be feasible on a conventional computer.

In everyday life, all our experience is with objects which can be directly simulated by a conventional computer. We don’t usually think about this fact, but movie-makers rely on it, and we take it for granted – special effects are basically just rough computer simulations of events that would be more expensive for the movie makers to create in real life than they are to simulate inside a computer. Much more detailed simulations are used by companies like Boeing to test designs for their latest aircraft, and by Intel to test designs for their latest chips. Everything you’ve ever seen or done in your life – driving a car, walking in the park, cooking a meal – all these actions can be directly simulated using a conventional computer.

Because of this, when we think in concrete terms we invariably think about things that can be directly simulated on a conventional computer.

Now, imagine for the sake of argument that I could give you a simple, concrete explanation of how quantum computers work. If that explanation were truly correct, then it would mean we could use conventional computers to simulate all the elements in a quantum computer, giving us a way to solve those supposedly intractable problems I mentioned earlier.

Of course, this is absurd! What’s really going on is that no simple concrete explanation of quantum computers is possible. Rather, there is an intrinsic quantum gap between how quantum computers work, and our ability to explain them in simple concrete terms. This quantum gap is what made it hard for me to answer people’s requests for a concrete explanation. The right answer to such requests is that quantum computers cannot be explained in simple concrete terms; if they could be, quantum computers could be directly simulated on conventional computers, and quantum computing would offer no advantage over such computers. In fact, what is truly interesting about quantum computers is understanding the nature of this gap between our ability to give a simple concrete explanation and what’s really going on.

This account of quantum computers is distinctly at odds with the account that appears most often in the mainstream media. In that account, quantum computers work by exploiting what is called “quantum parallelism”. The idea is that a quantum computer can simultaneously explore many possible solutions to a problem. Implicitly, such accounts promise that it’s then possible to pick out the correct solution to the problem, and that it’s this which makes quantum computers tick.

Quantum parallelism is an appealing story, but it’s misleading. The problem comes in the second part of the story: picking out the correct solution. Most of the time this turns out to be impossible. This isn’t just my opinion; in some cases you can mathematically prove it’s impossible. In fact, the problem of figuring out how to extract the solution, which is glossed over in mainstream accounts, is really the key problem. It’s here that the quantum gap lies, and glossing over it does nothing to promote genuine understanding.

None of my discussion to now actually explains how quantum computers work. But it’s a good first step to understanding, for it prepares you to expect a less concrete explanation of quantum computers than you might at first have hoped for. I won’t give a full description here, but I will sketch what’s going on, and give you some suggestions for further reading.

Quantum computers are built from “quantum bits”, or “qubits” [1], which are the quantum analogue of the bits which make up conventional computers. Here’s a magnified picture of a baby quantum computer made up of three Beryllium atoms, which are used to store three qubits:

[Image: trapped ions] (credit to the NIST trapped ions group)

The atoms are held in place using an atom trap, which you can’t see because it’s out of frame, but which surrounds the atoms, holding them suspended in place using electric and magnetic fields, similar to the way magnets can be used to levitate larger objects in the air.

The atoms in the picture are about three micrometers apart, which means that if you laid a million end to end, they wouldn’t quite span the length of a living room. Very fine human hair is about 20 micrometers in diameter – it’d pretty much cover the width of this photo.

The atoms themselves are about a thousand times smaller than the spacing between the atoms. They look a lot bigger in the picture, and the reason is interesting. Although the atoms are very small, the way the picture was created was by shining laser light on the atoms to light them up, and then taking a photograph. The particles making up the laser light are much bigger than the atoms, which makes the picture come out all blurry; the photo above is basically a very blurry photograph of the atoms, which is why they look so much bigger than they really are.

I called this a baby quantum computer because it has only three qubits, but in fact it’s pretty close to the state of the art. It’s hard to build quantum computers, and adding extra qubits turns out to be tricky. Exactly who holds the record for the most qubits depends on who you ask, because different people have different ideas about what standards need to be met to qualify as a genuine quantum computer. The current consensus for the record is about 5-10 qubits.

Okay, a minor alert is in order. I’ve tried to keep this essay as free from mathematics as possible, but the rest of the essay will use a little high-school mathematics. If this is the kind of thing that puts you off, do not be alarmed! You should be able to get the gist even if you skip over the mathematical bits.

How should we describe what’s inside a quantum computer? We can give a bare-bones description of a conventional computer by listing out the state of all its internal components. For example, its memory might contain the bits 0,0,1,0,1,1, and so on. It turns out that a quantum computer can also be described using a list of numbers, although how this is done is quite different. If our quantum computer has n qubits (in the example pictured above n = 3), then it turns out that the right way to describe the quantum computer is using a list of 2^n numbers. It’s helpful to give these numbers labels: the first is s_1, the second s_2, and so on, so the entire list is:

s_1, s_2, …, s_{2^n}.

What are these numbers, and how are they related to the n qubits in our quantum computer? This is a reasonable question – in fact, it’s an excellent question! Unfortunately, the relationship is somewhat indirect. For that reason, I’m not going to describe it in detail here, although you can get a better picture from some of the further reading I describe below. For us, the thing to take away is that describing n qubits requires 2^n numbers.

One result of this is that the amount of information needed to describe the qubits gets big really quickly. More than a million numbers are needed to describe a 20-qubit quantum computer! The contrast with conventional computers is striking – a conventional 20-bit computer needs only 20 numbers to describe it. The reason is that each added qubit doubles the amount of information needed to describe the quantum computer [2]. The moral is that quantum computers get complex far more quickly than conventional computers as the number of components rises.
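To get a feel for the doubling, here is a minimal Python sketch that counts how many numbers each kind of description needs. The function name `description_size` is just an illustrative label, not anything from the essay.

```python
def description_size(n_qubits: int) -> int:
    """Length of the list s_1, ..., s_{2^n} needed to describe n qubits."""
    return 2 ** n_qubits

# A conventional n-bit memory needs n numbers; n qubits need 2^n.
for n in (1, 2, 3, 10, 20):
    print(f"{n} bits need {n} numbers; {n} qubits need {description_size(n):,} numbers")
```

Each extra qubit doubles the count, which is why 20 qubits already need more than a million numbers.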

The way a quantum computer works is that quantum gates are applied to the qubits making up the quantum computer. This is a fancy way of saying that we do things to the qubits. The exact details vary quite a bit in different quantum computer designs. In the example I showed above, it basically involves manipulating the atoms by shining laser light on them. Quantum gates usually involve manipulating just one or two qubits at a time; some quantum computer designs involve more at the same time, but that’s a luxury, it’s not actually necessary. A quantum computation is just a sequence of these quantum gates done in a particular order. This sequence is called a quantum algorithm; it plays the same role as a program for a conventional computer.

The effect of a quantum gate is to change the description s_1, s_2, … of the quantum computer. Let me show you a specific example to make this a bit more concrete. There’s a particular type of quantum gate called a Hadamard gate. This type of gate affects just one qubit. If we apply a Hadamard gate to the first qubit in a quantum computer, the effect is to produce a new description for the quantum computer with numbers t_1, t_2, … given by

t_1 = (s_1 + s_{2^n/2+1})/√2,

t_2 = (s_2 + s_{2^n/2+2})/√2,

t_3 = (s_3 + s_{2^n/2+3})/√2,

and so on, down through all 2^n different numbers in the description. The details aren’t important; the salient point is that even though we’ve manipulated just one qubit, the way we describe the quantum computer changes in an extremely complicated way. It’s bizarre: by manipulating just a single physical object, we reshuffle and recombine the entire list of 2^n numbers!
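For readers who want to see the reshuffling happen, here is a small Python sketch of a Hadamard gate acting on the first qubit. It assumes a particular ordering of the list, in which the first qubit separates the first half of the numbers from the second half; that ordering matches the formulas above. One detail the formulas above don’t show is that the second half of the new list is built from differences rather than sums. The function name is illustrative only.

```python
import math

def hadamard_on_first_qubit(s):
    """Return the new description t after a Hadamard gate on the first qubit.

    `s` holds s_1, ..., s_{2^n}, stored 0-indexed (s[0] is s_1).
    Assumed convention: the first qubit labels the two halves of the list.
    """
    half = len(s) // 2
    root2 = math.sqrt(2)
    t = [0.0] * len(s)
    for k in range(half):
        t[k] = (s[k] + s[k + half]) / root2         # first half: sums
        t[k + half] = (s[k] - s[k + half]) / root2  # second half: differences
    return t

# Three qubits, so 2^3 = 8 numbers. Start with s_1 = 1 and everything else 0.
s = [1.0] + [0.0] * 7
t = hadamard_on_first_qubit(s)
print(t)
```

Even in this tiny example the loop touches every entry of the list, although only one qubit was manipulated.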

It’s this reshuffling and recombination of all 2^n numbers that is the heart of the matter. Imagine we were trying to simulate what’s going on inside the quantum computer using a conventional computer. The obvious way to do this is to track the way the numbers s_1, s_2, … change as the quantum computation progresses. The problem with doing this is that even a single quantum gate can involve changes to all 2^n different numbers. Even when n is quite modest, 2^n can be enormous. For example, when n = 300, 2^n is larger than the number of atoms in the Universe. It’s just not feasible to track this many numbers on a conventional computer.
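The following Python sketch puts rough numbers on this. Two figures in it are assumptions rather than anything stated in the essay: the commonly quoted order-of-magnitude estimate of 10^80 atoms in the observable Universe, and a storage cost of 16 bytes per number.

```python
ATOMS_IN_UNIVERSE = 10 ** 80  # assumption: rough, commonly quoted estimate
BYTES_PER_NUMBER = 16         # assumption: one 16-byte complex value per number

def bytes_needed(n_qubits: int) -> int:
    """Memory needed to store the whole list of 2^n numbers."""
    return BYTES_PER_NUMBER * 2 ** n_qubits

for n in (20, 30, 40, 50):
    print(f"{n} qubits: {bytes_needed(n) / 2**30:,.3f} GiB")

# Python integers have arbitrary precision, so 2^300 can be compared exactly.
print(2 ** 300 > ATOMS_IN_UNIVERSE)
```

Under these assumptions the memory requirement grows from megabytes to petabytes between 20 and 50 qubits, long before n gets anywhere near 300.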

You should now be getting a feeling for why quantum computer scientists believe it is infeasible for a conventional computer to simulate a quantum computer. What’s really clever, and not so obvious, is that we can turn this around, and use the quantum manipulations of all these exponentially many numbers to solve interesting computational problems.

I won’t try to tell that story here. But if you’re interested in learning more, here’s some reading you may find worthwhile.

In an earlier essay I explain why conventional ways of thinking simply cannot give a complete description of the world, and why quantum mechanics is necessary. Going a little further, an excellent lay introduction to quantum mechanics is Richard Feynman’s QED: The Strange Theory of Light and Matter. It requires no mathematical background, but manages to convey the essence of quantum mechanics. If you’re feeling more adventurous still, Scott Aaronson’s lecture notes are a fun introduction to quantum computing. They contain a fair bit of mathematics, but are written so you can get a lot out of them even if some of the mathematics is inaccessible. Scott and Dave Bacon run excellent blogs that occasionally discuss quantum computing, and their blogrolls are a good place to find links to other quantum bloggers.

Finally, if you’ve enjoyed this essay, you may enjoy some of my other essays, or perhaps like to subscribe to my blog. Thanks for reading!


Thanks to Jen Dodd and Kate Nielsen for providing feedback that greatly improved early drafts of this essay.

About the author

Michael Nielsen is a writer living near Toronto, and working on a book about The Future of Science. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. You’ll be emailed to let you know when the book is to be published; your email address will not be used for any other purpose.


[1] Ben Schumacher, who coined the term “qubit”, runs an occasional blog.

[2] Motivated by this observation, in my PhD thesis I posed a tongue-in-cheek quantum analogue of Moore’s Law: to keep pace with conventional computers, quantum computers need only add a single qubit every two years. So far, things are neck and neck.


Why the world needs quantum mechanics

Conventional wisdom holds that quantum mechanics is hard to learn. This is more or less correct, although often overstated. However, the necessity of abandoning conventional ways of thinking about the world, and finding a radically new way – quantum mechanics – can be understood by any intelligent person willing to spend some time concentrating hard. Conveying that understanding is the purpose of this essay.

Reading the essay requires a little more effort than most blog posts. The argument is occasionally a little abstract, and you may need to read over some paragraphs quite carefully, or perhaps more than once. Ideally, you’ll test your understanding by explaining the entire argument to someone else. The effort is worth it, for when you’re done, you’ll understand one of the great discoveries of all time: why the world needs quantum mechanics.

One of the challenges of understanding modern physics is that some of the concepts seem quite abstract when you’re talking about microscopic objects outside the realm of everyday experience. So let’s first get our bearings in a more conventional setting.

I want to talk about coins. We take it for granted that we can determine whether a coin has landed heads or tails; these seem like self-evident properties. But actually quite a lot is going on when we make that determination. Sunlight or some other type of light has to bounce off the coin, into your eye, stimulate your optic nerve, before finally registering either “heads” or “tails” in your brain [1].

This process of figuring out whether the coin is heads or tails is what physicists call a measurement process. In physicists’ language, what’s going on when we look at the coin is that we’re measuring a two-valued or binary property of the coin. This usage of the term measurement is somewhat different from everyday usage, where, for example, we might measure something with a ruler. But the basic idea is the same – a measurement is a process that determines a physical property, whether it be the length of an object, or the side a coin has landed.

All this language may seem pedantic – we’re just looking at a coin! But it comes in handy when we move to the microscopic realm of photons, the tiny particles that make up light. When you see red light, for example, what’s going on is that lots and lots of red photons are entering your eye. The more that enter, the brighter the red sensation.

Photons, like coins, can have binary properties. One of those properties is something called polarization. You’re probably already familiar with polarization, although you may not realize it. If you take a pair of sunglasses, and hold them up towards the surface of the ocean or a pool on a sunny day, you’ll notice that depending on the angle you hold the sunglasses, different amounts of light come through. What this means is that depending on the angle, different numbers of photons are coming through [2].

Imagine, for example, that you hold the sunglasses horizontally:

The photons that make it through the sunglasses have what is called horizontal polarization. Not all photons coming toward the sunglasses have this polarization, which is why not all of the photons make it through. In our earlier language, what’s going on is that the sunglasses are measuring the photons coming toward the sunglasses, to determine whether or not they have horizontal polarization. Those which do, pass through the sunglasses; those which do not, are blocked. Again, it’s not quite the everyday meaning of “measurement”, but hopefully you’re getting the hang of the physicists’ language.

There are other, different physical properties that can be measured in a similar way. For example, imagine holding the sunglasses at 45 degrees to horizontal:

The photons that make it through the sunglasses have a polarization at 45 degrees to horizontal. In our earlier language, these sunglasses are again measuring a binary property of the photons, in this case whether they have a polarization at 45 degrees to the horizontal or not [3].

Physicists routinely measure polarization in their laboratories. They don’t use sunglasses; they use “polarization photodetectors” instead. Despite the intimidating name, these are essentially just like sunglasses, but have a more convenient shape and size for laboratory use, are more accurate, less fashionable, and far more expensive.

I’m now going to describe an experiment involving photon polarization that physicists can do in their laboratories. We’ll build up the description of the experiment piece by piece. Along the way there are a few details that may seem ad hoc – some angles of polarization measurement, and things like that. Don’t worry too much about those ad hoc details, just try to get the basic picture straight.

Let’s start by imagining an experimentalist named Alice. Alice is measuring a photon to determine whether or not it has horizontal polarization. Alice will record A = 1 when it does have horizontal polarization, and A = -1 when it does not.

Of course, Alice might have decided to measure a different polarization, say at an angle of 45 degrees to the horizontal. Alice will record B = 1 when it has a polarization at 45 degrees to the horizontal, and B = -1 when it does not. Here’s a picture summarizing the different things I want you to imagine Alice doing. By the way, I haven’t put the photon she’s measuring in, but you should imagine it coming into the screen, towards the sunglasses:

Let’s move briefly away from photons, and back to coins. The usual way we think about the world is that the coin is either heads or tails, and our measurement reveals which. The coin intrinsically “knows” which side is facing up, i.e., its orientation is an intrinsic property of the coin itself. By analogy, you’d expect that a photon knows whether it has horizontal polarization or not. And it should also know whether it has a polarization at 45 degrees to horizontal or not.

It turns out the world isn’t that simple. What I’ll now prove to you is that there are fundamental physical properties that don’t have an independent existence like this. In particular, we’ll see that prior to Alice measuring the A or B polarization, the photon itself does not actually know what the value for A or B is going to be. This is utterly unlike our everyday experience – it’s as though a coin doesn’t decide whether to be heads or tails until we’ve measured it.

That last paragraph may have sounded like gobbledygook. In fact, if it didn’t give you pause, I suggest you go back and reread it. The reason it’s difficult to understand is because the paragraph is really a declaration of non-understanding, a declaration that the world is radically different from our intuitive understanding.

To prove this, what we’ll do is first proceed on the assumption that our everyday view of the world is correct. That is, we’ll assume that photons really do know whether they have horizontal polarization or not, i.e., they have intrinsic values A = 1 or A = -1 (and, for that matter, B = 1 or B = -1). We’ll find that this assumption leads us to a conclusion that is contradicted by real experiments. The only way this could be the case is if our original assumption was in fact wrong, i.e., photons don’t have intrinsic properties in this way.

This strategy may sound complex, but we reason similarly quite often in our everyday experience. Imagine your aunt has shown you how to bake a cake. You decide to bake it on your own, but realize partway through that you’ve forgotten whether she said to put one or two cups of flour into the cake. You decide to proceed on the assumption that it was one cup of flour. Unfortunately, the cake falls and is a disaster; you conclude that your original assumption was wrong, and the cake must have needed two cups. In a similar way, if we proceed on the assumption that photons do have intrinsic values for A and B, and then arrive at a contradiction with experiment, we’ll know our original assumption must have been wrong.

Alright, let’s finish describing the experiment. In addition to Alice, the experiment involves another experimentalist, Bob, and a third person, Eve, who prepares two photons, and sends one to Alice, and one to Bob. When the photon gets to Alice, she measures one of the polarization values, A or B, as described above. She makes the choice of which to measure at random (e.g., by flipping a coin), for reasons which we’ll understand later. When the photon gets to Bob, he decides at random to measure either the polarization C, at 22.5 degrees to horizontal, or D, at 67.5 degrees to horizontal. Here’s a picture summarizing what’s going on, but leaving out Eve and the photons that she sent to Alice and Bob:

To make this all more concrete, let’s think about what might happen in a typical instance of the experiment. Over on Alice’s side, she decides to measure the B polarization of her photon, and gets the result 1, i.e., the photon has polarization at 45 degrees to horizontal. Over on Bob’s side, he decides to measure the C polarization of his photon, and gets the result -1, i.e., the photon does not have polarization at 22.5 degrees to horizontal.

You might imagine Alice, Bob and Eve doing this experiment many times. If they did this, they could conveniently represent the separate runs of the experiment in a table:


Each row of the table represents a single run of the experiment, so this table shows a case where they did the experiment four times. Looking at the first row of the table, we see that in the first run of the experiment Alice chose to measure A, and got the result 1, while Bob chose to measure D, and also got the result 1.

Now that we’ve understood how the experiment is performed, let’s move on to the analysis. Remember, we’re starting from the assumption that the respective photons have independently existing and well-defined values for A, B, C, and D. Two of these four values are revealed in any given instance of the experiment, depending on what Alice and Bob choose to measure. However, because all four quantities have (by assumption) an independent existence, we can consider quantities which involve all four, like the quantity Q defined by the equation

Q = AC + BC + BD – AD.

(Things like AC mean A times C – it makes the essay less messy to omit the multiplication sign.)

I must apologize for springing this quantity Q on you completely out of the blue. It’s as though a friend suddenly started reciting ancient poetry in mid-conversation; you would certainly wonder why. It turns out that the easiest way to understand this material is to accept the definition of Q for now, and move forward. With a little more work, we’ll see that thinking about Q leads to some very interesting conclusions. With those conclusions in mind, we’ll be able to double back, and understand better where Q came from.

Although Q’s definition may appear to have come from out of the blue, it’s certainly easy enough to calculate for any given set of values for A, B, C, and D. For example, when A = 1, B = -1, C = 1 and D = -1 we get

Q = 1 x 1 + (-1) x 1 + (-1) x (-1) – 1 x (-1) = 2.

In fact, it turns out that no matter what value A, B, C and D have, the value of Q is always equal to either 2 or -2. If you like, you can run through all 16 sets of possible values for A, B, C and D, and verify that Q is indeed always either 2 or -2. I won’t go through all that here, although I encourage you to pause and go through the exercise on paper [4].
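If you would rather let a computer do the exercise, here is a minimal Python sketch that runs through all 16 cases. One way to see why it works out is to regroup Q as (A + B)C + (B – A)D: when A and B are equal the second term vanishes and the first is 2 or -2, and when they differ it is the other way around. The function name `q_value` is an illustrative label.

```python
from itertools import product

def q_value(a, b, c, d):
    """Q = AC + BC + BD - AD for one assignment of values."""
    return a * c + b * c + b * d - a * d

# Try every combination of A, B, C, D, each equal to 1 or -1.
values = {q_value(a, b, c, d) for a, b, c, d in product((1, -1), repeat=4)}
print(sorted(values))  # [-2, 2]
```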

Now, when Alice and Bob actually do an experiment, Alice chooses to measure just one of A or B, and Bob chooses to measure just one of C or D. So they can’t actually measure Q directly, although on any given run they can determine one of the four terms that make up Q, that is, they can always determine one of AC, BC, BD or -AD.

But if they repeat the experiment many times, Alice and Bob can build up an average value for each of the four quantities AC, BC, BD and -AD. Because the total of these four quantities is always 2 or -2, as we’ve seen, the sum of their averages over multiple runs of the experiment cannot possibly be more than 2:

        Avg(AC)+Avg(BC)+Avg(BD)-Avg(AD) ≤ 2.

To understand why this is true, imagine you calculated the average population of all the countries in the world. Whatever the average is, it’s definitely going to be less than the population of China, which is the most populous country.
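Here is a small Python simulation of the averaging, offered as a sketch under the essay’s working assumption that each pair of photons carries definite values for A, B, C and D. The functions `uniform_random` and `best_case` are hypothetical stand-ins for what Eve might prepare; they are not taken from any real experiment. Because each average is estimated from a finite number of runs, the printed totals can wobble slightly around their true values, including a hair above 2.

```python
import random

def estimate_chsh(sample_hidden_values, runs=100_000, seed=0):
    """Estimate Avg(AC) + Avg(BC) + Avg(BD) - Avg(AD) as Alice and Bob would.

    `sample_hidden_values(rng)` returns the intrinsic values (A, B, C, D)
    that the two photons are assumed to carry on one run.
    """
    rng = random.Random(seed)
    totals = {"AC": 0, "BC": 0, "BD": 0, "AD": 0}
    counts = {"AC": 0, "BC": 0, "BD": 0, "AD": 0}
    for _ in range(runs):
        a, b, c, d = sample_hidden_values(rng)
        # Each experimentalist picks one measurement at random.
        alice_name, alice_value = rng.choice([("A", a), ("B", b)])
        bob_name, bob_value = rng.choice([("C", c), ("D", d)])
        key = alice_name + bob_name
        totals[key] += alice_value * bob_value
        counts[key] += 1
    avg = {k: totals[k] / counts[k] for k in totals}
    return avg["AC"] + avg["BC"] + avg["BD"] - avg["AD"]

def uniform_random(rng):
    # Every value chosen independently at random.
    return tuple(rng.choice((1, -1)) for _ in range(4))

def best_case(rng):
    # A = B = C = 1 makes Q = 2 on every run, whatever D is.
    return (1, 1, 1, rng.choice((1, -1)))

print(estimate_chsh(uniform_random))  # close to 0
print(estimate_chsh(best_case))       # close to 2, the largest value allowed
```

No choice of `sample_hidden_values` can push the true sum of averages above 2, because every single run has Q equal to 2 or -2.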

The inequality above is called the Clauser-Horne-Shimony-Holt (CHSH) inequality, after the names of its four discoverers. CHSH were building on earlier ideas of John Bell, who discovered a similar inequality in 1964.

You might wonder why we need to average in the CHSH inequality. Why can’t Alice measure both A and B, and Bob measure both C and D, so they can determine Q directly?

To understand this, remember that the idea we’re testing is the idea that the photon has an actual intrinsic value for A and an actual intrinsic value for B, each of which is merely revealed by the measurement. A single photon is quite delicate, and if Alice measured both A and B, there’s a chance the measurement of A would interfere with the measurement of B, and vice versa, and so mess up the measurement of Q. To keep things clean we force Alice to choose which one she wants to measure in any given instance, and stick to it. That’s why we have to work with averages over many experiments.

If you’re a bit more paranoid, you might also wonder if maybe Alice’s measurement could interfere with what Bob sees. This may seem unlikely, but it’s at least plausible. But Einstein’s relativity tells us that no influence can travel faster than the speed of light. If Alice and Bob do their measurements simultaneously and very quickly, nothing Alice does can possibly affect what Bob sees.

So, in principle, it ought to be possible for Alice and Bob to do the experiment many times, and work out the averages Avg(AC), Avg(BC), and so on, and check that the CHSH inequality does, in fact, hold.

An experiment testing this was done in the early 1980s, by Alain Aspect’s group, in France [5]. Experimentally, they found that if Eve prepares the two photons in just the right way, then what Alice and Bob see after many runs of the experiment is:

Avg(AC)+Avg(BC)+Avg(BD)-Avg(AD) ≅ 2.8.

That is, Aspect found that the CHSH inequality fails to hold in the real world! This means our belief that objects have intrinsic properties with their own independent existence must actually be wrong. The experimental failure of the CHSH inequality forces us to seek an alternate way of understanding the world, a way radically different from our conventional way of thinking.

Fortunately, a more radical theory of the world is available, a theory in which objects don’t have intrinsic properties that exist in and of themselves. That more radical theory is quantum mechanics. I won’t explain how the quantum mechanical analysis of the Aspect experiment works; that’s not the point of this essay. I will report though, that if you use quantum mechanics to analyze Aspect’s experiment, the prediction you get matches the experimental results exactly. In fact, Clauser, Horne, Shimony and Holt had already done the quantum mechanical analysis in advance of the experiment, and knew this. What the Aspect experiment did was provide a real-world example where the CHSH inequality demonstrably fails, yet quantum mechanics explains the results perfectly [6].
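For the curious, the quantum mechanical prediction can be reproduced in a few lines of Python. This is only a sketch, and it rests on an assumption the essay does not spell out: that Eve prepares the standard polarization-entangled state (|HH> + |VV>)/√2. For that state, quantum mechanics predicts that the average product of the two results is the cosine of twice the angle between the two measurement directions. The function name `correlation` is illustrative.

```python
import math

def correlation(alice_angle_deg, bob_angle_deg):
    """Quantum prediction for the average product of Alice's and Bob's results.

    Assumes the entangled state (|HH> + |VV>)/sqrt(2), for which the
    average is cos(2 * (difference between the two angles)).
    """
    return math.cos(2 * math.radians(alice_angle_deg - bob_angle_deg))

A, B = 0.0, 45.0    # Alice's measurement angles from the essay
C, D = 22.5, 67.5   # Bob's measurement angles from the essay

chsh = correlation(A, C) + correlation(B, C) + correlation(B, D) - correlation(A, D)
print(round(chsh, 3))  # 2.828
```

The result is 2√2, which is the value of roughly 2.8 quoted above and is comfortably larger than the limit of 2 set by the CHSH inequality.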

The analysis done in this essay can be extended to nearly all physical properties. In principle, it holds even for everyday properties like whether a coin is heads or tails, whether a cat is alive or dead, or nearly anything else you care to think of. Although experiments like the Aspect experiment are still far too difficult to do for these much more complex systems, quantum mechanics predicts that in principle it should be possible to do an experiment with these systems where the CHSH inequality fails. Assuming this is the case – and all the evidence points that way – at some fundamental level it is a mistake to think even of everyday properties as having an intrinsic independent existence.

You might wonder what this all means. Should you lose your belief in the idea that objects have intrinsic properties with an independent existence? Should you start thinking about your coins or your cat as though they might be in some indeterminate state? The answer, of course, is no: believing in such intrinsic properties is a perfectly good way to go about your everyday life. In fact, quantum physicists have spent quite a bit of time trying to understand why it is that so many properties in practice do behave like intrinsic properties with their own independent existence. The analysis is complex, but the final conclusion is unambiguous: for most practical everyday purposes, we can treat a coin as knowing whether it is heads or tails, and a cat as knowing whether it is alive or dead. Although these beliefs are not correct at some fundamental level, in most practical situations they work extremely well. It’s only in extraordinary circumstances quite outside everyday life that this way of thinking could ever lead you astray.

I promised that we’d go back and try to understand where Q comes from. In fact, Q was no less mysterious for Clauser, Horne, Shimony and Holt than it is for you. When they started their work, they had in mind an argument roughly like the one above (which was inspired by Bell) but they did not have a specific form for Q in mind. Their idea was to find a form for Q using trial-and-error so that they could prove an inequality like the CHSH inequality, and also simultaneously find a situation where quantum mechanics predicted that the inequality should fail to hold. That strategy allowed them to suggest an experiment – the experiment ultimately done by Aspect – which could be used to test between the two views of reality. I don’t know how long it took them to find their form for Q, but I suspect it took hundreds of hours of hard work. If you’ve been wondering what Q “means”, that’s your answer: it’s the answer to the question Clauser, Horne, Shimony and Holt were asking about what quantity would best let them distinguish between our usual picture of the world, and the actual reality. Given how long it took them to answer that question, it would not be surprising if you got a bit of a jolt when I introduced Q out of the blue.

The need for quantum mechanics isn’t ordinarily explained the way I have described in this essay. I think this is a pity, because the explanation here is, in my opinion, simpler, more compelling, and more clearcut [7] than the standard explanation.

The standard explanation is based on the historical development of quantum mechanics between 1900 and 1930. During that time there were a series of crises in physics. The pattern was that each time some experimental fact would be noticed that seemed hard to explain with the old “classical” way of viewing the world. Each time, physicists would bandage over the old classical thinking with an ad hoc bandaid. This happened over and over again until, in the mid-1920s, the sick patient of classical physics finally keeled over completely, and was replaced with the new framework of quantum mechanics.

The problem with this style of explanation, and what makes it confusing, is that none of those early crises was entirely clearcut. In each case, there were physicists who argued that the new experimental results could be explained pretty well with a conventional classical picture. And, in fact, with hindsight, we can now see that some of these crises have pretty good explanations that are essentially classical.

What’s beautiful about the CHSH inequality and the Aspect experiment is that they are so simple and compelling. They leave no doubt that we have to abandon our conventional assumptions about the world, and confront the need for a radically new theory. That theory is quantum mechanics.

Further reading

If you liked this essay, you may enjoy my essay “What makes quantum computers powerful”, to appear on this blog in two weeks’ time.

An excellent elementary introduction to quantum mechanics is Richard Feynman’s QED: The Strange Theory of Light and Matter.


You may enjoy some of my other essays.


Thanks to Dave Bacon, Jen Dodd, Mary Granade, Kate Nielsen, Amund Tveit, and Jo Vermeulen for feedback that improved an early draft of this essay.

About the author

Between 1995 and 2008, Michael Nielsen was a professional theoretical physicist. During that time he co-authored the standard text on quantum computing, proved one of the fundamental theorems about the behaviour of entangled quantum states, and participated in one of the first quantum teleportation experiments. None of this made him feel comfortable with quantum mechanics.

Michael is now a writer living outside Toronto, and working on a book about “The Future of Science”. A taste of the book may be found here. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. You’ll be emailed to let you know when the book is to be published; your email address will not be used for any other purpose.


[1] Of course, a coin might also land on its side. We’ll ignore that for the purposes of the present discussion.

[2] Not all sunglasses are polarizing in this way. But many are. You can check if your sunglasses are polarizing by holding them up towards pretty much any surface that reflects glare. The ocean or a pool on a sunny day works well.

[3] You might be wondering whether there’s any relationship between a photon having horizontal polarization, and having polarization at 45 degrees (or some other angle) to horizontal. This is a good question, and the answer is that there is a relationship. But it would take us quite a ways afield to understand the relationship, and we don’t need it for the purposes of this essay, so I’ve skipped over it.

[4] An alternate way of seeing that Q is always 2 or -2 starts by rewriting Q as

Q = (A+B)C + (B-A)D.

We can split our analysis up into two cases: the case when A = B, and the case when A = -B. One of these two must always be true, because A and B are both always either 1 or -1.

First case: A = B. In this case the B-A terms in Q vanish, leaving just contributions from the (A+B)C term. A bit of thought and experimentation should convince you this is either 2 or -2.

Second case: A = -B. The A+B terms vanish, leaving just contributions from the (B-A)D term, which again a bit of thought should convince you is either 2 or -2.
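If you would rather check this by brute force than by thought, the following is a minimal Python sketch that tries all 16 ways of assigning 1 or -1 to A, B, C and D and evaluates Q in the factored form given above. The function names chsh_q and all_q_values are invented for this illustration and do not come from the essay.

```python
from itertools import product


def chsh_q(a, b, c, d):
    """Q in the factored form from this footnote: (A+B)C + (B-A)D."""
    return (a + b) * c + (b - a) * d


def all_q_values():
    """Evaluate Q for every assignment of 1 or -1 to A, B, C and D."""
    return {
        (a, b, c, d): chsh_q(a, b, c, d)
        for a, b, c, d in product((1, -1), repeat=4)
    }


if __name__ == "__main__":
    values = all_q_values()
    for assignment, q in values.items():
        print(assignment, q)
    # The distinct values of Q over all 16 assignments.
    print(sorted(set(values.values())))  # prints [-2, 2]
```

Every one of the 16 rows shows Q equal to 2 or -2, which matches the two-case argument: whenever one bracket vanishes, the other is 2 or -2 and is multiplied by a value that is 1 or -1.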

[5] Real experiments have imperfections, and Aspect and his co-workers had to use a careful analysis to take those imperfections into account. For example, the polarization photodetectors in the experiment would sometimes miss a photon, and this needs to be taken into account in analyzing the results. I won’t go into all those details here. More modern experiments are getting very close to the ideal experiment described in my essay.

[6] When people see the CHSH inequality and the results of the Aspect experiment for the first time, they sometimes say “oh, isn’t that just like the uncertainty principle, where particles don’t have a simultaneously well-defined position and momentum?” It is similar, but the contradiction of the CHSH inequality by experiment is a much stronger result. It’s true that the uncertainty principle does say that in quantum mechanics, a particle can’t have a simultaneously well-defined position and momentum. But this is just an assertion about the theory of quantum mechanics. The CHSH inequality and the Aspect experiment give us a direct experimental disproof of the idea that a particle has real intrinsic properties with their own independent existence.

[7] There are still a few people who believe that it’s possible to avoid the conclusion that the CHSH inequality and Aspect’s experiment force on us. There are two common lines of attack. The first is to argue that something Alice does can instantaneously influence what Bob sees, but in a way that doesn’t allow faster-than-light signalling. This is an interesting line of thought, but is in its own way also quite a radical departure from classical thought. The second is to argue that somehow the fact that the polarization photodetectors sometimes miss a photon is responsible for the failure of the CHSH inequality. Both these lines of attack continue to be developed, although neither is regarded as mainstream.