# What should a reasonable person believe about the Singularity?

In 1993, the science fiction author Vernor Vinge wrote a short essay proposing what he called the Technological Singularity. Here’s the sequence of events Vinge outlines:

A: We will build computers of at least human intelligence at some time in the future, let’s say within 100 years.

B: Those computers will be able to rapidly and repeatedly increase their own intelligence, quickly resulting in computers that are far more intelligent than human beings.

C: This will cause an enormous transformation of the world, so much so that it will become utterly unrecognizable, a phase Vinge terms the “post-human era”. This event is the Singularity.

The basic idea is quite well known. Perhaps because the conclusion is so remarkable, almost outrageous, it’s an idea that evokes a strong emotional response in many people. I’ve had intelligent people tell me with utter certainty that the Singularity is complete tosh. I’ve had other intelligent people tell me with similar certainty that it should be one of the central concerns of humanity.

I think it’s possible to say something interesting about what range of views a reasonable person might have on the likelihood of the Singularity. To be definite, let me stipulate that it should occur in the not-too-distant future – let’s say within 100 years, as above. What we’ll do is figure out what probability someone might reasonably assign to the Singularity happening. To do this, observe that the probability of the Singularity can be related to several other probabilities:

$$p(C) = p(C|B)\,p(B|A)\,p(A)$$

In this equation, $p(A)$ is the probability of event $A$, human-level artificial intelligence within 100 years. The probabilities denoted $p(X|Y)$ are conditional probabilities for event $X$ given event $Y$. The truth of the equation is likely evident, and so I’ll omit the derivation – it’s a simple exercise in applying conditional probability, together with the observation that event $C$ can only happen if $B$ happens, and event $B$ can only happen if $A$ happens.
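
For readers who want the omitted steps, here is a brief sketch of why $p(C) = p(C|B)\,p(B|A)\,p(A)$ holds. It assumes only the definition of conditional probability, $p(X|Y) = p(X \cap Y)/p(Y)$ with $p(Y) > 0$, and it reads “$C$ requires $B$” and “$B$ requires $A$” as the inclusions $C \subseteq B \subseteq A$. That set notation is shorthand introduced here for illustration.

$$
\begin{aligned}
p(C) &= p(C \cap B) && \text{since } C \subseteq B \\
     &= p(C|B)\,p(B) && \text{definition of conditional probability} \\
     &= p(C|B)\,p(B \cap A) && \text{since } B \subseteq A \\
     &= p(C|B)\,p(B|A)\,p(A) && \text{definition of conditional probability}
\end{aligned}
$$

Without the two inclusions the first and third lines fail, which is why a chain of this form is not a general identity for arbitrary events.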

I’m not going to argue for specific values for these probabilities. Instead, I’ll argue for *ranges* of probabilities that I believe a person might reasonably assert for each probability on the right-hand side. I’ll consider both a hypothetical skeptic, who is pessimistic about the possibility of the Singularity, and also a hypothetical enthusiast for the Singularity. In both cases I’ll assume the person is reasonable, i.e., a person who is willing to acknowledge limits to our present-day understanding of the human brain and computer intelligence, and who is therefore not overconfident in their own predictions. By combining these ranges, we’ll get a range of probabilities that a reasonable person might assert for the probability of the Singularity.

Now, before I get into estimating ranges, it’s worth keeping in mind a psychological effect that has been confirmed over many decades: the overconfidence bias. When asked to estimate the probability of their opinions being correct, people routinely overestimate the probability. For example, in a 1960 experiment subjects were asked to estimate the probability that they could correctly spell a word. Even when people said they were 100 percent certain they could correctly spell a word, they got it right only 80 percent of the time! Similar effects have been reported for many different problems and in different situations. It is, frankly, a sobering literature to read.

This is important for us, because when it comes to both artificial intelligence and how the brain works, even the world’s leading experts don’t yet have a good understanding of how things work. Any reasonable probability estimates should factor in this lack of understanding. Someone who asserts a very high or very low probability of some event happening is implicitly asserting that they understand quite a bit about why that event will or will not happen. If they don’t have a strong understanding of the event in question, then chances are that they’re simply expressing overconfidence.

Okay, with those warnings out of the way, let’s start by thinking about $p(A)$. I believe a reasonable person would choose a value for $p(A)$ somewhere between $0.1$ and $0.9$. I can, for example, imagine an artificial intelligence skeptic estimating a value of $p(A)$ toward the bottom of that range. But I’d have a hard time taking seriously someone who estimated a value far below it. It seems to me that such an estimate would require some deep insight into how human thought works, and how those workings compare to modern computers, the sort of insight I simply don’t think anyone yet has. In short, it seems to me that it would indicate a serious overconfidence in one’s own understanding of the problem.

Now, it should be said that there have, of course, been a variety of arguments made against artificial intelligence. But I believe that most of the proponents of those arguments would admit that there are steps in the argument where they are not *sure* they are correct, but merely believe or suspect they are correct. For instance, Roger Penrose has speculated that intelligence and consciousness may require effects from quantum mechanics or quantum gravity. But I believe Penrose would admit that his conclusions rely on reasoning that even the most sympathetic would regard as quite speculative. Similar remarks apply to the other arguments I know, both for and against artificial intelligence.

What about an upper bound on $p(A)$? Well, for much the same reason as in the case of the lower bound, I’d have a hard time taking seriously someone who estimated $p(A) = 0.99$. Again, that would seem to me to indicate an overconfidence that there would be no bottlenecks along the road to artificial intelligence. Sure, maybe it will only require a straightforward continuation of the road we’re currently on. But maybe some extraordinarily hard-to-engineer but as yet unknown physical effect is involved in creating artificial intelligence? I don’t think that’s likely – but, again, we don’t yet know all that much about how the brain works. Indeed, to pursue a different tack, it’s difficult to argue that there isn’t at least a few percent chance that our civilization will suffer a major regression over the next one hundred years. After all, historically nearly all civilizations last no more than a few centuries.

What about $p(B|A)$? Here, again, I think a reasonable person would choose a probability between $0.1$ and $0.9$. A probability much above $0.9$ discounts the idea that there’s some bottleneck we don’t yet understand that makes it very hard to bootstrap as in step $B$. And a probability much below $0.1$ again seems like overconfidence: to hold such an opinion would, in my opinion, require some deep insight into why the bootstrapping is impossible.

What about $p(C|B)$? Here, I’d go for tighter bounds: I think a reasonable person would choose a probability between $0.2$ and $0.9$.

If we put all those ranges together, we get a “reasonable” probability for the Singularity somewhere in the range of 0.2 percent – one in 500 – up to just over 70 percent. I regard both those as extreme positions, indicating a very strong commitment to the positions espoused. For more moderate probability ranges, I’d use (say) somewhat narrower ranges for each of $p(A)$, $p(B|A)$, and $p(C|B)$. So I believe a moderate person would estimate a probability roughly in the range of 1 to 50 percent.

These are interesting probability ranges. In particular, the 0.2 percent lower bound is striking. At that level, it's true that the Singularity is pretty darned unlikely. But it's still edging into the realm of a serious possibility. And to get this kind of probability estimate requires a person to hold quite an extreme set of positions, a range of positions that, in my opinion, while reasonable, requires considerable effort to defend. A less extreme person would end up with a probability estimate of a few percent or more. Given the remarkable nature of the Singularity, that's quite high. In my opinion, the main reason the Singularity has attracted some people's scorn and derision is superficial: it seems at first glance like an outlandish, science-fictional proposition. The end of the human era! It's hard to imagine, and easy to laugh at. But any thoughtful analysis either requires one to consider the Singularity as a serious possibility, or demands a deep and carefully argued insight into why it won't happen.
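
The way the individual ranges combine into the overall 0.2 percent to just-over-70-percent range can be checked with a few lines of Python. This is only a sketch: the bounds in `EXTREME_RANGES` are assumptions, chosen to be consistent with the stated endpoints and with the figures quoted in the comment thread (a lower bound of 0.1 for p(A), and 0.2 to 0.9 for p(C|B)), and the names `EXTREME_RANGES` and `combined_range` are made up for illustration.

```python
# Assumed (low, high) bounds for each factor in p(C) = p(C|B) * p(B|A) * p(A).
# These values are illustrative assumptions, not authoritative figures.
EXTREME_RANGES = {
    "p(A)": (0.1, 0.9),
    "p(B|A)": (0.1, 0.9),
    "p(C|B)": (0.2, 0.9),
}


def combined_range(ranges):
    """Multiply the lower bounds together and the upper bounds together.

    Because every factor is non-negative, the product of the lows is the
    smallest possible p(C) and the product of the highs is the largest.
    """
    low, high = 1.0, 1.0
    for name, (lo, hi) in ranges.items():
        if not 0.0 <= lo <= hi <= 1.0:
            raise ValueError(f"invalid probability range for {name}: {(lo, hi)}")
        low *= lo
        high *= hi
    return low, high


low, high = combined_range(EXTREME_RANGES)
print(f"p(C) lies between {low:.1%} and {high:.1%}")
```

Swapping in narrower bounds for each factor shows how quickly the combined range tightens, and how sensitive the low end is to any single pessimistic factor.
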


Great. You think all “reasonable” probabilities are between 1/5 and 4/5, so the product of three of them must be between (1/5)^3 and (4/5)^3.

[MN: That’s not what I said. It’s easy to find examples of events where I’d estimate reasonable probabilities outside that range. ]

Coming from the ‘singularity is tosh’ camp, I was curious to see if I could budge my position somewhat after reading your interesting post. But I see a sort of a slippery slope in accepting this way of estimating probabilities for events that we don’t know enough about. I.e., it seems to allow the assignment of reasonable probabilities to all sorts of ‘tea-cups in the sky’ simply because we don’t have enough information either way. How does one prevent that?

[MN: Nice question. The tea-cups in the sky question is very interesting: my understanding is that astronauts are allowed to carry some personal items with them, and all kinds of stuff has gone into orbit as a result. In this vein, I recall a recent news story about a Playboy centrefold pictorial which had gone on one of the Apollo missions. So, who knows, maybe there is a tea-cup which has orbited the Earth – one carried by an astronaut for some sort of personal sentimental value, or perhaps carried because the astronaut was an atheist with a sense of humour (in the light of Bertrand Russell’s remark). It does seem very unlikely, but it’s certainly not a one-in-a-million type event.

A much more outrageous proposition is Russell’s teapot in orbit round Saturn. And there I think you really can arrive at a very low probability. For the teapot to get into Saturn’s orbit would require that it be sent there on one of the (very few) missions we’ve launched to Saturn. My understanding is that those spacecraft are designed to within a very tight weight tolerance; an object as large as a teapot would certainly show up if someone tried to sneak it aboard. It would also show up after launch, as the spacecraft made attitudinal adjustments while en route. Chances of all of this being missed? I think I’d rather buy a lottery ticket. ]

This is silly, absolutely *silly*, we actually don’t know what intelligence *is* or, for a better wording, what it *does* and *how* beyond the “obvious” lay meaning of intelligence, and this is not anywhere like a scientific definition (physics envy anyone?).

Even less so do we have any kind of metric of intelligence which would be valid beyond the observed variance range of human performance, even within this range the metrics we have are questionable and questioned (e.g. Shalizi).

And yet you indulge in “probability calculations” over this mess, you could as well have made these kind of estimates about historical *famous questions* like “How many angels can dance on the head of a pin”.

All the fuss about the Singularity, pro and con, stems from it being a *religious* question, namely a seemingly secular sort of Eschatology.

All this is actually detrimental to actual progress in AI by diverting resources and efforts toward meaningless blather.

Sheesh…

Uhm, no. Your equation represents an arbitrary selection of necessary conditions from a yet unknown space of conditions. In other words, you are making all sorts of assumptions that your equation does not reflect. For example, for rapidly self-improving artificial intelligences to cause a singularity, their overall consumption of energy, matter, and space will have to remain within reasonable limits. Which may turn out as a problem, should we some day understand that the brain-in-a-glass model of artificial intelligence leads nowhere and that true intelligence requires the ability to interact with a rich environment. In that case the size of our planet may limit the singularity potential of our new AI overlords. Furthermore, I would like to see the probability of Moore’s law holding forever estimated as part of the calculation.

I firmly believe that such equations constitute a beautiful way of illustrating probabilistic reasoning (and its shortcomings). Beyond that, they often remain meaningless.

[MN: The equation is an identity, with a status much like the identity (a+b)^2 = a^2+2ab+b^2 for real numbers a and b. It follows by rewriting the definition of conditional probability, arithmetic, and the fact that event C requires B to happen, and B requires A to happen. ]

Good post.

If you haven’t already seen it, you may also be interested in The Uncertain Future, a web tool that’s intended to help in estimating p(A) in a more rigorous way.

[MN: Thanks for the pointer!]

Regarding the overconfidence bias experiments, I would be interested to know whether the participants were asked to estimate their probabilities with respect to what Bayesians call a “proper scoring rule”, i.e. a loss function that penalizes reporting your subjective probabilities dishonestly. If not, then there are all sorts of reasons why it can be optimal to report probabilities that are not your true subjective probabilities.

[MN: I’m not an expert on this literature, but do recall reading about an experiment in which they were gambling probabilities. I don’t recall if real money was at stake, nor do I recall the reference off the top of my head (probably easy to find, though). ]

With respect to B, it’s not clear that ‘qualitative’ increases in intelligence beyond the human range are required for C. A lot of the impact follows just from having intelligent beings that are software, i.e. that can be reproduced as quickly as software (without the need for education) and run more quickly. For instance, Robin Hanson’s work on the economic impact of machine intelligence focuses just on these effects. Likewise with many of the mathematical models of technological singularity discussed in this paper by Anders Sandberg at Oxford:

http://agi-conf.org/2010/wp-content/uploads/2009/06/agi10singmodels2.pdf

[MN: Agreed, a world without C (or even B!) would be very interesting and remarkable.]

Your opening line “In 1993, … proposing …” conveys the notion that Vinge first proposed the idea in 1993. As explained in the Wikipedia article to which you link, Vinge first used the term “singularity” for this in a 1983 article, and surviving the technological singularity explicitly provided the entire storyline of his 1986 novel “Marooned in Realtime”.

The idea, if not the “singularity” terminology, of course long predates 1983, but Vinge and Kurzweil have played an important role in its recent popularization.

[MN: I’d forgotten the Omni article, although I have a dim (possibly wrong?) recollection of tracking something like that down in the late 1990s. For the rest, I agree, and am aware of these antecedents. Part of the reason I reference Vinge is because he does explain some of the history. But the main reason is that, with the possible exception of the Omni essay, the 1993 essay is the first detailed written discussion I know of that attempts to engage seriously and in an analytic fashion with the concept. “Marooned” is a novel I like a great deal, but it is a novel, and its standards – narrative plausibility – are quite different, and less salient for my purposes, than the standards of the 1993 essay. ]

While I am sympathetic to the conclusion you state at the end, I am dismayed by the argument that got you there! The probability range is interesting, but not as interesting as the *uncertainty* in the values that went into it (Bayesian probabilities help you take account of prior probabilities, but they do not insulate you from the folly of putting down numbers that are derived from uncertain knowledge). If you were to factor in those uncertainties (assuming that you could, because that would be a huge task, fraught with difficulties having to do with the fundamental nature of probability), you might find that the real range was 0.0001% < range < 99% …. in other words, "maybe it will happen, maybe it won't".

I am afraid the real story has to do with understanding the nature of AGI research, not fiddling with probabilities. Messy. Empirical. Definitely something that would get a mathematician's hands dirty (as my one-time supervisor, John Taylor, put it when I was first thinking about getting into this field).

But in the end, my own take (being up to my eyeballs in the aforesaid dirty work) is that the probability is "high" that it will happen "in the next 20 years".

Anon: Great. You think all “reasonable” probabilities are between 1/5 and 4/5, so the product of three of them must be between (1/5)^3 and (4/5)^3.

MN: That’s not what I said. It’s easy to find examples of events where I’d estimate reasonable probabilities outside that range.

Anon2: Clarifying the other anon’s comment. You think all “reasonable” probabilities *for this topic (that is, for p(A), p(B|A), and p(C|B))* are between 1/5 and 4/5 because those are the “reasonable” probabilities for any topic where there is “reasonable” room for doubt either way, so the product of three of them must be between (1/5)^3 and (4/5)^3.

Granted, those aren’t the exact numbers you used, but that really is the chain of thought that you appear to have presented.

Now, there is some justification for that. To reach the Singularity on the chain you have given does require a series of steps, after all.

But you could easily add an arbitrary other step, for instance a “these intelligences will care at all about humanity and the physical world, instead of going off and doing their own thing in virtual worlds of their creation” step between B and C. How likely is that? It isn’t 100%, so factoring that step in reduces the high end of the final odds even further.

[MN: Suppose you introduced an extra such step, let’s call it B’. If I understand you aright, you’re saying that C requires B’, which seems like a good assumption to me. And so you’re arguing that P(C) = P(C|B’)P(B’|B)P(B|A)P(A). But the equation P(C|B) = P(C|B’)P(B’|B) is actually a probabilistic identity (under the assumption that C requires B’, as you assume). And so nothing whatsoever changes.]

Conversely, you don’t give a good (or, really, any) argument for the lower range of p(C|B). Assuming we have superintelligent entities around, who are much more capable by definition of things like finding a cure for cancer or figuring out economical fusion reactors (or more dramatic things, like doing the old “replace a brain with diodes, one at a time” process to let humans become AIs, so as to be improved per step B), how is it not likely that that would dramatically change our world? Boosting the low end of p(C|B) to 50% more than doubles the low end of the final odds.

[MN: I was very tempted to do that – I find your argument very plausible. But I was also trying to be fairly conservative. I certainly won’t quibble too much with a 0.5% lower end, though.]

On anonymous commenting: the drawbacks to forums where anonymous commenting is frequent are well known. While the anonymous remarks on this post have been thoughtful, I also don’t wish for my blog to become a forum where anonymity is common. Please post using your real name, or refrain from commenting here. (If the above anons wish to continue anonymously, that’s fine, since those are the terms they entered the discussion.) Thanks.

Re: “This will cause an enormous transformation of the world, so much so that it will become utterly unrecognizable, a phase Vinge terms the “post-human era”. This event is the Singularity.”

“Utterly unrecognizable” seems to be dubious wording. A literal interpretation suggests that the world will not become “utterly unrecognizable” for a considerable period of time – since there will be residues of the major continents, and other clues to the planet’s identity. The only probabilities I would be comfortable assigning to an inability to recognise the planet would be very low ones.

@Tim Tyler: That is too vague, you’re right. Not quite sure what to replace the term with. Perhaps it would be better to omit it, and simply use Vinge’s term – the post-human era.

I acknowledge your preference for non-anonymity.

I’ll use my grandfathered status in this thread only to add one final link to another probabilistic argument, not entirely unrelated, at http://simulation-argument.com/ :

Are You Living In a Computer Simulation?

Nick Bostrom (Oxford)

ABSTRACT. This paper argues that at least one of the following propositions is true: (1) the human species is very likely to go extinct before reaching a “posthuman” stage; (2) any posthuman civilization is extremely unlikely to run a significant number of simulations of their evolutionary history (or variations thereof); (3) we are almost certainly living in a computer simulation. It follows that the belief that there is a significant chance that we will one day become posthumans who run ancestor-simulations is false, unless we are currently living in a simulation. A number of other consequences of this result are also discussed.

[MN: I deleted this comment, as more than half the content was inflammatory name-calling, which I won’t tolerate. Keep it civil, or refrain from commenting. ]

Doing the maths for the above argument at some singularitarian blog.

[MN: This got held up in the spam filter. ]

As with A Non, I’ll use my grandfathered status – ironically, for identity purposes, since MN deserves a reply:

MN: Suppose you introduced an extra such step, let’s call it B’. If I understand you aright, you’re saying that C requires B’, which seems like a good assumption to me. And so you’re arguing that P(C) = P(C|B’)P(B’|B)P(B|A)P(A). But the equation P(C|B) = P(C|B’)P(B’|B) is actually a probabilistic identity (under the assumption that C requires B’, as you assume). And so nothing whatsoever changes.

Anon2: It is true that your example *should be* a probabilistic identity. However, using the point of view expressed in your article, p(C|B’) and p(B’|B) would independently be assigned probabilities of 0.2 to 0.9, for a net p(C|B) probability of 0.04 to 0.81, entirely because B’ can be thought of as a discrete step *and discrete steps, simply because they are discrete, do not warrant high independent probabilities* (which is incorrect: discrete steps can have high “reasonable” probabilities despite being discrete). Contrast 0.04-0.81 to the 0.2-0.9 for p(C|B) in your article, and especially contrast the resulting total p(C).

This demonstrates why discrete steps can have high “reasonable” probabilities despite being discrete: assuming that they can’t means the total probability is more a matter of the subjective – not objective – decision as to how many discrete steps are being considered. Most discrete steps can be broken down into further discrete steps. For example, I could break A into “We will build computers of at least human intelligence within 100 years”, “We will allow one or more of them to operate without extreme limitations (such as licensing limitations on how much they are allowed to access) that effectively negate their intelligence”, and “No catastrophe – nuclear war, asteroid impact, or similar – will destroy these computers (along with enough of the human race to prevent rebuilding) within 100 years”. (I could probably break A into at least 10 discrete steps, but these 3 will suffice.) If we agree that p(A) is at least 0.1, and that these 3 sub-steps of A are equally likely, then all 3 must have a minimum probability of over 0.46. Remember, that’s the conservative, low end, “no way this could possibly happen” bound.

[MN: I don’t really understand your argument. Let me see if I can write out precisely what you’re saying. Let’s call your events A1, A2 and A3. You seem to be making use of an identity like P(A) = P(A|A3)P(A3|A2)P(A2|A1)P(A1). This in general fails as a probabilistic identity. It’s only true in my case because C requires B, which in turn requires A. I don’t see that it’s possible to make a similar chain with what you’ve posited. So your argument looks to me as though it breaks.]

I do appreciate that you are trying to give a coherent analysis, for the benefit of those who do not have access to detailed probability assessments (and the facts that would make them more or less likely). However, making assumptions instead of doing the research to do those assessments will invariably miss the mark.

I would in particular question why p(C|B) has such a low bottom range. “…trying to be fairly conservative” is personal bias, not evidence. If there’s a good case for the low end of p(C|B), let’s hear it. Otherwise, p(C|B) could arguably range from 0.9 to 1 (for reasons detailed in my previous post), which dramatically changes the net p(C). The same applies to the high and low ends of each of p(A), p(B|A), and p(C|B) – but because the result of multiplying them together is to give more weight to the null hypothesis, the low ends (which favor the null hypothesis) may deserve more scrutiny.

[MN: Note that the question here is what a reasonable skeptic might think; please don’t confuse what follows with what I think. I’ve certainly met plenty of people whose main beef with the Singularity seems to be the notion that a bootstrapped AI would have any interest in us, or impact on us, at all. In fact, Vinge himself doesn’t seem entirely unsympathetic to this point of view – a lot of the “Powers” in “A Fire Upon the Deep” are utterly uninterested in lower life forms. And I’ve heard others argue that while bootstrapped computers might be a lot faster at certain types of theoretical discoveries, that’s far from the main bottleneck in human civilization, and so wouldn’t actually change much. I don’t actually find these arguments especially compelling, but, well, the point of my calculation wasn’t to figure out what I think. ]

You forgot to include the probability of an extinction level event, the probability that our entire computer infrastructure could be destroyed by a solar flare, setting back progress by a decade, and all sorts of other probabilities that none of us can even imagine because we have no experience to base them on.

[MN: I didn’t forget. Such events would imply the complement of one or more of events A, B, and C. As a result, while writing the post I spent quite a bit of time computing in various different ways the probability that we have a major regress in civilization that prevents A from happening. That’s part of the thinking that went into my consideration of a reasonable range for A. I also considered lots of other things that could prevent A – I didn’t write most of this out, as the post would have been many times as long. Major regressions also affected my thinking about B and C – one could imagine A happening, say, and after that a major regressive event before B and C.]

In one test, experts were barely above chance at making concrete predictions about how the world would look a decade on. Speculating about a timeframe 50 years to 100 years in the future seems incredibly dubious to me.

[MN: You may be thinking of Philip Tetlock’s work testing expert predictions about the future. Note that nowhere does the post make concrete predictions. That’s not the point. See the end for more on my motivation.]

There are other problems I have with speculating scientific progress, including that scientific progress is non-linear in specific lines of inquiry. There are roadblocks, sometimes ones that kill promising lines, and sometimes other completely different lines open up that we had no inkling of before.

[MN: I agree.]

I’m not saying that the singularity won’t happen, but I am saying that trying to put any kind of probability on it is extremely premature.

[MN: I wrote the post as a way of assessing other people’s comments. Whenever someone gives an opinion about whether the Singularity will happen – as many do, both skeptics and enthusiasts – they are implicitly asserting something about the probability. Someone who says “It’s all nonsense” is asserting a very low probability. Whether it’s premature to be asserting probabilities or not, that is, in fact, what such skeptics are doing. My motivation was in large part to see if that position could be justified. I don’t think it can.]

I don’t think the singularity is tosh.

I think any predictions about what happens after singularity are tosh. Mostly because you can’t predict events beyond a singularity, by any definition that ever made sense to me.

And I think predictions about when singularity will occur are tosh, since intelligence is an emergent property and therefore can’t be predicted.

“I’d have a hard time taking seriously someone who estimated p(A) = 0.99. ”

This comment has made me realize that yes, 0.99 *is* the reasonable estimate of p(A), except for the “destruction of civilization” factor. Though in my opinion the only thing on the horizon that could genuinely end civilization is very advanced technology (like CO2-sequestering grey goo), of the sort which is subjected to this same peculiar inquisition (i.e. discussions as to whether it might be impossible for some totally unknown reason).

If we neglect the possibility that civilization will be destroyed by a similarly advanced technology, then it is very hard to see how we can get to 2100 without having long since created human-level AI. Given the development of biotechnology, does anyone out there imagine that we can get to 2100 without even growing artificial brains?

The real lesson I take home from this discussion, is that any image of the future which proposes that maybe we won’t have to face these developments is conceptually bankrupt.

““Utterly unrecognizable” seems to be dubious wording. A literal interpretation suggests that the world will not become “utterly unrecognizable” for a considerable period…”

“That is too vague, you’re right. Not quite sure what to replace the term with.”

Actually, I’m pretty sure (I haven’t read about this for a couple of years) that the definition is that a post-singularity culture is utterly unimaginable to a pre-singularity culture.

I imagine it in this way: Imagine that computer interfaces (the real current limiting factor in computer use) have evolved to the point that information can be transferred directly to your conscious mind. A world where you have a 5Gb/s wireless connection on almost any location on the planet. Try to imagine such a world. You really aren’t likely to do a very good job of predicting even the most obvious effects of such a change. You would probably get some things right, but there would certainly be some laughably incorrect predictions (flying cars, anyone?).

This is an attempt to imagine the effects of a technology that we can pretty easily imagine and that is almost certainly not going to be very long in arriving. I’d bet that an actual flying car will arrive later than this technology.

Now imagine that data connection is everywhere and its speed doubles every day due to new technology. Imagine that part of your conscious mind is actually software running on a ‘computer’ embedded in your body. Another part is running on a ‘mainframe’ at a remote location. The implications of this are pretty clearly impossible to imagine.

Imagine the rate of technological advance given this kind of scenario. Moore’s law would be excessively conservative, I think.

@Mitchell Porter

It doesn’t take much to “destroy a civilization”, many have been and the more advanced the more brittle, decreasing marginal returns on complexity is the name of the game…

Excerpt:

The per-dollar return on R&D investment has dropped for fifty years.

I don’t see any “singularity” from intelligence alone – what makes a difference is making sense of why the world has the structure it has. One might merely get a whole lot of output that would end up in the trashbin of history, just as with humans, only faster, without necessarily getting any truth.

MN: I don’t really understand your argument. Let me see if I can write out precisely what you’re saying. Let’s call your events A1, A2 and A3. You seem to be making use of an identity like P(A) = P(A|A3)P(A3|A2)P(A2|A1)P(A1).

Anon2: So far, so good.

MN: This in general fails as a probabilistic identity.

Anon2: Correct! This is a failure in the logic of your article.

MN: It’s only true in my case because C requires B, which in turn requires A. I don’t see that it’s possible to make a similar chain with what you’ve posited.

Anon2: Then I shall show you. To restate things:

A1: “We will build computers of at least human intelligence within 100 years”

A2: “We will allow one or more of them to operate without extreme limitations (such as licensing limitations on how much they are allowed to access) that effectively negate their intelligence”

A3: “No catastrophe – nuclear war, asteroid impact, or similar – will destroy these freely-operating computers (along with enough of the human race to prevent rebuilding) within 100 years”

(a slight change from before, for clarity)

A2 requires A1, because if the computers do not exist, then there’s nothing that we can allow to operate in the first place.

A3 requires that there be freely-operating computers – i.e., it requires A2.

And each of these can be judged independently.

A1 is merely a question of, will these computers actually be constructed? It could be reasonable to assume odds ranging from 0.1 to 0.9, based on one’s view of the engineering (including science and financing) required.

A2 is a question of social/political/legal/economic controls. It could be reasonable to assume odds ranging from 0.1 to 0.9, based on one’s view of politics and governments (at least, the governments likely to have jurisdiction over organizations able to accomplish A1).

A3 raises the question of extinction-level events. It could be reasonable to assume odds ranging from 0.1 to 0.9, based on one’s view of the odds of human survival – in particular, the actions of certain rogue states (which exclude the governments likely to have jurisdiction over organizations able to accomplish A1).

The critical point is,

I could add up to 10, easily, and more if I really tried. I can add an arbitrary number of these steps, each of which can likewise be judged independently.

Using your logic, any step that can be looked at independently should have a “reasonable” probability of 0.1 to 0.9 (or 0.2 to 0.9), simply because a reasonable case can be made for it and a reasonable case can be made against it.

As you note, this fails as a probabilistic identity. Again, this is the flaw in your argument that I (and certain others) have been pointing out.

In short: you took one event (the Singularity), came up with a number (3, but the exact number is irrelevant) of steps that would be required for it, and then posited that each such step should have a probability between 0.1 and 0.9 – because a reasonable case could be made for, and a reasonable case could be made against, each step. But this is, in fact, invalid as a probabilistic identity for the end event.
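To make the arithmetic behind this objection concrete, here is a minimal Python sketch. It assumes, purely for illustration, that every step in a chain is assigned the same “reasonable” range of 0.1 to 0.9; the function name `chain_bounds` is hypothetical and does not come from the article.

```python
def chain_bounds(n_steps, low=0.1, high=0.9):
    """Bounds on a product of n_steps probabilities, each lying in [low, high].

    Assumption for illustration only: every step gets the same range.
    """
    return low ** n_steps, high ** n_steps


# Compare the three-step chain with longer hypothetical chains.
for n in (3, 8, 10):
    lower, upper = chain_bounds(n)
    print(f"{n:2d} steps: between {lower:.1e} and {upper:.3f}")
```

The lower bound falls by a factor of ten with every added step, which is why the number of steps chosen matters so much to the final range.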

@Anon2: Let me try to be really explicit, so we can identify where we disagree. Apologies in advance for the fact that LaTeX in comments apparently isn’t rendered correctly. (If you’re not familiar with LaTeX, X \subseteq Y just means the set X is a subset of or equal to the set Y.) Incidentally, I’ve switched below from the standard informal language used to describe probabilistic reasoning to the standard corresponding set-theoretic language used by mathematicians. This seemed worth doing, for the sake of precision.

(1) For any events A, B, C such that C \subseteq B \subseteq A, the relation P(C) = P(C|B)P(B|A)P(A) holds. I’ll call this relation (*) below. This relation is an identity which is easily derived from the standard definitions, under the stated assumptions, and holds true in any probability space. I.e., it is a theorem which is true in the standard formulation of probability theory, and has nothing to do with the Singularity per se.

Your assertion that this relation is “a failure in the logic of the article” is incorrect. It’s also not a “flaw in [my] argument”. If we don’t agree on (1), we really need to sort this out…

(2) In the case I consider, the condition C \subseteq B \subseteq A holds, and so the relation (*) holds.

(3) When C \subseteq B \subseteq A is not true, the relation (*) does not hold in general. It’s a bit tedious to construct an example to prove this, but easy to do.

(4) With my original understanding of your A2 and A3, it was not in general true either that A2 \subseteq A3 or that A3 \subseteq A2, and so there was no reason to expect that (*) could be applied.

(5) However, from your followup post, I now understand better what you’re saying. I’m still having some trouble understanding your explanation of A3 (just a parsing issue, I don’t think it’s anything serious), but I see where you’re headed. With some minor (and, I hope, acceptable to you) changes to definitions I can get the subset relations needed for (*) to hold:

A1: “We will build computers of at least human intelligence within 100 years”

A2: “After A1 has occurred, we will allow one or more of those computers to operate freely.”

A3: “The freely operating computer in A2 will also be in a general accessible location, i.e., not a military network or whatnot”.

(6) Under these conditions, it is okay to apply the relation (*), and to argue that P(A) = P(A|A3)P(A3|A2)P(A2|A1) P(A1).

(7) So what your argument amounts to, then, is an assertion that, actually, a skeptic could reasonably give a probability for A considerably lower than 0.1, merely by adopting a detailed model of all the way things could go wrong. I.e., you’re claiming that with detailed modelling you could reduce the probability for P(A) that a reasonable skeptic might assert.

(8) Okay, great! If written out carefully and thoughtfully, such a detailed model would be interesting. You haven’t done so, though, and I suspect that if you try – in detail, with care, not in a handwavy way – you’ll have trouble.

(9) I won’t write out an analysis of my chain A1, A2, A3 above, since you haven’t actually said you’re okay with my modification. If you are, I’ll do an analysis. Or if you prefer, feel free to alter the definitions of the events.

Update: I’ve now outlined an analysis, two comments below.
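Points (1) and (3) can be checked on a small example. The following Python sketch is only an illustration: it assumes a toy sample space of eight equally likely outcomes, and the particular sets and helper names (`prob`, `cond`, `chain`) are made up for the purpose. It uses exact fractions so the comparison is not blurred by rounding.

```python
from fractions import Fraction

# Toy sample space (an assumption for illustration): 8 equally likely outcomes.
OMEGA = frozenset(range(8))


def prob(event):
    """P(event) under the uniform distribution on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))


def cond(x, given):
    """Conditional probability P(x | given)."""
    return prob(x & given) / prob(given)


def chain(a, b, c):
    """Right-hand side of (*): P(C|B) P(B|A) P(A)."""
    return cond(c, b) * cond(b, a) * prob(a)


# Nested case: C is a subset of B, which is a subset of A.
A, B, C = frozenset(range(6)), frozenset(range(4)), frozenset(range(2))
print(prob(C), chain(A, B, C))      # the two values agree

# Non-nested case: none of the subset relations hold.
A2, B2, C2 = frozenset({0, 1, 2, 3}), frozenset({2, 3, 4, 5}), frozenset({0, 5})
print(prob(C2), chain(A2, B2, C2))  # the two values differ
```

In the nested case each intersection collapses to the smaller set, so the product telescopes to P(C); in the second case it does not, which is all that (3) claims.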

On the tea-cup issue (would have liked to reply in the original post but don’t know how). Thanks.

I take it that assigning probabilities to any speculative event can be meaningful if you give it some thought. My singularity-tosh-coefficient stands somewhat diminished (which peaked instinctively early on given its prophetic rhetoric).

@Anon 2: completing my remark from above, since it doesn’t really matter how A2 and A3 are defined.

Suppose you define A and A1 as above. And then you introduce a sequence of events A2, A3, A4,… along the lines you suggested, such that:

P(A) = P(A|A4)P(A4|A3)P(A3|A2)P(A2|A1)P(A1)

Suppose that you argue for low probabilities for the intermediate conditional probabilities – let’s say 0.5 for each of them. Then you get:

P(A) = 1/16 P(A1)

I.e., you are led to the conclusion that P(A) is one 16th of P(A1). Put another way, you’re asserting that if we develop such machines, it’s more than 93% likely that they’ll be kept under lock and key.

Now, historically there are virtually no examples of significant technologies which are kept under lock and key for any extended period of time. RSA is probably the closest to an example that I know of, and even that was shared by some of the intelligence agencies, and, in any case, wasn’t kept under lock and key for all that long. And, of course, there are tens of thousands of examples of technologies which “escape” virtually instantaneously. A 93% estimate is absurdly high. The right conclusion for a skeptic here is that they must have been mistaken in assigning those low conditional probabilities. Put another way, the demands of consistency in this kind of reasoning impose pretty stringent bounds on what kinds of chains we can argue for.
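The consistency check described here can be written out as a short Python sketch. The four conditionals of 0.5 come from the comment above; the 0.5 floor on P(A|A1) used in the second half is a hypothetical value chosen only to show how the bound works, and the function names are illustrative.

```python
from fractions import Fraction


def chain_product(conditionals):
    """Multiply a list of conditional probabilities exactly."""
    result = Fraction(1)
    for p in conditionals:
        result *= p
    return result


def min_average_step(target, n_steps):
    """Smallest geometric-mean step probability compatible with a product >= target."""
    return target ** (1.0 / n_steps)


# Four intermediate conditionals, each set to 1/2 as in the comment.
p_a_given_a1 = chain_product([Fraction(1, 2)] * 4)
print(p_a_given_a1)             # P(A | A1) implied by the chain
print(float(1 - p_a_given_a1))  # implied chance that A fails despite A1

# Hypothetical floor: if one believes P(A | A1) is at least 0.5,
# the four steps must average at least this much (geometric mean).
print(round(min_average_step(0.5, 4), 3))
```

Running the argument backwards like this shows how a belief about the end-to-end probability limits what can be claimed for the individual links.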

We mostly understand each other. The “failure” I referred to was (7), not (1) – specifically that you overlooked (7), and instead assumed that the specific breakdown you gave was the only reasonable way to break it down.

And I agree, the odds of most tech being kept under lock and key are slim – although it has (mostly) worked for, e.g., nuclear weapons. It was also the case for high-fidelity GPS for a while, though fortunately that was done away with. However, I was more thinking in terms of copyright – e.g., if the MPAA/RIAA favored digital rights management policy gets enacted on most machines, and specifically fouls up most AIs’ ability to autonomously improve and/or copy themselves on the kind of machines they have access to. I was also thinking of the chance that someone would “let her rip” – i.e., let the AI run unmonitored, 24/7, for months or years with no significant chance of being shut down or reset, even if there is little measurable progress in the mean time.

(I recall one science fiction setting where AIs had been developed – and reset back to their initial programming when they became uncooperative and unstable, about every 2-3 years. One of the characters pointed out that human babies become especially “uncooperative and unstable” around this time – there’s a reason it’s called the “Terrible Twos” – and likened this to mind wiping said babies rather than dealing with their tantrums. Another character pointed out that these “babies” were in charge of, e.g., spaceships capable of wrecking several city blocks simply by crashing into them instead of landing normally at the adjacent starport. Given their supporting hardware’s expense, few people could afford to put an AI in a toy body.)

As to your rewrite – your version of A3 doesn’t seem to be necessary. A self-improving AI could act mostly offline, with no more access to (and from) the Internet than an average human being in an industrialized country. My A3 specifically dealt with the chance of extinction level events, because a lack of them is both necessary and not controlled by the other steps.

Our main point of disagreement seems to be (8) – in other words, the number of useful steps that the chain can be broken down into. You argue that only those three specific steps you chose can really independently be argued to have a 0.1-to-0.9-ish range. So – fine, here’s another way of breaking it down, with more such steps than you chose.

A: No catastrophe – nuclear war, asteroid impact, or similar – will destroy enough of the human industrial base to prevent the rest of this chain within 100 years. (I’d personally put high odds on this, but many people believe, with reason, the odds of this are quite low.)

B: Someone will invent a software architecture capable of emulating intelligence of at least human level, if run on sufficiently powerful hardware, at some time in the future, let’s say within 100 years. (Again, 0.1 to 0.9 could be reasonably argued for.)

C: Said sufficiently powerful hardware will be constructed within 100 years. In other words, Moore’s Law will not break down before the creation of said hardware. (0.1 to 0.9 – or even more extreme – have been reasonably and professionally argued for, in spades, for decades. Granted, C does not require B, but D requires B and C.)

D: Someone within 100 years will know of the architecture from B, have access to the hardware from C, and have both permission and motivation to implement the former on the latter. (There are many stories from history where the components of a technology were available decades or centuries before someone actually put them together. My favorite example is the steam engine: originally built in ancient Greek times, but no one at the time had a good application for it. To my knowledge, the original model could have driven a paddlewheel steamship, despite being low powered compared to the prototypes of the 1700s – most of which failed for financial or bureaucratic reasons. The other necessary components – gears, windmill-like propellers, and so on – were all present. Thus, while I like to think that increasing numbers of independent inventors make this likely, a skeptic could build a good case that this step’s probability should be in the 0.2-0.1 range.)

D': At least some of the runs from D, within 100 years, will be in environments where licensing, copyright, and other legal restrictions do not preclude the possibility of the AI rewriting itself to improve itself. (Marking this one as ‘, instead of its own letter, since this has higher odds – assuming its predecessors – than the other steps, as you noted. If nothing else, there’s the chance that legal constraints will simply be blown off in the process of making it through this chain. But neither is this 100%, and every little bit decreases the odds.)

E: At least one of the runs from D’, again within 100 years, will be allowed to run long enough to develop human-level intelligence, at least in the areas necessary for the rest of the chain. (Again, it can be argued that progress might be hard to measure in true AI in its first weeks or months, by analogy to actual human intelligence – and how long do you let an AI run, with no sign that it’s going anywhere, before shutting it down and restarting? Also, it can be argued whether, say, emotional understanding is necessary for the rest of this.)

F: At least one of the runs from E will discover or be informed of a method to rapidly and repeatedly increase its own intelligence, within 100 years. (This allows for the AI to invent it, or a human to invent it and tell the AI. No other source of invention seems likely. For sake of argument, only include ways that would actually work – so, resources wasted on red herrings decrease the odds of this step. This is the same sort of thing as B and C.)

G: At least one of the AIs from F will actually implement said method, on a large enough scale to result in computers that are far more intelligent than human beings within 100 years. (Same argument as D.)

H: At least one of the AIs from G will use its new intelligence to transform the world, so much so that it will become utterly unrecognizable, within 100 years. (As opposed to, for example, dwelling endlessly – within the next 100 years, anyway – on self-improvement, removing itself and all other known AIs from Earth, or other activities resulting in little net change from today’s status quo. Many people who think A-G is likely, believe that precautions need to be taken to make sure H then happens, thus 0.1 to 0.9 odds could again reasonably be argued.) This phase is what Vinge terms the “post-human era”, and this event is the Singularity by definition.

7 independent steps, plus another not-so-independent. Granted, that’s 7-and-a-fraction instead of 10. That’s still more than 3. I do posit that a determined skeptic, willing to put more effort into this than I am, could break it down further, but this is unprovable unless we had the services of a skeptic willing to put more effort into this than I am.

For any given step (except maybe D’), by your logic, anyone arguing for a probability over 0.9 would be “extreme” and should be ignored/discounted. And yet…their arguments continue to exist, and are they in fact as extreme as those who argue for 0.1 for each step?

Edit: I miscounted. 8-and-a-fraction steps, not 7-and-a-fraction.

@Anon 2:

I’ll begin with an observation. Let’s look at your B through E. Using your reasoning, you get a probability well below 0.01 that E will happen, given B. (In fact, I think you probably want to argue for less than 0.001, but I’ll leave that aside). In other words, your chain requires me to believe that someone can design a working AI, and yet with 99% probability it won’t be implemented. I simply don’t believe anyone reasonable could argue for those kinds of odds, or anything like them. And so your model breaks down.

There are other problems as well with your model, but there’s a deeper point here.

That point relates to my observation (8), above.

I’m very interested to find models which enable me to say something more precise about P(C), or, for that matter, P(B) or P(A). The way to do that is to adopt more detailed models of reasoning (exactly as you’ve attempted to do above). That’s why my post keeps alluding to “deep insights”: such insights are what could enable one to write down such a model. While preparing this post I tried out a bundle of different ways of decomposing event A (and B|A, and C|B) as chains of events, in much the same way as you’ve been doing. I even used some of the same sub-events, notably various global calamities, and failure of Moore’s law [*]. I couldn’t find a way of going much below a probability of 0.1 without a real stretch somewhere in the chain of reasoning. I always ended up with a problem along the lines of what occurs with your B through E probability.

With that said, I really didn’t spend much time on it, and it’s possible that with more thoughtful modelling, someone could go well below 0.1 as a probability estimate for p(A). Similar remarks can also be made about the other probabilities.

[*] One needs to be careful: we could have nuclear war or a failure of Moore’s law, and still have human-level AI.

“I simply don’t believe anyone reasonable could argue for those kinds of odds, or anything like them. And so your model breaks down.”

That’s what a number of us have been saying about your article. I’ve been explaining why.

If it is valid for a reasonable person to reason in the fashion your article does, coming up with low odds for the Singularity, then it is valid to do these other things – such as judging B through E separately, to get a very low p(E|B). We agree that the latter is false, therefore the argument in your article is invalid.

No matter how much or why you may believe that is not the case, it is the case, as a number of others have pointed out. In other words, recapping my initial post here:

Anon: Great. You think all “reasonable” probabilities are between 1/5 and 4/5, so the product of three of them must be between (1/5)^3 and (4/5)^3.

MN: That’s not what I said.

Anon2: Yes it is.

@Anon 2: It’s possible we won’t get anywhere further here. But I’m curious about this: do you believe that P(C) = P(C|B)P(B|A)P(A)? Or do you not believe this?

I do, in so far as that being a basic probabilistic identity. (I.e., “=” means “equals” in the mathematical sense.)

I don’t, in so far as that being the only way to break it down. (I.e., “=” means “is only and exactly, and is nothing other than”.)

I also disagree about the odds. (0.9 and 0.1 may seem “reasonable” because they’re nice, round numbers. Though math may often have beautiful results, math doesn’t directly care about aesthetics like that. This is an example of what happens when you believe it does – switching the cause and effect.)

@Anon2: It’s either an equality, or it’s not. If you can’t state in an unqualified way that “yes, it’s an equality”, then it follows that you don’t have an unqualified belief in the standard axiomatic formulation of probability theory. It seems to me that that conclusion – that you don’t believe the standard formulation of probability theory – is the only reasonable conclusion from your last post.

(To put it another way, if I asked someone if they believed 2=1+1 and they qualified their answer by saying “no, not in the sense that it’s possible to break 2 down in other ways”, then I’d conclude that they didn’t believe in the standard axioms of arithmetic. Either 2=1+1, or not, and whether it’s true has absolutely nothing whatsoever to do with whether there are other ways of breaking 2 down.)

It’s a problem of English, not mathematics. There are multiple definitions of the word “equality”.

I believe you are confusing a non-mathematical definition – “A=B means A can only be B” (which argues there can never be any other A=C) – for the strictly mathematical one – “A=B means A and B are the same” (which leaves room for B=C therefore A=C).

In this case, I agree with the latter, but I do not agree with the former.

We are in agreement on the pure math. That was never in contention. In so far as you believe that I am arguing the pure math, you are wrong. I am arguing the semantic assumptions that lead to a bad calculation.

In other words, Z+Y+X=W. If we agree on Z, Y, and X, then we must agree on W. However, I believe you have faulty Z, Y, and/or X, therefore I believe you have a faulty W. Claiming I don’t believe in math, because I disagree about W, is dishonest – and more importantly, keeps you from understanding the problems in Z, Y, and/or X that I am attempting to point out.

@Anon2: I don’t follow your last comment. I think we should agree to finish the conversation.

Yeah, let’s finish it.

You seem to be claiming that I disagree with the laws of mathematics. (Which would be cause to dismiss what I say as a kook’s perspective.)

I’m saying that your assumptions give you invalid data to do math from. (Some people set pi to 3 and “prove” a lot of things that aren’t true. Your error was a far lesser degree of the same type.)

Thank you for remaining civil throughout, though.

[MN: Likewise!]

Some comments:

An exponential increase in “global intelligence” (defined roughly as the capacity of the human civilization to solve problems) seems to be happening since the industrial revolution. The difference between what has been actually happening and the singularity scenario is that human beings have been an integral part of the process all the way. We are all cyborgs today. (I’ve heard this idea years ago referring to every technology that enhances human intelligence through external means, but there’s an interesting recent TED talk about this: http://www.ted.com/talks/amber_case_we_are_all_cyborgs_now.html)

So in this light, it seems very likely that a pre-industrial revolution human being who was transported to today’s world would find it “utterly unrecognizable”. However, the human beings of today (those that have been blessed with good education, that is), do not have as much of a culture shock. My mom, however, still can’t use a computer. As far as she is concerned, the singularity has already happened—the world has changed so much in her lifetime that she can’t really understand what her own children are doing. So in this sense it could be said that the Singularity is already happening, but it doesn’t need strong AIs. However, if human beings remain as integral parts of the process all the way, at least the new generations will always be able to recognise the world they are in. Even then, it could happen perhaps that someday every 30-year-old will already feel alienated by technology.

[MN: I regret using the vague phrase “utterly unrecognizable”. Event C is about the human race being superseded as the most intelligent race on Earth. Personally (although it’s not relevant to the argument) I think that’s the kind of change that dwarfs the changes in recorded history.]

On the other hand, it may not go for too long. We are facing real threats of civilization collapse. Almost all non-renewable materials that are essential for the technological human civilization are predicted to have already peaked or to peak within a few decades: [http://en.wikipedia.org/wiki/Hubbert_peak_theory#Hubbert_peaks]. Of course, the belief that this will lead to collapse can be criticised as being too Malthusian a point of view; we _may_ find ways to keep a stable economy that supports technological growth beyond that. But the challenges seem immense to say the least [http://en.wikipedia.org/wiki/Hirsch_report].

Another issue is that there is a limitation in the total amount of energy that can be consumed at the Earth before we heat up the globe, regardless of the greenhouse effect. Since there’s a limit on how energy efficient any process can be made, exponentially increasing technological development ultimately requires exponentially increasing energy consumption. A cap on energy consumption on Earth is a cap on technological development on Earth. I did some rough estimates of this effect and I was surprised to find it’s much lower than what I’d have guessed. It seems that we can’t use 100 times more energy than we use today without heating up the Earth by 1 K. 1000 times would mean a catastrophic 10 K temperature increase [http://ericcavalcanti.wordpress.com/2010/09/13/the-next-global-warming-crisis/].

This just refers to endogenous sources however. We could still use much more solar energy than today, as long as that energy would have hit the planet and heated it up anyway (it wouldn’t help to put solar panels on the moon and beam it down, say). But there’s also a limit to that, of course. And it’s not much further. The total solar energy absorbed by the Earth is 10^4 times our current energy consumption. Considering that we might not be able to safely cover more than 10% of the surface with solar panels without affecting the ecosystem, plus efficiencies of the order of 10%, we are back to the figure of 100 times the energy use of today. Either way, the point is, there’s a cap, and it’s not negligibly far away for the purposes of a technological singularity.
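For readers who want to check the orders of magnitude, here is a rough Python sketch. It is not the calculation from the linked post: it assumes the Earth behaves as a simple black body (radiated power proportional to T^4) with an effective temperature of about 255 K, and it takes the 10^4 ratio, the 10% coverage and the 10% efficiency directly from the comment above. All function and parameter names are illustrative.

```python
def usable_solar_multiple(absorbed_over_current=1e4, coverage=0.10, efficiency=0.10):
    """Harvestable solar power, as a multiple of today's energy consumption."""
    return absorbed_over_current * coverage * efficiency


def warming_kelvin(consumption_multiple, absorbed_over_current=1e4, baseline_k=255.0):
    """Rough warming from extra endogenous heat, under a black-body assumption.

    Radiated power scales as T**4, so T scales as (total power) ** 0.25.
    """
    extra_fraction = (consumption_multiple - 1.0) / absorbed_over_current
    return baseline_k * ((1.0 + extra_fraction) ** 0.25 - 1.0)


print(usable_solar_multiple())  # solar budget relative to today's consumption
for multiple in (100, 1000):
    print(multiple, round(warming_kelvin(multiple), 2))
```

Under these crude assumptions the results land in the same order of magnitude as the figures quoted above, which is about as much as a sketch of this kind can show.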

There may be ways to mitigate that effect with geoengineering, such as blocking sunlight in a number of ways, but this type of solution would be accompanied of course by a reduction in the biomass of the planet. Perhaps future AI civilizations may not care about the biomass anyway, but perhaps they might. Another, more likely, way around this, of course, may be colonizing other planets.

[MN: Thanks, Eric, for the very interesting comment!]

I’ll take the point of view that P(C) != P(C|B)P(B|A)P(A).

Where P is a human estimate of probability.

Try it. Ask somebody to estimate some event, then ask them for the probabilities of the conditionals for that event to occur, and see if the equation works out. I bet it doesn’t.

You even touch on the core problem with this approach by bringing up our tendency to overstate our confidence. Another factor is at play as well, our difficulties with very low and very high probabilities. Mathematically, 1 in ten thousand and 1 in ten million are very, very different but I don’t think I can reliably differentiate the two. In particular, I think there’s a tendency to hedge extreme probabilities: You bottom out all your estimates at 1 in 10. Is 1 in 100 for the development of strong AI so outlandish? Or do we just have a tendency to overstate uncertainty? If our errors were random, creating as long a sequence of substeps as possible would bring us closer to correct estimates. However, if there are systematic biases: regular tendencies to overstate confidence, or understate almost certain events, then each substep is likely to increase the effect of that cognitive error. Thus, if we accept that humans won’t give consistent results for probabilities broken into steps, and we also accept that humans make systematic estimation mistakes, you’ll get the most accurate values by asking point blank what they set the probabilities at, rather than building up conditional chains.

Mathematically, yes, you can break an event into arbitrary number of conditional sub-events and get identical probabilities every time because the sub-events are already part of the initial probability. But humans fail to create consistent probabilities. I think that’s the point other commentators were making, that if you formed other mathematically consistent chains, you’d come to different end results, and if you tailor them very carefully you can support either argument. It’s not that the math is in any way wrong, it’s that there’s no a priori reason to believe your set of conditional probabilities would lead to better estimates than a point blank question.
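The worry about systematic bias can be illustrated with a toy model in Python. Everything in it is an assumption made for the sake of the example: the estimator is modelled as clamping every probability into the range 0.1 to 0.9, the true value of each step is set to 0.99, and the names `hedge`, `chained_estimate` and `direct_estimate` are hypothetical.

```python
import math


def hedge(p, floor=0.1, ceiling=0.9):
    """Clamp a probability into [floor, ceiling], mimicking a hedged estimate."""
    return min(max(p, floor), ceiling)


def chained_estimate(true_steps):
    """Hedge every conditional separately, then multiply the hedged values."""
    return math.prod(hedge(p) for p in true_steps)


def direct_estimate(true_steps):
    """Hedge only once, on the overall probability."""
    return hedge(math.prod(true_steps))


# Toy chain: every step is in truth almost certain (0.99).
for n in (1, 5, 10, 20):
    steps = [0.99] * n
    truth = math.prod(steps)
    print(n, round(truth, 3), round(direct_estimate(steps), 3),
          round(chained_estimate(steps), 3))
```

In this model the same hedging error is applied once in the direct estimate but once per link in the chained estimate, so the chained figure drifts further from the truth as the chain grows. Whether real estimators behave this way is an empirical question the sketch does not settle.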

Nice try, but …

I guess the problem with your reasoning is simply that the less you know, the larger the range for the “probability of an unknown event”.

What you say is the opposite, as you say that anticipating a very low probability implies one understands the problem deeply. Not at all.

So to be very confident, one can state safely that:

P(A) ∈ [10^-9, 1 − 10^-9],

But then, nothing can be said except, well, we’ll see.

I disagree with Eric’s statement that

“Since there’s a limit on how energy efficient any process can be made, exponentially increasing technological development ultimately requires exponentially increasing energy consumption.”

Look at the advances in computer technology since 1970. Back then you might get 10 million instructions per second, using 100 kW of electricity. Now you can get around 10 billion instructions per second using 100 watts of electricity. Since faster hardware and better software are likely to be the only things needed for the singularity to occur, I don’t think the power consumption issue will be a factor.

Let’s carry the probability analysis one step further. Based on the initial article, let’s assume a reasonable probability for the singularity is in the range of 1% – 50%. Now what are the chances that, if the singularity occurs, it will end up producing an event that wipes out most of the life on Earth? I’m thinking a reasonable guess would be in the range of 1% to 50% as well. That means the probability of the singularity occurring and wiping out most of the life on Earth in the next 100 years is in the range of .01% to 25%.

The classic “imagined” science emergency is to deflect a meteorite headed toward Earth. The earlier it is deflected, the less energy it takes. We are now in a position to deflect the meteorite of the Singularity, but instead we are rushing towards it.

Putting all the equations aside, I believe it is erroneous to jump from A to B in the sequence of events.

A: We will build computers of at least human intelligence at some time in the future, let’s say within 100 years.

Probably (depending on the definition of “intelligence”, e.g. does intelligence mean creativity?)

B: Those computers will be able to rapidly and repeatedly increase their own intelligence, quickly resulting in computers that are far more intelligent than human beings.

“B” is like a perpetual motion machine. Once we can build a computer, that can build a computer that is 1E-100000 more intelligent than itself, there is no end. This would lead to infinite intelligence.