The Wikipedia Paradox

To determine whether any given subject deserves an entry, Wikipedia uses the criterion of notability. This leads to an interesting question:

Question 1: What’s the most notable subject that’s not notable enough for inclusion in Wikipedia?

Let’s assume for now that this question has an answer (“The Answer”), and call the corresponding subject X. Now, we have a second question whose answer is not at all obvious.

Question 2: Is subject X notable merely by being The Answer?

If the answer to Question 2 is “no”, then there’s no problem, and we can all go home.

If the answer to Question 2 is “yes”, well, we have a contradiction, and in a manner similar to the interesting number paradox, it follows that Question 1 must have no answer, and so every conceivable subject must meet Wikipedia’s notability criterion.

Take that, deletionists!

Here’s the amusing thing: whether the answer to Question 2 is yes or no depends on where I publish this analysis. If I publish it on my blog and no-one pays any attention, the answer to Question 2 is, most Wikipedians would likely agree, “no”.

But suppose I went to great trouble to convene a conference series on The Answer, was able to convince leading logicians and philosophers to take part, writing papers about The Answer, convinced a prestigious journal to publish the proceedings, arranged media coverage, and so on. The Answer would then certainly have exceeded Wikipedia’s notability guidelines, and thus the answer to Question 2 would be “yes”.

In other words, whether this is a paradox or not depends on where it’s been published 🙂

(This line of thought was inspired by a lunchtime conversation two years ago with a group of physicists. I don’t remember who, or I’d spread the blame.)

Update: A number of people have made comments along the lines of “But aren’t you assuming a well-ordering” / “What if the most notable article isn’t unique” and so on. It’s easy to modify Question 1 to deal with this: all that’s needed is (a) for the set of non-notable subjects to be well-defined; and (b) for there to be some way to pick out a unique one from that set. Point (a) is, of course, debatable, but outside the scope of the game, which starts by assuming that the Notability policy is well-defined to start with. With that, point (b) follows because the set of possible subjects on Wikipedia is a subset of the set of Unicode strings, and is thus countable.
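
The selection step in point (b) can be sketched in code. This is only a minimal illustration, not anything Wikipedia actually does: the two-letter alphabet, the `shortlex_strings` generator, and the `is_notable` predicate are all hypothetical stand-ins. The sketch lists strings in shortlex order (shorter strings first, ties broken alphabetically) and returns the first one the predicate rejects.

```python
from itertools import count, product
from typing import Callable, Iterator


def shortlex_strings(alphabet: str) -> Iterator[str]:
    """Yield every non-empty string over `alphabet`, shortest first,
    ties broken alphabetically. Every string shows up at a finite position."""
    letters = sorted(alphabet)
    for length in count(1):
        for chars in product(letters, repeat=length):
            yield "".join(chars)


def first_non_notable(alphabet: str, is_notable: Callable[[str], bool]) -> str:
    """Return the first string in shortlex order that `is_notable` rejects.
    Never returns if the predicate accepts every string."""
    for subject in shortlex_strings(alphabet):
        if not is_notable(subject):
            return subject
    raise AssertionError("unreachable: the generator is infinite")


# Hypothetical toy predicate: pretend only these three subjects are notable.
NOTABLE = {"a", "b", "aa"}
the_answer = first_non_notable("ab", lambda s: s in NOTABLE)
print(the_answer)  # prints "ab"
```

Shortlex is used here rather than plain dictionary order because dictionary order on an infinite set of strings is not a well-ordering: “b”, “ab”, “aab”, “aaab”, … is a descending chain with no first element. Any enumeration that reaches every string after finitely many steps would serve equally well.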

44 comments

  1. Nice, indeed. But you cannot assume that the answer to question 2 is “yes”, so I guess it’s not really a blow of any kind, just an amusing facet 🙂

  2. *lol*

    Str-rrr-ike!
    Thanks for paying this tribute to the respect every sentient being should express before any lemma, sentient or not… 😉

    BTW: Notability is a construction based upon assumptions, that IMHO held true ONLY in the time before Wikipedia existed. This notion of notability is an expression of money-power over media-induced attention – as we have WP, that’s all more or less bunk.

    Again – thanks for making my day! 😀

  3. Very amusing, but nothing more than a joke 😉

    Let’s assume that x is the largest negative real. From the axioms of order it follows that x/2 is also negative, but x/2 > x! Therefore we can (correctly) conclude that there is no such number. By your reasoning, you would then (falsely) conclude that there are in fact no negative numbers!

    In other words: you assume that the order relation of notability is a well-order, an order in which every subset has a greatest element. But this assumption doesn’t have to be right (as I tried to illustrate with the “common” order relation on the real numbers).

    But as I said, very funny indeed 😉 And interesting nevertheless, maybe one could eliminate that fault… If the set of all topics were finite, for example, every order relation would have to be a well-order, if I’m not mistaken. 😉 But for now, let’s just settle for “funny – but not logically correct” 😉

  4. “it follows that Question 1 must have no answer, and so every conceivable subject must meet Wikipedia’s notability criterion.”
    For this conclusion, it needs to be proven that the set of all subjects can be well ordered by notability, right? 🙂

  5. The paradox’s solution lies within the assumption that such an Answer exists. That existence is not at all obvious. In fact I’ve an argument suggesting the opposite.

    You cannot compare each topic’s notability with one another so the ordering applied isn’t a well ordering. It follows quite frankly that the existence of maxima/minima of a finite set is not given by necessity. In a different wording, one needn’t find such a minimal criminal.

    Suppose however one could do this, so we’d have a bijective projection of the set of Topics together with notability (T,N) onto the real numbers with number-comparison (R,<).

    As for instance each complex number (C) can be a topic, you'd have at least one field of which it is proven that one isn't able to build a well ordering on top of it which is consistent with the field axioms. As the space of all topics must be considerably larger (C \subset T), to me it seems kinda hopeless to find any projection like the one that would be required 🙂

    Nevertheless I'd agree that there arises some kind of problem (as always happens when you allow self-references in logical systems).

  6. What about the case that Question 1 does not have a _unique_ answer? In that case “The Answer” would be a (possibly infinite) list of subjects. But in Question 2 you assume that you can identify a single subject X with The Answer. I suppose you could still force The List (or at least the fact of its existence) to become notable, but that would not imply that every single subject on it became notable.
    Maybe you need to assume some kind of notability-well-ordering on the set of subjects. But this leads to the question: What is the most notable topic of all? (42, I guess…)

  7. Uhm yeah. Just use the lower bound then:

    Question 1′: What’s the _least_ notable subject that’s notable enough for inclusion in Wikipedia?

    And you won’t run into any problems.

  8. Nice, good fun! Two thoughts occur:

    1. Presumably the levels of interest/research/events etc. in all topics are in constant flux. If one ran a conference on, debated about, and composed papers on subject X, presumably that would be sufficient new activity in its sphere to raise it above the notability bar?

    2. Boiling a question down to a binary choice is nearly always reductio ad absurdum. Could not the answer to the question be ‘no, but a significant article pointing towards the title of the most notable of unnoted topics could be.’

  9. Although, to be fair, perhaps a more pertinent point is that no subject should be deemed inadmissible to an encyclopedia of the nature of wikip. anyway.

  10. The answer to the second question is most likely “no”, because there is a subject X for each of many different fields (e.g. mathematics, politicians from Italy, beer, …). That would be quite a lot of subjects X, so being a subject X does not make it more notable.
    Also you are assuming notability is exactly measurable. I don’t think it is. And that would mean there is no subject X but an area X where the subjects are not definitely notable but may be. And therefore the answer to question 2 is “no”.

  11. You silently assume that notability imposes a partial order between subjects. Otherwise there would be no single “most notable subject that’s not notable enough for inclusion in Wikipedia”.

    But for deletionists notability is a boolean function.

    So, although the Wikipedia Paradox is funny for “us”, it won’t persuade any of “them”.

  12. All other nitpicking apart, I am sure you didn’t fail to notice that your inductive step is rather costly:

    >>But suppose I went to great trouble to convene a conference series on The Answer

    In the interesting number paradox, the next almost-interesting number becomes interesting the very moment the current number has its interestingness acknowledged. Unfortunately, doing conferences on The Answer will only establish relevance for one Answer (and the process of finding The Answer itself, which is the induction scheme).

    I’m afraid, these darn deletionists do still have us on.

  13. Isn’t it The Answer as a concept that becomes important in the latter case rather than what The Answer actually is? Because the focus shifts from finding “the” Answer to finding an Answer and then beginning the search again. So it follows that the search is what is actually important.

  14. Here’s the deletionist version of the same questions:

    Question -1: What’s the LEAST notable subject that is included in Wikipedia?

    Question -2: Is subject X actually NON-notable, merely by being The Answer to question -1?

    (This is related to the proof that all numbers are boring. Let n be the smallest interesting number. Who cares?)

    There’s another problem besides partial ordering. Both your questions and mine assume a certain independence of notability. But in fact, subject X may be Wikipedia-notable only because there is no Wikipedia article about subject Y, and vice versa.

    [Jeff: I don’t see why the answer to your Question -2 would ever be yes. Your other point is interesting. I think it can be defeated by introducing equivalence classes of subjects paired like your X and Y, and asking the question about those equivalence classes. That’s getting pretty Byzantine, though, and isn’t as good a joke as the original question.]

  15. I would argue that by creating the conference, you are altering the properties of subject X such that it no longer meets the criteria by which you chose it. However, we have now developed an interesting algorithm to cause any subject to become Wikipedia-worthy.

  16. What you’d be promoting with the conferences, etc., would be the structure of the paradox, not necessarily the referent of the paradox. Conferences on “the answer” might be Wikipedia-notable, but some topic fulfilling the requirements for “the answer” would still not necessarily be Wikipedia-notable.

    In other words, you’re confusing the referer and the referent, a trivial mistake made by most non-programmers.

  17. Regarding your update: I don’t think it works. Your point (b) says that there is an ordering, but that is not enough. There are likely to be zillions of possible orderings, with no apparent reason why one specific ordering should be chosen. The “alphabetically first non-notable topic” or something like that does not seem to be an obvious candidate for inclusion.

    You need an ordering whose properties match the concept of “notability”, but that concept in itself might not have the properties to make ordering possible. Your aunt’s dog and my aunt’s dog might be intrinsically equally notable, just as points A and B on a plane can both have the same distance from point C.

    [MN: No, I don’t. With the idea of the updated version of the post in mind, I can pick any (fixed) ordering, and then simply lobby to make the corresponding “The Answer” notable, as already described in the post. I do agree that the joke is funnier if one uses an ordering that captures notability, but it’s not logically necessary for the argument.]

  18. I agree with Julius; there’s no paradox in “The Answer” being notable but The Answer being non-notable.

    [MN: I quite agree that there’s no logical paradox. But the situation I’ve sketched out in the post is intended to be a situation where both are notable; in your terminology, the conference series makes both “The Answer” and The Answer notable. That’s perfectly reasonable.]

  19. [..] and then simply lobby to make the corresponding “The Answer” notable
    True, but then you can just pick a random topic that isn’t covered, lobby until it is included, pick a new random topic, lobby, etc. If there is a finite set of topics, eventually all topics could be in there.

    [MN: Yes, indeed, that’s part of the conclusion, as stated in the post.]

    The assumption in your post is that lobbying will be somewhat easier for a topic that somehow stands out among the non-covered topics. Just taking a random ordering won’t help your lobbying efforts 🙂

    [MN: Nowhere do I make that assumption. The argument is that the topic will be notable merely by virtue of being The Answer.]

  20. Generally, these sorts of things are resolved by cutting the Gordian knot—someone says “OK, this paradox is silly, let’s call this article (non-)notable and that’s it.” Common sense is supposed to prevail over slavish application of rules, and so there’s even a rule “Ignore all rules” (which admittedly has a clause in there “where it helps the encyclopedia”, but that’s so that people don’t try to apply that rule strictly, either!).

    This particular situation would probably be resolved as non-notable: the community generally takes a dim view of notability games. One time, a pair of artists tried to make a Wikipedia page about itself (“Wikipedia Art”, it was called) and the community shot it down relatively quickly, despite the authors even having gotten a bunch of publicity for their little stunt, timed to appear right after the article itself (so as to try to justify both the page and the news).

    When thinking about Wikipedia’s rules, realize that you’ll never, ever see a truly recursive pattern, because it will get stopped by someone saying “this is ridiculous” before it can iterate very far.

    [MN: Agreed. ]

  21. With that, point (b) follows because the set of possible subjects on Wikipedia is a subset of the set of Unicode strings, and is thus countable.

    This argument is incorrect. The fact that a set is countable doesn’t guarantee that it has a “largest member”.

    For instance, the set of rationals between 0 and 1 is countable, because we can list them all:
    1/2
    1/3
    2/3
    1/4
    3/4
    1/5
    2/5
    3/5
    4/5

    But there is no largest member in this set.

    [MN: Point (b) is that there be a way to pick out a unique, i.e., well-defined, member from the set. It doesn’t say anything about it being a maximum, or a minimum, or anything else like that, and that’s not needed for the argument to go through. Of course, as I said above, it’s not as amusing as “the most notable topic not notable enough to be in Wikipedia”, but still leads to the same conclusion: everything’s notable.]
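
The distinction in this exchange can be made concrete with a small sketch. The function names are illustrative only, and the listing order is the one the comment uses (by increasing denominator). The rationals between 0 and 1 have no largest member by value, yet the listing gives every member a definite position, so “the first listed member with property P” is well-defined whenever some member has P.

```python
from fractions import Fraction
from itertools import count
from math import gcd
from typing import Callable, Iterator


def rationals_in_unit_interval() -> Iterator[Fraction]:
    """List each rational in (0, 1) exactly once: 1/2, 1/3, 2/3, 1/4, 3/4, ..."""
    for denominator in count(2):
        for numerator in range(1, denominator):
            if gcd(numerator, denominator) == 1:  # skip non-reduced duplicates
                yield Fraction(numerator, denominator)


def first_listed(has_property: Callable[[Fraction], bool]) -> Fraction:
    """Return the earliest listed rational with the property.
    Never returns if no rational in (0, 1) has it."""
    for q in rationals_in_unit_interval():
        if has_property(q):
            return q
    raise AssertionError("unreachable: the listing is infinite")


def bigger_than(q: Fraction) -> Fraction:
    """Witness that q is not the largest member: the midpoint of q and 1."""
    return (q + 1) / 2


q = first_listed(lambda r: r > Fraction(9, 10))
print(q, bigger_than(q))  # prints "10/11 21/22"
```

`bigger_than` shows why no maximum exists, while `first_listed` shows that a unique choice is still available through the position in the listing.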

  22. If you abandon notability as your ordering scheme… then, sure, you can find some ordering that gives you a clearly-defined “first member not in the set of notable articles”. But there are infinitely many such orderings, with infinitely many answers – so what you’ve really done here is to shift the problem from proving the notability of the article to proving the notability of your chosen ordering scheme. I’m not sure this really helps 🙂

    [MN: I can just pick an arbitrary one – say, lexicographic – and go with it, and start promoting The Answer. Whether or not the ordering was notable would be incidental to establishing the notability of The Answer.]

  23. Isn’t this a reworking of Bertrand Russell’s barber paradox? The example used in Logicomix (highly recommended graphic novel) is of a book which lists non-self-referential books – should it include itself, thereby entering the paradox?

    Similarly a Wikipedia page of subjects not covered by Wikipedia would negate itself.

  24. Notability exists on layers. For example, what and who is notable in Western US history? Given a yes answer to one subject, say gold mining, or the building of an economy in a well defined region of the west, based on gold or other forms of money or trade, a second-level notability topic exists IFF it directly connects to the first layer in a meaningful manner. That is, no notable subject exists independently, in a journal (as several threads have proposed), or in the real world. Real world, and the economic or social worth of a product, and the economic/social context of an individual providing meaningful leadership provided the gold standard, QED. Milo Gardner

  25. Notability exists in mathematical layers as well. The historical threads that first built numeration, arithmetic, algebra, geometry, weights and measures, and higher mathematical topics are notable. The gold standard in math history is not provided by journals and modern paradigms concerning modern mathematics.

    The ancient math history gold standard that qualified a topic as notable are the ancient texts that report one or more numeration, arithmetic, algebra, geometry, weights and measures or higher math foundation. Let the ancient texts speak for themselves, absent modern censors, or revisionists – who wish history had taken a different course.

    For example, Archimedes’ creation of calculus was born outside of the modern view of the ‘limit theorem’. Archimedes’ calculus did not primarily use the method of exhaustion (though fragments of the modern idea are reported in his finding the area/volume of a section of a parabola). Dijksterhuis documents in “Archimedes” that Heiberg showed in 1906 that Archimedes converted a 1/4 geometric (infinite) series

    A + A/4 + A/16 + A/64 + … + A/4^n + …

    (the modern method of exhaustion fragment)

    to a (finite) Egyptian fraction series

    A + A/4 + A/12

    as used from 4,000 BCE to 1454 AD (within Fibonacci’s 1202 AD Liber Abaci – Europe’s arithmetic book for 252 years).

    Milo Gardner

  26. It’s probably tedious to continue this post, but whatever.

    > Question 2: Is subject X notable merely by being The Answer?

    Very clearly not. There is no self-referentiality in the notability guidelines, nothing to make this even close to being possibly ‘yes’.

    > But suppose I went to great trouble to convene a conference series on The Answer, was able to convince leading logicians and philosophers to take part, writing papers about The Answer, convinced a prestigious journal to publish the proceedings, arranged media coverage, and so on. The Answer would then certainly have exceeded Wikipedia’s notability guidelines, and thus the answer to Question 2 would be “yes”.

    Another commenter points out an ambiguity in the wording – is the conference and all the coverage on the Answer or on the paradox? If the latter, then the Answer still has no notability. If the former, then nothing objectionable has happened and there is no paradox.

    Let us imagine that the Answer is the very interesting podesta political system used in medieval Italy (https://secure.wikimedia.org/wikipedia/en/wiki/Podest%C3%A0). If you convene a conference, generate lots of new coverage, new research papers, etc. then why wouldn’t WP cover podesta, especially if it was on the edge to begin with? I don’t see any issue at all there. Every notable idea or historical event or person was non-notable at some point.

    Someone might object, ‘But this feels like “gaming” Wikipedia – cynically manipulating what it will and will not include – manufacturing Notability.’

    But you could manufacture notability just as well by going to your local public space and shooting 30 people to death, but no one seriously objects to articles on Cho or Hasan. Manufacturing notability is what happens as time moves on; as the expression goes, ‘shit happens’. (And besides, if you are devoting your resources to publicizing and researching one Answer, you are thereby not doing so for all the other possible Answers.)

  27. Hello Mr Nielsen,

    my name is Christian Bahls, I am a member of MOGiS e.V.
    The deletion of the MOGiS article in the German Wikipedia
    led to the discussion about relevancy in Germany.

    By coincidence, I have to admit, I am also a mathematician.
    And I have to admit I laughed my head off 🙂

    And you know what ..
    .. we are actually doing the induction step of your proof 🙂
    http://news.google.com/news?q=mogis%20wikipedia%20-mike
    http://www.google.com/search?hl=de&q=mogis+wikipedia+-mike&btnG=Suche&lr=&aq=f&oq=
    http://de.wikipedia.org/wiki/Benutzer:Hei_ber/Mogis

    yours
    Christian Bahls

  28. The paradox breaks down because unlike numbers, notability has a non-zero cost. Here’s how:

    Suppose X is the most notable non-notable topic (assuming we can designate one non-notable topic as such)

    Is X notable merely by being such? No – because to be notable, it isn’t enough to merely have a given property. It is necessary to have obtained attention from the world (“be deemed worthy of notice”) for having that property. And to have that evidenced by being written about substantively, in “reliable sources”.

    So just having the property isn’t enough. On to the fallback case “…suppose I went to great trouble to convene a conference series on The Answer, was able to convince leading logicians and philosophers to take part, writing papers about The Answer, convinced a prestigious journal to publish the proceedings, arranged media coverage, and so on. The Answer would then certainly have exceeded Wikipedia’s notability guidelines…”

    The problem here is, yes you could. But not for each and every topic, because there is a practical limit to conference calling, publicity generation, and interest the world is likely to pay, and ultimately time, people and resources, that will limit your ability. Eventually and probably fairly quickly, you will find yourself hitting topics that are “the most notable non-notable” for which you will _not_ have been able to obtain world attention.

    It all comes down to the fact that, unlike numbers, notability depends upon attention paid to a property, not merely having the property alone… and attention is limited and hard to obtain.

    QED.

  29. Cute.
    but….
    Even if all this were true – after a few iterations – surely quickly enough – the next article about a subject because it is the next least interesting will be far less interesting than some of the other dropped articles and will stop adding (possibly reduce) interest in that subject.

    Even if the above is false and the logic of your argument was perfect – this still couldn’t hold since Wikipedia is a noisy system…
    Wikipedia entries have some noise in them for many reasons (e.g. outdated information, inaccuracies, marketing professionals, or deliberate sabotage etc.).
    There are mechanisms aimed at reducing the magnitude of this noise – moderation is one but the more significant (I think) force is that the whole of internet is allowed to correct (and insert) the errors/inaccuracies.
    Adding lots of entries will greatly reduce the reader/entry ratio of wikipedia – thus reducing the potential for error correction and increasing the noise in the system – possibly eventually rendering it useless.
    There are a few additional interesting factors but I have a limited attention span and have to wrap up – so my final point is:
    consider pi (you know 3.141592653589… with the circles and all).
    If articles ARE of diminishing interest – in this illustration diminishing magnitude of least significant digit of pi – then for each purpose the relevant accuracy is limited and for basic research, everyday purposes and curiosity – 3.14159 is probably quite enough and you don’t have to go any further… it’ll only make everything cumbersome.
