Open Architecture Democracy

by Michael Nielsen on April 17, 2010

The singer Avril Lavigne’s third hit was a ballad titled “I’m With You”. Let me pose what might seem a peculiar question: should the second word in her song title – “With” – be capitalized or uncapitalized? This seems a matter of small moment, but to some people it matters a great deal. In 2005 an edit war broke out on Wikipedia over whether “With” should be capitalized or not. The discussion drew in a dozen people, took more than a year to play out, and involved 4,000 words of discussion. During that time the page oscillated madly back and forth between capitalizing and not capitalizing “With”.

This type of conflict is not uncommon on Wikipedia. Other matters discussed at great length in similar edit wars include the true diameter of the Death Star in Return of the Jedi – is it 120, 160 or 900 kilometers in diameter? When one says that U2 “are a band” should that really be “U2 is a band”? Should the page for “Iron Maiden” point by default to the band or to the instrument of torture? Is Pluto really a planet? And so on.

Don’t get me wrong. Wikipedia works remarkably well, but the cost in resolving these minor issues can be very high. Let me describe for you an open source collaboration where problems like this don’t occur. It’s a programming competition run by a company called Mathworks. Twice a year since 1999 Mathworks has run a week-long competition involving more than one hundred programmers from all over the world. At the start of the week a programming problem is posed. A typical problem might be something like the travelling salesman problem – given a list of cities, find the shortest tour that lets you visit all of those cities. The competitors don’t just submit programs at the end of the week, they can (and do) submit programs all through the week. They do this because each submission is immediately and automatically scored. This is done by running the program on some secret test inputs that are known only to the competition organizers. So, for example, the organizers might run the program on all the capital cities of the countries in Europe. The score reflects both how quickly the program runs, and how short a tour of the cities it finds. The score is then posted to a leaderboard. Entries come in over the whole week because kudos and occasional prizes go to people at the top of the leaderboard.
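To make the scoring loop concrete, here is a minimal Python sketch of how an entry might be scored on hidden inputs and ranked. It is an illustration only: the function names, the `time_weight` parameter, and the way tour length and running time are combined into one number are all assumptions, since the actual contest formula isn't described here.

```python
import math
import time

def tour_length(cities, order):
    """Length of the closed tour that visits `cities` in the given order."""
    return sum(
        math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )

def score_entry(solver, secret_inputs, time_weight=1.0):
    """Lower is better. Assumed formula: tour length plus a running-time penalty."""
    total = 0.0
    for cities in secret_inputs:
        start = time.perf_counter()
        order = solver(cities)
        elapsed = time.perf_counter() - start
        # Reject entries that skip or repeat a city.
        if sorted(order) != list(range(len(cities))):
            return math.inf
        total += tour_length(cities, order) + time_weight * elapsed
    return total

def nearest_neighbour(cities):
    """A simple baseline entry: always travel to the closest unvisited city."""
    if not cities:
        return []
    unvisited = set(range(1, len(cities)))
    order = [0]
    while unvisited:
        last = cities[order[-1]]
        nxt = min(unvisited, key=lambda i: math.dist(last, cities[i]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order

def leaderboard(entries, secret_inputs, time_weight=1.0):
    """Rank named entries by score, best (lowest) first."""
    scored = [
        (score_entry(solver, secret_inputs, time_weight), name)
        for name, solver in entries.items()
    ]
    return sorted(scored)
```

Because the score is computed the same way for every entry, anyone can tell at a glance whether a tweak to someone else's code was an improvement.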

What makes this a collaboration is that programs submitted to the competition are open. Once you submit your program anyone else can come along and simply download the code you’ve just submitted, tweak a single line, and resubmit it as their own. The result is a spectacular free-for-all. Contestants are constantly “stealing” one another’s code, making small tweaks that let them leapfrog to the top of the leaderboard. Some of the contestants get hooked by the instant feedback, and work all week long. The result is that the winning entry is often fantastically good. After the first contest, in 1999, the contest co-ordinator, Ned Gulley, said: “no single person on the planet could have written such an optimized algorithm. Yet it appeared at the end of the contest, sculpted out of thin air by people from around the world, most of whom had never met before.”

Both Wikipedia and the Mathworks competition use open source patterns of development, but the difference is striking. In the Mathworks competition there is an absolute, objective measure of success that’s immediately available – the score. The score acts as a signal telling every competitor where the best ideas are. This helps the community aggregate all the best ideas into a fantastic final product.

In Wikipedia, no such objective signal of quality is available. What allows Wikipedia to function is that on most issues of contention – like whether “With” should be capitalized – there’s only a small community of interest. A treaty can be beaten out by members of that community that allows them to reach consensus and move forward. Constructing such treaties takes tremendous time and energy, and sometimes devolves into neverending flame wars, but most of the time it works okay. But while this kind of treaty-making might scale to tens or even hundreds of people, we don’t yet know how to make it scale to thousands. Agreement doesn’t scale.

Many of the crucial problems of governance have large communities of interest, and it can be very difficult to get even two people to agree on tiny points of fact, much less values. As a result, we can’t simply open source policy documents in a location where they can be edited by millions of people. But, purely as a thought experiment, imagine you had a way of automatically scoring policy proposals for their social utility. You really could set up a Policyworks where millions of people could help rewrite policy, integrating the best ideas from an extraordinarily cognitively diverse group of people.

The question I have is this: how can we develop tools that let us scale such a process to thousands or even millions of people? How can we get the full benefit of cognitive diversity in problem-solving, without reaching deadlock? Are there clever new ways we can devise for signalling quality in the face of incomplete or uncertain information? We know some things about how to do this in small groups: it’s the art of good facilitation and good counselling. Is it possible to develop scalable mechanisms of agreement so we can open source key problems of governance?

Let me conclude by floating a brief, speculative idea for a Policyworks. In the one minute I have left there’s not time to even begin discussing the problems with the idea, let alone potential solutions. But hopefully it contains the kernel of something interesting. The idea is to allow open editing of policy documents, in much the same way the Mathworks competition allows open editing of computer programs. But each time you make an edit, it’s sent to a randomly selected jury of your peers – say 50 of them. They’re invited to score your contribution, and perhaps offer feedback. They don’t all need to score it – just a few (say 3) is enough to start getting useful information about whether your contribution is an improvement or not. And, perhaps with some tweaking to prevent abuse, and to help ensure fair scoring, such a score might be used as a reliable way of signalling quality in the face of incomplete or uncertain information. My suspicion is that – as others have said of Wikipedia – this may be one of those ideas that works better in practice than it does in theory.
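As a rough illustration of the jury mechanism, the sketch below draws a random jury for an edit and turns whatever scores come back into a verdict. Everything beyond the jury of 50 and the quorum of 3 is an assumption made for the example: the function names, the scoring scale (positive means "improvement", negative means "worse"), and the use of the median as the aggregate are hypothetical choices, not part of the proposal itself.

```python
import random
import statistics

def select_jury(peers, author, size=50, rng=random):
    """Pick up to `size` distinct random peers, never including the author."""
    candidates = [p for p in peers if p != author]
    return rng.sample(candidates, min(size, len(candidates)))

def aggregate(scores, quorum=3):
    """Return the median score once at least `quorum` jurors have responded.

    Returns None while there are too few responses to say anything useful.
    The median is an assumed choice; it limits the pull of a single extreme vote.
    """
    if len(scores) < quorum:
        return None
    return statistics.median(scores)

def accept_edit(scores, quorum=3, threshold=0):
    """Treat the edit as an improvement if the aggregate exceeds `threshold`."""
    verdict = aggregate(scores, quorum)
    return verdict is not None and verdict > threshold
```

A call such as `accept_edit([1, 1, -1])` accepts the edit because the median of the three responses is positive, while `accept_edit([1, -1])` does not, because the quorum has not been reached. Preventing abuse and ensuring fair scoring would need more than this.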

This post is based on some brief remarks I made about open architecture democracy at the beginning of a panel on the subject, moderated by Tad Homer-Dixon, and with co-panelists Hassan Masum and Mark Tovey. One day, I hope to expand this into a much more thorough treatment.

  1. JLD

    Agreement doesn’t scale.

    It is very fortunate that agreement doesn’t scale because when it does it’s for the worse.
    The lowest common denominator, which brings “entropic death”, nothing moves anymore.

  2. ripero

    Just fyi, another long discussion on Wikipedia is whether Pau Casals should redirect to Pablo Casals or the reverse.

  3. Alessandro

    The MathWorks competition reminds me of the Netflix Prize. In the latter, collaboration was not explicit in the rules, but in fact it did help the winning team. In particular, see Simon Funk’s post “Try This at Home” in the forum of the competition.
    Apropos of AT&T, did you know that OEIS is going to be wikized soon?

  4. Alessandro – Thanks for the tip about the OEIS. Very interesting. I guess the OEIS has always been sort of like a wiki, in a very slow and manual way.

  5. Alessandro

    sorry, I forgot to include the reference:

  6. Michael, thank you for the terrifically interesting topic … and thank you Alessandro, for the pointer to OEIS.

    I had not visited OEIS in several years, and I found they had added an entrancing five-minute animated movie of sequences (with music). This movie brings home very forcibly the (necessarily) arbitrary nature of the ordering of the sequences in the OEIS database.

    Which brings me to the point—if there is no one logically “natural” ordering of the OEIS sequences, how can there be a logically “natural” ordering of policies? Or a natural ordering of narratives (which are the foundation of policy)?

    The sole good answer that I know is Andy Grove’s … “Let chaos reign, then rein in chaos” … prudent compromise, in other words.

    The OEIS has been outstandingly successful at this difficult balancing act … we can hope that our planet does similarly well at reining in the chaos of the coming century …

    All this being immensely fun to contemplate … while watching Tony Noe’s wonderful movie of the OEIS sequences!

  7. Ricardo Pietrobon

    Michael, interesting point, but not sure if technology is the answer or the main problem. The problem, it seems, is agreeing on what the goal should be. And for that, there is no way science or technology can help. For example, if we agree that reaching consensus is a way to get to a goal, this is not a final solution but an assumption that consensus is the answer. In multiple circumstances, consensus might not be the best answer.

  8. If you base quality on the opinions of a random sampling of other users, you’re optimizing for what the average reader wants to hear. That’s not necessarily the truth. I know plenty of ideologues who refuse to believe the leader of another political party could possibly have done anything good (or their own, anything wrong). Or even just choosing between a complicated, but precisely true sentence or a new version that reads better but is a generalization of the issue: for an encyclopedia you want the former, but will most people grok the subtle details, or just naturally prefer the more readable sentence?

    I’m more inclined to trust an active debate than a poll for finding the truth. Yes, it does mean people get caught up in trivialities. But that just appears to be a universal truth: things like the capitalization of ‘with’ have no objective right answer (making debate hard), and are simple enough that anyone can express a cogent opinion. A better solution might just be to provide easy access to popular dissenting opinion: maybe a mouseover of the ‘With’ that pops up a summary of the argument for why the other capitalization should have been used. That way the main page can present the ‘truth’ Wikipedia has arrived at, but provide ample opportunity for the user to access less popular views of the issue.

  9. JLD

    interesting point, but not sure if technology is the answer or the main problem.

    Yes, this is the flawed assumption: that there exists a unique “right” solution to any political problem.
    I would rather lean toward H. L. Mencken’s:
    “For every complex problem there is an answer that is clear, simple, and wrong.”

  10. Paul, thanks for your comment. You begin: “If you base quality on the opinions of a random sampling of other users, you’re optimizing for what the average reader wants to hear.”

    I perhaps wasn’t clear enough on this point. The point is that this is an interactive process, and so will actually modify what readers want to hear. Imagine there’s an education policy that’s being edited. My home state (Queensland, Australia) has some extremely remote regions with unusual local issues that are sometimes dealt with inadequately by policy. Someone injecting local knowledge won’t be saying what people want to hear, but I’ve little doubt they’d be well received. (“Oh, we didn’t realize that some children have to take time off to help during the cattle round up.”)

    I.e., I’m not proposing a raw poll, but rather something that would look rather more like an active discussion.

  11. “Yes, this is the flawed assumption: that there exists a unique “right” solution to any political problem.”

    Nowhere do I make such an assumption. I only speculate that it might be possible to systematically do a little (maybe a lot) better than current decision-making. I presume you don’t deny that’s possible. (If you deny any notion of better or worse decisions, then you need to deny that countries like Denmark and Sweden have systematically better decision-making than countries like North Korea and Zimbabwe. I presume that’s not the case.)

  12. Paul – the idea of a presentation of alternative points of view is a nice one. There has been work done on colourized versions of Wikipedia that let people see where particular issues are contentious, i.e., they colour hotspots. Variations on this idea might be powerful in the policy context. On a related note, it might also be possible to integrate ideas like distributed version control / git / github, which make it easy to see when people are exploring different lines of thinking on an issue. git / github does a very nice job of this in open source software, presenting graphs which can show how a project has evolved in different directions.

  13. Isn’t it characteristic of many (most?) human activities, that the process is just as important as the result?

    Otherwise, novels and plays would be *much* shorter … and also, much sadder. The following are famous stories told from the “end result” point-of-view.

    (1) The whale is OK, but the ship sinks.
    (2) Jim is freed; Huck and Tom are glad about it.
    (3) Romeo and Juliet die from a misunderstanding.
    (4) The Standard Model works, but no-one’s happy with it.

    From this point of view, “The Future of Science” is *really* all about whether science is/becomes a community activity that (say) 10^7–10^8 human beings can do simultaneously, enjoyably, and productively.

    If we’re not thinking on that scale (or larger) about the future of science—and about the future of *all* major professions—in the 21st century, then aren’t we thinking too small?

  14. Very enjoyable read, including the comments.

    I don’t really have anything to add. The MathWorks contest is fascinating though, as is the idea of trying to improve our group decision-making process.

    Off topic, anyone think OEIS would accept the sequence representing the order of sequences Tony Noe used in his OEIS video? 🙂 We could call it “the one sequence to rule them all.”

  15. John – you might enjoy checking out Galaxy Zoo, which currently has about 10^{6.5} participants, and rising rapidly. They’re doing some extremely interesting science.

  16. Michael, I checked out Galaxy Zoo … it has *both* interesting science *and* an interesting community (these two being equally interesting to me). Both the science and the community are evolving rapidly (of course) … it will be plenty interesting to watch their future progress.

    Perhaps we can expect similar progress at the microscale? Here I have in mind the concluding paragraphs of Ed Wilson’s Naturalist:

    “If I could do it all over again, and relive my vision in the twenty-first century, I would be a microbial ecologist. Ten billion bacteria live in a gram of ordinary soil, a mere pinch held between thumb and forefinger. They represent thousands of species, almost none of which are known to science.”

    “Into that world I would go with the aid of modern microscopy and molecular analysis. I would cut my way through clonal forests sprawled across grains of sand, travel in an imagined submarine through drops of water proportionally the size of lakes, and track predators and prey in order to discover new life ways and alien food webs. All this, and I need venture no more than ten paces outside my laboratory building. The jaguars, ants, and orchids would still occupy distant forests in all their splendor, but now they would be joined by an even stranger and vastly more complex living world virtually without end.”

    “For one more turn around I would keep alive the little boy of Paradise Beach who found wonder in a scyphozoan jellyfish and a barely glimpsed monster of the deep.”
    Heck … galaxies … microbes … both are evolved systems!

  17. By the way (as a follow-up to the above), I see that Ed Wilson has embedded the above ideas in a *novel* titled (what else?) Ant Hill. This is good news for anyone who (like me) collects fictional narratives by scientists (Wilson/Szilard), mathematicians (Wiener), and engineers (von Braun) … because yes, they all wrote fiction … and yes, this fiction illuminates their professional work.

    Now, it is true that the future of science will be largely determined by physical law, mathematical logic, and historical circumstance. But “largely determined” is a very different matter from “wholly determined”, and surely it is true that this difference resides very largely in the scientific narratives we choose to tell.

    That is why I’ll definitely be buying Wilson’s novel (which I just today heard about) … it’s certain to be largely about Wilson’s vision of the future of science … and (to judge by the reviews) Wilson communicates that vision in his novel at least as effectively as, and perhaps more broadly than, he does in his traditional writings.

  18. Two other examples of community directed effort with the nice property of having objective success/failure are the Polymath projects (community mathematical proofs of unsolved problems) and Kasparov vs. The World (community chess against a world chess champion). These tended to involve a single leader or small group of leaders directing a large group of people in their exploration of possibilities. In math, different groups could pursue proof strategies and return with new insights into the problem to share with the group. In chess, different groups could think deeply about the impact of individual moves, and come back with promising leads or warnings about long term problems with a move. Wikipedia operates like this in a global sense of division of labor between articles, and to some degree within an article with people focusing on individual sections.

    This community exploration of a very large state space (be it chess board or galactic maps) seems to work well. What’s less clear is how to approach cases where you can’t easily divide the problem, like policy changes. On rereading the article I understand better where you’re coming from, trying to promote discussion among smaller groups on the way to making a major decision like the capitalization of ‘with’. I’m reminded of studies that found that discussions where everyone has the same ideology lead the participants to hold more extreme views, while diverse groups lead to more moderate views. An interesting structure might be a ‘tournament debate’ around the issue: Groups of say, 7 people discuss an issue and try to come to a consensus. Then they elect the person who best represents and articulates their views to go on to another round of debates. As you move up towards a top group, you’re hopefully getting the most persuasive speakers on the topic, and at each level those speakers are having to refine and defend their ideas, and picking up good data and ideas from the members of each group.

  19. Paul – I’m not sure policy is going to be all that much harder to modularize than (say) a lot of open source projects. I certainly agree that it’ll require a great deal of work, but that’s not a priori a killer. One of the things I find interesting about many open source communities is how much time they spend discussing exactly this question (how best to divide problems up).

    On polarization and the limits of deliberation: I definitely recommend Cass Sunstein’s book “Infotopia” (which I’ve written about here before). It reviews a great deal of the literature on the subject. Scott Page’s work on cognitive diversity is also interesting.

  20. Carrie Ashendel

    Well said. If you haven’t seen it, MixedInk uses just this idea to help people collaboratively write mission statements, speeches, policies, recipes, etc. It seems fairly inactive in recent years, but I would love, love, love to see adaptations of the idea along the lines of what you’re talking about.

    They have a pretty good video about how it works on their site.

  21. Thanks for the kind words, Carrie, and also for the link, which I hadn’t seen, and which is extremely interesting. Some of the sample documents they come up with are pretty good!

  22. Just to keep this (wonderful) topic on “simmer” … my son and I this morning exchanged URLs for three YouTube videos that (loosely) bear upon the topic of “The Future of Science”:

    (1) Technology ain’t the whole story: Space monkey- it’s not a planet, it’s our home

    (2) Mathematical sense may not make human sense: Weird Al Yankovic – Bob

    (3) Natural orderings are not unique: The OEIS Movie

    Here the point is that the “Future of Science” now being born may well be wonderful — IMHO *is* very likely to be more wonderful than we can easily imagine — and yet that scientific future perhaps will *not* be technology-driven, mathematically logical, or well-ordered.

  23. JLD

    yet that scientific future perhaps will *not* be technology-driven, mathematically logical, or well-ordered.

    Nor too cloying either I hope! 😀

  24. Is there no room in this discussion for reputation?

    Open source projects usually work in a hierarchical way. There are two things going on here – one is just authority – those at the top have the power to direct the project and this is kept in check by the fact that everyone has the right to try their luck at forking the project.

    But there’s something else going on. The ethos and structure of open source projects tend to ensure that those in authority are very well thought of by others in the project and outside it. So you have two systems mutually reinforcing each other.

    Now straight hierarchy is never going to be accepted or acceptable in an open source policy community, or if it is, it won’t be acceptable outside the community. So unless it’s an ‘official’ community – the bureaucracy or a parliamentary political party for instance – its policy work won’t count for much in the community.

    But might not it be possible for people in communities to agree to some extent on who has a generally good reputation in the community for knowing what they were talking about, for being open minded and rigorous in discussion etc? Advogato operates a ‘web-of-trust’ reputation system in which users receive privileges based on who endorses them – a bit like a Google ranking for people.

    If one used something like that then we would move away from a pure participatory democracy towards something more structured, and a little more like representative democracy. Those with a higher reputation get more say in decisions (perhaps more votes, though one might have more complex rules than that).

    Thus where we have someone pointing out that students in remote Qld may need time off to deal with their cattle, the person pointing out the issue may not have a high score entitling them to lots of votes, but there would be some ‘authority’ in the community which could highlight their contribution and ensure it got appropriate standing.

    It’s not clear how stable such reputation could be. Where values are important, where one can’t fall back on the NPOV, such a community might fork into two or three broadly recognisable ideological communities – classical liberal, conservative, social democrat – with each community having a relatively stable system of reputation. Even if they’re not a unitary source of the ‘right’ solution, such communities might do a great deal of useful policy work.

  25. Nicholas,

    I certainly think such ideas are fascinating and worth exploring.

    One sobering issue is that in most online communities at present, such reputation systems don’t scale very well and are easily gamed. Things like the Digg and Slashdot karma systems really aren’t very good, beyond a certain community size, in part because they’re easily gamed. It’s an interesting challenge to devise a system that does scale.

    A more successful system, albeit, for a problem different than reputation, is Google’s system for ranking webpages. Despite the existence of a massive industry of people trying to game the system, Google still manages to do pretty well most of the time at sorting wheat from chaff. But it does have the problem that (a) the algorithm isn’t public; and (b) it fails pretty badly some of the time.

    A point Jimmy Wales has made is that reputation / karma measures may work to reduce community spirit. He makes the analogy to going to work with a number hung around your neck, and asks what effect that would have on team spirit. (Not good, one presumes). I’m not sure I quite agree with this, but take his point: quantifying reputation brings with it many problems of its own. The regular “karma fights” that break out in online forums (“why’d I get modded down!”) are an example of this.

    Rereading the above, it’s all rather pessimistic. I should say that I quite agree with you that this is a very interesting problem to be thinking about, and potentially very valuable in open government.

  26. Hmm, well I’m not too sure about Wales’ view either. Does a surgeon gain or lose intrinsic motivation from the fact that his reputation is on the line? Adam Smith – an authority on Web 2.0 by the way 😉 – thought it was the ultimate motivator – and that people wanted money, not for the material things it bought, but because it bunked them up the pecking order.

    Is it to supply the necessities of nature? The wages of the meanest labourer can supply them. To be observed, to be attended to, to be taken notice of with sympathy, complacency, and approbation, are all the advantages which we can propose to derive from it. It is the vanity, not the ease or the pleasure, which interests us.

    So yes, fights break out, but that’s the corollary of the fact that they care about these things – which is why they bother with the exercise in the first place. One might go further and say that the ‘spirit’ of a project is a major determinant of its success, and that spirit, that culture, is typically set by the founders of it and maintained by its leaders. It can be a very participatorily democratic one if that suits the project but if it doesn’t it can’t. The culture and leadership have to be different. Different folks, different strokes; different asks, different tasks.

  27. Nicholas,

    It’s possible we largely agree – certainly I agree entirely with your last four sentences. But just to clarify what I was saying: I believe a competitive labour market is usually a good thing, in the large. But I also believe that within teams where the focus is intended to be constructive collaboration, competition for status often destroys co-operation. People get more caught up in getting the credit for every tiny scrap of contribution than they do in actually getting the job done in the best possible way. In the best collaborative experiences I’ve ever had all the people in the group to some extent (usually only partial) stopped identifying “good” with their individual good, and started identifying it with the group good. Note that I’m certainly not advocating a collectivist approach here, wherein a central authority mandates that people put the group interest first. That never works, and is a recipe for disaster. I’m talking about a situation wherein people independently come to the decision to put the group interest first. When that happens, groups can accomplish remarkable things.

  28. Yes, agreed – we’re both after the ‘sweet spot’. At least for the kinds of projects I’m thinking of, I think my spot is at least sweeter than the alternative where your voice counts the same no matter what your reputation. I think some reputational system is OK so long as it is perceived as pretty fair and so long as it doesn’t divert people too heavily from the quality of the work – as you say. In that case it can be an additional motive to do a good job – along with intrinsic motivation there’s pride in getting a good reputation. The rub is that it can also interfere with intrinsic and group oriented motivation as you say . . . .

