April 2010 – Michael Nielsen

The singer Avril Lavigne’s third hit was a ballad titled “I’m With You”. Let me pose what might seem a peculiar question: should the second word in her song title – “With” – be capitalized or uncapitalized? This seems a matter of small moment, but to some people it matters a great deal. In 2005 an edit war broke out on Wikipedia over whether “With” should be capitalized or not. The discussion drew in a dozen people, took more than a year to play out, and involved 4,000 words of discussion. During that time the page oscillated madly back and forth between capitalizing and not capitalizing “With”.

This type of conflict is not uncommon on Wikipedia. Other matters discussed at great length in similar edit wars include the true diameter of the Death Star in Return of the Jedi – is it 120, 160 or 900 kilometers in diameter? When one says that U2 “are a band” should that really be “U2 is a band”? Should the page for “Iron Maiden” point by default to the band or to the instrument of torture? Is Pluto really a planet? And so on.

Don’t get me wrong. Wikipedia works remarkably well, but the cost in resolving these minor issues can be very high. Let me describe for you an open source collaboration where problems like this don’t occur. It’s a programming competition run by a company called Mathworks. Twice a year every year since 1999 Mathworks has run a week-long competition involving more than one hundred programmers from all over the world. At the start of the week a programming problem is posed. A typical problem might be something like the travelling salesman problem – given a list of cities, find the shortest tour that lets you visit all of those cities. The competitors don’t just submit programs at the end of the week, they can (and do) submit programs all through the week. The reason they do this is because when they submit their program it’s immediately and automatically scored. This is done by running the program on some secret test inputs that are known only to the competition organizers. So, for example, the organizers might run the program on all the capital cities of the countries in Europe. The score reflects both how quickly the program runs, and how short a tour of the cities it finds. The score is then posted to a leaderboard. Entries come in over the whole week because kudos and occasional prizes go to people at the top of the leaderboard.

What makes this a collaboration is that programs submitted to the competition are open. Once you submit your program anyone else can come along and simply download the code you’ve just submitted, tweak a single line, and resumbit it as their own. The result is a spectacular free-for-all. Contestants are constantly “stealing” one another’s code, making small tweaks that let them leapfrog to the top of the leaderboard. Some of the contestants get hooked by the instant feedback, and work all week long. The result is that the winning entry is often fantastically good. After the first contest, in 1999, the contest co-ordinator, Ned Gulley, said: “no single person on the planet could have written such an optimized algorithm. Yet it appeared at the end of the contest, sculpted out of thin air by people from around the world, most of whom had never met before.”

Both Wikipedia and the Mathworks competition use open source patterns of development, but the difference is striking. In the Mathworks competition there is an absolute, objective measure of success that’s immediately available – the score. The score acts as a signal telling every competitor where the best ideas are. This helps the community aggregate all the best ideas into a fantastic final product.

In Wikipedia, no such objective signal of quality is available. What allows Wikipedia to function is that on most issues of contention – like whether “With” should be capitalized – there’s only a small community of interest. A treaty can be beaten out by members of that community that allows them to reach consensus and move forward. Constructing such treaties takes tremendous time and energy, and sometimes devolves into neverending flame wars, but most of the time it works okay. But while this kind of treaty-making might scale to tens or even hundreds of people, we don’t yet know how to make it scale to thousands. Agreement doesn’t scale.

Many of the crucial problems of governance have large communities of interest, and it can be very difficult to get even two people to agree on tiny points of fact, much less values. As a result, we can’t simply open source policy documents in a location where they can be edited by millions of people. But, purely as a thought experiment, imagine you had a way of automatically scoring policy proposals for their social utility. You really could set up a Policyworks where millions of people could help rewrite policy, integrating the best ideas from an extraordinarily cognitively diverse group of people.

The question I have is how we can develop tools that let us scale such a process to thousands or even millions of people? How can we get the full benefit of cognitive diversity in problem-solving, without reaching deadlock? Are there clever new ways we can devise for signalling quality in the face of incomplete or uncertain information? We know some things about how to do this in small groups: it’s the art of good facilitation and good counselling. Is it possible to develop scalable mechanisms of agreement so we can open source key problems of governance?

Let me conclude by floating a brief, speculative idea for a Policyworks. In the one minute I have left there’s not time to even begin discussing the problems with the idea, let alone potential solutions. But hopefully it contains the kernel of something interesting. The idea is to allow open editing of policy documents, in much the same way the Mathworks competition allows open editing of computer programs. But each time you make an edit, it’s sent to a randomly selected jury of your peers – say 50 of them. They’re invited to score your contribution, and perhaps offer feedback. They don’t all need to score it – just a few (say 3) is enough to start getting useful information about whether your contribution is an improvement or not. And, perhaps with some tweaking to prevent abuse, and to help ensure fair scoring, such a score might be used as a reliable way of signalling quality in the face of incomplete or uncertain information. My suspicion is that – as others have said of Wikipedia – this may be one of those ideas that works better in practice than it does in theory.

You can subscribe to my blog here.

This post is based on some brief remarks I made about open architecture democracy at the beginning of a panel on the subject, moderated by Tad Homer-Dixon, and with co-panelists Hassan Masum and Mark Tovey. One day, I hope to expand this into a much more thorough treatment.

Month: April 2010

Open Architecture Democracy