On scaling up the Polymath project
Tim Gowers has an interesting post on the problem of scaling up the Polymath project to involve more contributors. Here are a few comments on the start of Tim’s post. I’ll return to the remainder of the post tomorrow:
As I have already commented, the outcome of the Polymath experiment differed in one important respect from what I had envisaged: though it was larger than most mathematical collaborations, it could not really be described as massive. However, I haven’t given up all hope of a much larger collaboration, and in this post I want to think about ways that that might be achieved.
As discussed in my earlier post, I think part of the reason for the limited size was the short time-frame of the project. The history of open source software suggests that building a large community usually takes considerably more time than Polymath had available – Polymath’s community of contributors likely grew faster than open source projects like Linux and Wikipedia. In that sense, Polymath’s limited scale may have been in part a consequence of its own rapid success.
With that said, it’s not clear that the Polymath community could have scaled up much further, even had it taken much longer for the problem to be solved, without significant changes to the collaborative design. The trouble with scaling conversation is that as the number of people participating goes up, the effort required to track the conversation also goes up. The result is that beyond a certain point, participants are no longer working on the problem at hand, but instead simply trying to follow the conversation (c.f. Brooks’ law). My guess is that Polymath was near that limit, and, crucially, was beyond that limit for some people who would otherwise like to have been involved. The only way to avoid this problem is to introduce new social and technical means for structuring the conversation, limiting the amount of attention participants need to pay to each other, and so increasing the scale at which conversation can take place. The trick is to do this without simultaneously destroying the effectiveness of the medium as a means of problem-solving.
(As an aside, it’s interesting to think about what properties of a technological platform make it easy to rapidly assemble and grow communities. I’ve noticed, for example, that the communities in FriendFeed rooms can grow incredibly rapidly, under the right circumstances, and this growth seems to be a result of some very particular and clever features of the way information is propagated in FriendFeed. But that’s a discussion for another day.)
First, let me say what I think is the main rather general reason for the failure of Polymath1 to be genuinely massive. I had hoped that it would be possible for many people to make small contributions, but what I had not properly thought through was the fact that even to make a small contribution one must understand the big picture. Or so it seems: that is a question I would like to consider here.
One thing that is undeniable is that it was necessary to have a good grasp of the big picture to contribute to Polymath1. But was that an essential aspect of any large mathematical collaboration, or was it just a consequence of the particular way that Polymath1 was organized? To make this question more precise, I would like to make a comparison with the production of open-source software (which was of course one of the inspirations for the Polymath idea). There, it seems, it is possible to have a large-scale collaboration in which many of the collaborators work on very small bits of code that get absorbed into a much larger piece of software. Now it has often struck me that producing an elaborate mathematical proof is rather like producing a complex piece of software (not that I have any experience of the latter): in both cases there is a clearly defined goal (in one case, to prove a theorem, and in the other, to produce a program that will perform a certain task); in both cases this is achieved by means of a sequence of strings written in a formal language (steps of the proof, or lines of code) that have to obey certain rules; in both cases the main product often splits into smaller parts (lemmas, subroutines) that can be treated as black boxes, and so on.
This makes me want to ask what it is that the producers of open software do that we did not manage to do.
Here’s two immediate thoughts inspired by that question, both of which are ways large open-source projects (a) reduce barriers to entry, and (b) limit the amount of attention required from potential contributors.
Clear separation of what is known from how it is known: In some sense, to get involved in an open source project, all you need do is understand the current source code. (In many projects, the code is modular, which means you may only need to understand a small part of the code.) You don’t need to understand all the previous versions of the code, or read through all the previous discussion that led to those versions. By contrast, it was, I think, somewhat difficult to follow the Polymath project without also following a considerable fraction of the discussion along the way.
Bugtracking: One of the common answers to the question “How can I get involved in open source?” is “Submit a bug report to your favourite open source project’s bugtracking system”. The next step up the ladder is: “Fix a bug in the bugtracking system”. Bugtracking systems are a great way of providing an entry point for new contributors, because they narrow the scope of problems down, limiting what a new contributor needs to learn, and how many other contributors they need to pay attention to. Of course, many bugs will be beyond a beginning contributor to fix. But it’s easy to browse through the bug database to find something within your ability to solve. While I don’t think bugtracking is quite the right model for doing mathematics, it’s possible that a similar system for managing problems of limited scope may help in projects like Polymath.