How changing the technology of collaboration can change the nature of collaboration

Update: Ilya (who made the video) reports in comments that some fraction of the effect described below is an artifact. It’s hard to say how much. Here’s Ilya’s comment:

Michael, apparently one of the reasons you see the explosion in commits is because Git correctly attributes the changeset to the author. In [Subversion] days, the patch would be submitted by some author, but then one of the core team members would merge it in (under his/her name). Basically, same workflow with Git, but with proper attribution.

Having said that, I think seeing other people commit and get their changes merged in also encourages other developers to join in on the fray!

So it may or may not be that what’s said in the post is true. But the video shown isn’t evidence for it. A pity. It’d be nice to have a clearly visualizable demonstration of this general phenomenon.
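
To make the attribution point concrete, here is a minimal sketch in Python. The commit records and names (`alice`, `bob`, `carol`, `core_dev`) are made up for illustration and are not taken from the Rails history. It relies on one fact: Git stores an author and a committer separately for each commit, whereas a Subversion revision records a single author, namely whoever committed it. The sketch counts how many distinct contributors show up under each way of assigning credit.

```python
from collections import Counter

# Hypothetical commit records as (author, committer) pairs.
# These are invented for illustration; they are not real Rails data.
commits = [
    ("alice", "core_dev"),
    ("bob", "core_dev"),
    ("carol", "core_dev"),
    ("core_dev", "core_dev"),
    ("alice", "core_dev"),
]


def contributors_by(records, field):
    """Count commits per person, crediting either the 'author' or the 'committer'."""
    index = {"author": 0, "committer": 1}[field]
    return Counter(record[index] for record in records)


# Credit goes to whoever merged the patch (the Subversion-era picture).
by_committer = contributors_by(commits, "committer")
# Credit goes to whoever wrote the patch (the Git picture).
by_author = contributors_by(commits, "author")

print(len(by_committer), "visible contributor(s) when credit goes to whoever merged")
print(len(by_author), "visible contributor(s) when credit goes to whoever wrote the patch")
```

The same five commits look like the work of one person under the first scheme and of four people under the second, which is the kind of apparent jump in contributors that a change in attribution alone could produce.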

Ilya Grigorik recently pointed me to a great example which shows vividly how even relatively modest changes in the technology of collaboration can change the nature of collaboration. The example is from an open source project called Ruby on Rails. Ruby on Rails is a web development framework famous within the web development community – it’s been used to develop well-known sites such as Twitter – but, unlike, say, Linux, it’s largely unknown outside its own community. The original developer of Ruby on Rails is a programmer named David Heinemeier Hansson, who for a long time worked on the framework on his own before other people gradually began to join him.

The short video below shows the history of the collaboration graphically – what you see are pieces of code being virtually shuttled backward and forward between different contributors to the collaboration. There’s no need to watch the whole video, although it’s fun to do so: in the bottom right of the video you’ll see a date ticking over, and you can simply fast forward to January 2008, and watch until June 2008. Here’s the video:

(Edit: It’s better in high definition at Vimeo. As it is, it’s hard to see the dates – the relevant part of the video is roughly from 4:00 to 5:30.)

What you see, very vividly, is that in April 2008, a qualitative change occurs in the collaboration. Before April, you see a relatively slow and stately development process. After April, that process explodes with vastly more contributors and activity. What happened in April was this: the Ruby on Rails developers changed the tool they used to share code. Before April they used a tool called Subversion. In April of 2008 they switched to a new tool called Git (managed through GitHub). As changes go, this was similar to a group of writers changing the wiki software they use to collaborate on a shared set of documents. What’s interesting is that the effect on the collaboration was so dramatic, out of proportion to our everyday experience; it’s almost as though Ernest Hemingway had gotten a qualitative improvement in his writing by changing the type of pen he used to write.

I won’t say much here about the details of what caused the change. Briefly, Git and GitHub are a lot more social than Subversion, making it easier for people to go off and experiment with code on their own, to merge useful changes back in, and to track the activity of other people. Git was, in fact, developed by Linus Torvalds, to help Linux development scale better.

The background to all this is that I’ve been collecting some thoughts about the ongoing Polymath project, an experiment in open source mathematics, and the question of how projects like Polymath can be scaled up further. I’ll have more to say about that in future posts, but for now it seemed worth describing this striking example of how changes in technology can result in changes in the nature of collaboration.

11 comments

  1. Michael, apparently one of the reasons you see the explosion in commits is because Git correctly attributes the changeset to the author. In SVN days, the patch would be submitted by some author, but then one of the core team members would merge it in (under his/her name). Basically, same workflow with Git, but with proper attribution.

    Having said that, I think seeing other people commit and get their changes merged in also encourages other developers to join in on the fray!

  2. Very cool. I wonder if you could add any other metrics to this? For example, did the codebase grow substantially after the transition (not that bigger equals better, but it would be interesting)? Did the total number of people starting to use RoR increase? I know a number of people who stopped using RoR in 2008 because of problems with scaling things up. Is there any way to estimate the quality of the end product of all that code shuffling?
    Again – very cool visualization, thanks for spotting and sharing it!

  3. I’m with Benjamin – is there some other metric that can show that switching to Git indeed had an impact? A shame that it’s not as dramatic as you originally thought, but a very worthy theory to investigate. Thanks for linking the very cool data visualization!

  4. Steve and Benjamin – Yeah, I agree, Benjamin’s suggestion is a good one. Something that’s probably not too difficult to get is time series data for the size of the codebase. It’d be interesting to look at how that evolved over time. It may have some discontinuities in it for spurious reasons though, as large pre-written chunks of code get added in or taken out of Rails.

    Ilya, I don’t suppose you happen to have that easily available?

  5. Well, the size of the commits is already taken into consideration in the rendered video. Larger commits get larger ‘bubbles’. As far as ‘quality’ goes, let me know if you find a metric for that. 😉

  6. Michael, I hope you and some of your researcher friends might have time to look at a new site, called Sci-Mate. The site contains a variety of Web 2.0 tools, but more importantly, is intended to be itself a collaborative effort run and developed by researchers.

  7. How about the number of people who download the binaries? That says something about its value – not very much, but probably more than the size of the codebase.

  8. I read the May Physics World and the arguments rang bells with me. I have attended many seminars and conferences, and always felt the need for an immediately published Web Proceedings, screen-edited in user-friendly mode, with an e-mail contact opportunity with the author, plus an optional pdf download if one really needs it in print, which is rare.

    If this were to be standard practice, the type of networking you have in mind would be encouraged.

    The practice of waiting months for a print version is destructive, as is the practice of giving only the powerpoint. One wants the actual paper, recognising that it usually is draft, work in progress, so as to comment constructively, perhaps to place on record the comments one has made in the real presence.

    RoyJ
