{"id":591,"date":"2009-03-30T23:07:49","date_gmt":"2009-03-31T03:07:49","guid":{"rendered":"http:\/\/michaelnielsen.org\/blog\/?p=591"},"modified":"2009-03-31T00:25:49","modified_gmt":"2009-03-31T04:25:49","slug":"conscious-modularity-and-scaling-open-collaboration","status":"publish","type":"post","link":"https:\/\/michaelnielsen.org\/blog\/conscious-modularity-and-scaling-open-collaboration\/","title":{"rendered":"Conscious modularity and scaling open collaboration"},"content":{"rendered":"<p>I&#8217;ve recently been reviewing the history of open source software, and one thing I&#8217;ve been struck by is the enormous effort many open source projects put into making their development <em>modular<\/em>.  They do this so work can be divided up, making it easier to scale the collaboration, and so get the benefits of diverse expertise and more aggregate effort.<\/p>\n<p>I&#8217;m struck by this because I&#8217;ve sometimes heard sceptics of open science assert that software has a natural modularity which makes it easy to scale open source software projects, but that difficult science problems often have less natural modularity, and this makes it unlikely that open science will scale.<\/p>\n<p>It looks to me like what&#8217;s really going on is that the open sourcers have adopted a posture of <em>conscious modularity<\/em>. They&#8217;re certainly not relying on any sort of natural modularity, but are instead working hard to achieve and preserve a modular structure. Here are three striking examples: <\/p>\n<ul>\n<li> The open source Apache webserver software was   <a href=\"http:\/\/en.wikipedia.org\/wiki\/Apache_HTTP_Server#History_and_name\">originally<\/a>   a fork of a <a href=\"http:\/\/en.wikipedia.org\/wiki\/NCSA_HTTPd\">public     domain webserver<\/a> developed by the US National Center for   Supercomputing Applications (NCSA).  The NCSA project was largely   abandoned in 1994, and the group that became Apache took over.  
It   quickly became apparent that the old code base was far too   monolithic for a distributed effort, and the code base was   completely redesigned and overhauled to make it modular.\n<li> In September 1998 and June 2002 crises arose in Linux because of   community unhappiness at the slow rate new code contributions were   being accepted into the kernel.  In some cases contributions from   major contributors were being ignored completely.  The problem in   both 1998 and 2002 was that an overloaded Linus Torvalds was   becoming a single point of failure.  The situation was famously   summed up in 1998 by Linux developer Larry McVoy, who said simply   <a href=\"http:\/\/lkml.indiana.edu\/hypermail\/linux\/kernel\/9809.3\/0957.html\">&#8220;Linus     doesn&#8217;t scale&#8221;<\/a>.  This was a phrase repeated in a 2002   call-to-arms by Linux developer   <a href=\"http:\/\/lwn.net\/2002\/0131\/a\/patch-penguin.php3\">Rob Landley<\/a>.   The resolution in both cases was a major reorganization of the   project that allowed tasks formerly managed by Torvalds to be split   up among the Linux community.  In 2002, for instance, Linux switched   to an entirely new way of managing code, using a package called   BitKeeper, designed in part to make modular development easier.\n<li> One of the Mozilla projects is an issue tracking system   (Bugzilla), designed to make modular development easy, and which   Mozilla uses to organize development of the Firefox web browser.   Developing Bugzilla is a considerable overhead for Mozilla, but it&#8217;s   worth it to keep development modular.\n<\/ul>\n<p> The right lesson to learn from open source software, I think, is that it may be darned hard to achieve modularity in software development, but it can be worth it to reap the benefits of large-scale collaboration. 
Some parts of science may not be &#8220;naturally&#8221; modular, but that doesn&#8217;t mean they can&#8217;t be made modular with conscious effort on the part of scientists.  It&#8217;s a problem to be solved, not to give up on.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I&#8217;ve recently been reviewing the history of open source software, and one thing I&#8217;ve been struck by is the enormous effort many open source projects put into making their development modular. They do this so work can be divided up, making it easier to scale the collaboration, and so get the benefits of diverse&hellip; <a class=\"more-link\" href=\"https:\/\/michaelnielsen.org\/blog\/conscious-modularity-and-scaling-open-collaboration\/\">Continue reading <span class=\"screen-reader-text\">Conscious modularity and scaling open collaboration<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-591","post","type-post","status-publish","format-standard","hentry","entry"],"_links":{"self":[{"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/posts\/591","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/comments?post=591"}],"version-history":[{"count":0,"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/posts\/591\/revisions"}],"wp:attachment":[{"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/media?parent=591"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/categ
ories?post=591"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/michaelnielsen.org\/blog\/wp-json\/wp\/v2\/tags?post=591"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}