Is scientific publishing about to be disrupted?

Part I: How Industries Fail

Until three years ago, the oldest company in the world was the construction company Kongo Gumi, headquartered in Osaka, Japan. Kongo Gumi was founded in 578 CE when the then-regent of Japan, Prince Shotoku, brought a member of the Kongo family from Korea to Japan to help construct the first Buddhist temple in Japan, the Shitenno-ji. The Kongo Gumi continued in the construction trade for almost one and a half thousand years. In 2005, they were headed by Masakazu Kongo, the 40th of his family to head Kongo Gumi. The company had more than 100 employees, and 70 million dollars in revenue. But in 2006, Kongo Gumi went into liquidation, and its assets were purchased by Takamatsu Corporation. Kongo Gumi as an independent entity no longer exists.

How is it that large, powerful organizations, with access to vast sums of money, and many talented, hardworking people, can simply disappear? Examples abound – consider General Motors, Lehman Brothers and MCI Worldcom – but the question is most fascinating when it is not just a single company that goes bankrupt, but an entire industry that is disrupted. In the 1970s, for example, some of the world’s fastest-growing companies were firms like Digital Equipment Corporation, Data General and Prime. They made minicomputers like the legendary PDP-11. None of these companies exist today. A similar disruption is happening now in many media industries. CD sales peaked in 2000, shortly after Napster started, and have declined almost 30 percent since. Newspaper advertising revenue in the United States has declined 30 percent in the last three years, and the decline is accelerating: one third of that fall came in the last quarter.

There are two common explanations for the disruption of industries like minicomputers, music, and newspapers. The first explanation is essentially that the people in charge of the failing industries are stupid. How else could it be, the argument goes, that those enormous companies, with all that money and expertise, failed to see that services like iTunes and Last.fm are the wave of the future? Why did they not pre-empt those services by creating similar products of their own? Polite critics phrase their explanations less bluntly, but nonetheless many explanations boil down to a presumption of stupidity. The second common explanation for the failure of an entire industry is that the people in charge are malevolent. In that explanation, evil record company and newspaper executives have been screwing over their customers for years, simply to preserve a status quo that they personally find comfortable.

It’s true that stupidity and malevolence do sometimes play a role in the disruption of industries. But in the first part of this essay I’ll argue that even smart and good organizations can fail in the face of disruptive change, and that there are common underlying structural reasons why that’s the case. That’s a much scarier story. If you think the newspapers and record companies are stupid or malevolent, then you can reassure yourself that provided you’re smart and good, you don’t have anything to worry about. But if disruption can destroy even the smart and the good, then it can destroy anybody. In the second part of the essay, I’ll argue that scientific publishing is in the early days of a major disruption, with similar underlying causes, and will change radically over the next few years.

Why online news is killing the newspapers

To make our discussion of disruption concrete, let’s think about why many blogs are thriving financially, while the newspapers are dying. This subject has been discussed extensively in many recent articles, but my discussion is different because it focuses on identifying general structural features that don’t just explain the disruption of newspapers, but can also help explain other disruptions, like the collapse of the minicomputer and music industries, and the impending disruption of scientific publishing.

Some people explain the slow death of newspapers by saying that blogs and other online sources [1] are news parasites, feeding off the original reporting done by the newspapers. That’s false. While it’s true that many blogs don’t do original reporting, it’s equally true that many of the top blogs do excellent original reporting. A good example is the popular technology blog TechCrunch, by most measures one of the top 100 blogs in the world. Started by Michael Arrington in 2005, TechCrunch has rapidly grown, and now employs a large staff. Part of the reason it’s grown is that TechCrunch’s reporting is some of the best in the technology industry, comparable to, say, the technology reporting in the New York Times. Yet whereas the New York Times is wilting financially [2], TechCrunch is thriving, because TechCrunch’s operating costs are far lower, per word, than those of the New York Times. The result is that not only is the audience for technology news moving away from the technology section of newspapers and toward blogs like TechCrunch, but the blogs can also undercut the newspapers’ advertising rates. This depresses the price of advertising and causes the advertisers to move away from the newspapers.

Unfortunately for the newspapers, there’s little they can do to make themselves cheaper to run. To see why that is, let’s zoom in on just one aspect of newspapers: photography. If you’ve ever been interviewed for a story in the newspaper, chances are a photographer accompanied the reporter. You get interviewed, the photographer takes some snaps, and the photo may or may not show up in the paper. Between the money paid to the photographer and all the other costs, that photo probably costs the newspaper on the order of a few hundred dollars [3]. When TechCrunch or a similar blog needs a photo for a post, they’ll use a stock photo, or ask their subject to send them a snap, or whatever. The average cost is probably tens of dollars. Voila! An order of magnitude or more decrease in costs for the photo.

Here’s the kicker. TechCrunch isn’t being any smarter than the newspapers. It’s not as though no-one at the newspapers ever thought “Hey, why don’t we ask interviewees to send us a Polaroid, and save some money?” Newspapers employ photographers for an excellent business reason: good-quality photography is a distinguishing feature that can help establish a superior newspaper brand. For a high-end paper, it’s probably historically been worth millions of dollars to get stunning, Pulitzer Prize-winning photography. It makes complete business sense to spend a few hundred dollars per photo.

What can you do, as a newspaper editor? You could fire your staff photographers. But if you do that, you’ll destroy the morale not just of the photographers, but of all your staff. You’ll stir up the Unions. You’ll give a competitive advantage to your newspaper competitors. And, at the end of the day, you’ll still be paying far more per word for news than TechCrunch, and the quality of your product will be no more competitive.

The problem is that your newspaper has an organizational architecture which is, to use the physicists’ phrase, a local optimum. Relatively small changes to that architecture – like firing your photographers – don’t make your situation better, they make it worse. So you’re stuck gazing over at TechCrunch, who is at an even better local optimum, a local optimum that could not have existed twenty years ago:


[Figure: local_optimum.jpg – the newspaper’s current local optimum, and the new, better local optimum occupied by TechCrunch]

Unfortunately for you, there’s no way you can get to that new optimum without attempting passage through a deep and unfriendly valley. The incremental actions needed to get there would be hell on the newspaper. There’s a good chance they’d lead the Board to fire you.

The result is that the newspapers are locked into producing a product that’s of comparable quality (from an advertiser’s point of view) to the top blogs, but at far greater cost. And yet all their decisions – like the decision to spend a lot on photography – are entirely sensible business decisions. Even if they’re smart and good, they’re caught on the horns of a cruel dilemma.

The same basic story can be told about the disruption of the music industry, the minicomputer industry, and many others. Each industry has (or had) a standard organizational architecture. That organizational architecture is close to optimal, in the sense that small changes mostly make things worse, not better. Everyone in the industry uses some close variant of that architecture. Then a new technology emerges and creates the possibility for a radically different organizational architecture, using an entirely different combination of skills and relationships. The only way to get from one organizational architecture to the other is to make drastic, painful changes. The money and power that come from commitment to an existing organizational architecture actually place incumbents at a disadvantage, locking them in. It’s easier and more effective to start over, from scratch.

Organizational immune systems

I’ve described why it’s hard for incumbent organizations in a disrupted industry to change to a new model. The situation is even worse than I’ve described so far, though, because some of the forces preventing change are strongest in the best-run organizations. The reason is that those organizations are large, complex structures, and to survive and prosper they must contain a sort of organizational immune system dedicated to preserving that structure. If they didn’t have such an immune system, they’d fall apart in the ordinary course of events. Most of the time the immune system is a good thing, a way of preserving what’s good about an organization, and at the same time allowing healthy gradual change. But when an organization needs catastrophic, gut-wrenching change to stay alive, the immune system becomes a liability.

To see how such an immune system expresses itself, imagine someone at the New York Times had tried to start a service like Google News, prior to Google News. Even before the product launched they would have been constantly attacked from within the organization for promoting competitors’ products. They would likely have been forced to water down and distort the service, probably to the point where it was nearly useless for potential customers. And even if they’d managed to win the internal fight and launched a product that wasn’t watered down, they would then have been attacked viciously by the New York Times’ competitors, who would suspect a ploy to steal business. Only someone outside the industry could have launched a service like Google News.

Another example of the immune response is the stream of recent news pieces lamenting the death of newspapers. Here’s one such piece, from the Editor of the New York Times’ editorial page, Andrew Rosenthal:

There’s a great deal of good commentary out there on the Web, as you say. Frankly, I think it is the task of bloggers to catch up to us, not the other way around… Our board is staffed with people with a wide and deep range of knowledge on many subjects. Phil Boffey, for example, has decades of science and medical writing under his belt and often writes on those issues for us… Here’s one way to look at it: If the Times editorial board were a single person, he or she would have six Pulitzer prizes…

This is a classic immune response. It demonstrates a deep commitment to high-quality journalism, and the other values that have made the New York Times great. In ordinary times this kind of commitment to values would be a sign of strength. The problem is that as good as Phil Boffey might be, I prefer the combined talents of Fields Medallist Terry Tao, Nobel Prize winner Carl Wieman, MacArthur Fellow Luis von Ahn, acclaimed science writer Carl Zimmer, and thousands of others. The blogosphere has at least four Fields Medallists (the Nobel of math), three Nobelists, and many more luminaries. The New York Times can keep its Pulitzer Prizes. Other lamentations about the death of newspapers show similar signs of being an immune response. These people aren’t stupid or malevolent. They’re the best people in the business, people who are smart, good at their jobs, and well-intentioned. They are, in short, the people who have most strongly internalized the values, norms and collective knowledge of their industry, and thus have the strongest immune response. That’s why the last people to know an industry is dead are the people in it. I wonder whether Andrew Rosenthal and his colleagues understand that someone equipped with an RSS reader can assemble a set of news feeds that renders the New York Times virtually irrelevant. If a person inside an industry needs to frequently explain why it’s not dead, they’re almost certainly wrong.

What are the signs of impending disruption?

Five years ago, most newspaper editors would have laughed at the idea that blogs might one day offer serious competition. The minicomputer companies laughed at the early personal computers. New technologies often don’t look very good in their early stages, and that means a straight-up comparison of new to old is little help in recognizing impending disruption. That’s a problem, though, because the best time to recognize disruption is in its early stages. The journalists and newspaper editors who’ve only recognized their problems in the last three to four years are sunk. They needed to recognize the impending disruption back before blogs looked like serious competitors, when evaluated in conventional terms.

An early sign of impending disruption is when there’s a sudden flourishing of startup organizations serving an overlapping customer need (say, news), but whose organizational architecture is radically different to the conventional approach. That means many people outside the old industry (and thus not suffering from the blinders of an immune response) are willing to bet large sums of their own money on a new way of doing things. That’s exactly what we saw in the period 2000-2005, with organizations like Slashdot, Digg, Fark, Reddit, Talking Points Memo, and many others. Most such startups die. That’s okay: it’s how the new industry learns what organizational architectures work, and what don’t. But if even a few of the startups do okay, then the old players are in trouble, because the startups have far more room for improvement.

Part II: Is scientific publishing about to be disrupted?

What’s all this got to do with scientific publishing? Today, scientific publishers are production companies, specializing in services like editorial, copyediting, and, in some cases, sales and marketing. My claim is that in ten to twenty years, scientific publishers will be technology companies [4]. By this, I don’t just mean that they’ll be heavy users of technology, or employ a large IT staff. I mean they’ll be technology-driven companies in a similar way to, say, Google or Apple. That is, their foundation will be technological innovation, and most key decision-makers will be people with deep technological expertise. Those publishers that don’t become technology driven will die off.

Predictions that scientific publishing is about to be disrupted are not new. In the late 1990s, many people speculated that the publishers might be in trouble, as free online preprint servers became increasingly popular in parts of science like physics. Surely, the argument went, the widespread use of preprints meant that the need for journals would diminish. But so far, that hasn’t happened. Why it hasn’t happened is a fascinating story, which I’ve discussed in part elsewhere, and I won’t repeat that discussion here.

What I will do instead is draw your attention to a striking difference between today’s scientific publishing landscape, and the landscape of ten years ago. What’s new today is the flourishing of an ecosystem of startups that are experimenting with new ways of communicating research, some radically different to conventional journals. Consider Chemspider, the excellent online database of more than 20 million molecules, recently acquired by the Royal Society of Chemistry. Consider Mendeley, a platform for managing, filtering and searching scientific papers, with backing from some of the people involved in Last.fm and Skype. Or consider startups like SciVee (YouTube for scientists), the Public Library of Science, the Journal of Visualized Experiments, vibrant community sites like OpenWetWare and the Alzheimer Research Forum, and dozens more. And then there are companies like WordPress, Friendfeed, and Wikimedia, that weren’t started with science in mind, but which are increasingly helping scientists communicate their research. This flourishing ecosystem is not too dissimilar from the sudden flourishing of online news services we saw over the period 2000 to 2005.

Let’s look up close at one element of this flourishing ecosystem: the gradual rise of science blogs as a serious medium for research. It’s easy to miss the impact of blogs on research, because most science blogs focus on outreach. But more and more blogs contain high-quality research content. Look at Terry Tao’s wonderful series of posts explaining one of the biggest breakthroughs in recent mathematical history, the proof of the Poincare conjecture. Or Tim Gowers’ recent experiment in “massively collaborative mathematics”, using open source principles to successfully attack a significant mathematical problem. Or Richard Lipton’s excellent series of posts exploring his ideas for solving a major problem in computer science, namely, finding a fast algorithm for factoring large numbers. Scientific publishers should be terrified that some of the world’s best scientists, people at or near their research peak, people whose time is at a premium, are spending hundreds of hours each year creating original research content for their blogs, content that in many cases would be difficult or impossible to publish in a conventional journal. What we’re seeing here is a spectacular expansion in the range of the blog medium. By comparison, the journals are standing still.

This flourishing ecosystem of startups is just one sign that scientific publishing is moving from being a production industry to a technology industry. A second sign of this move is that the nature of information is changing. Until the late 20th century, information was a static entity. The natural way for publishers in all media to add value was through production and distribution, and so they employed people skilled in those tasks, and in supporting tasks like sales and marketing. But the cost of distributing information has now dropped almost to zero, and production and content costs have also dropped radically [5]. At the same time, the world’s information is now rapidly being put into a single, active network, where it can wake up and come alive. The result is that the people who add the most value to information are no longer the people who do production and distribution. Instead, it’s the technology people, the programmers.

If you doubt this, look at where the profits are migrating in other media industries. In music, they’re migrating to organizations like Apple. In books, they’re migrating to organizations like Amazon, with the Kindle. In many other areas of media, they’re migrating to Google: Google is becoming the world’s largest media company. They don’t describe themselves that way (see also here), but the media industry’s profits are certainly moving to Google. All these organizations are run by people with deep technical expertise. How many scientific publishers are run by people who know the difference between an INNER JOIN and an OUTER JOIN? Or who know what an A/B test is? Or who know how to set up a Hadoop cluster? Without technical knowledge of this type it’s impossible to run a technology-driven organization. How many scientific publishers are as knowledgeable about technology as Steve Jobs, Sergey Brin, or Larry Page?
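For readers who haven’t run one: an A/B test simply splits users at random between two variants of a page and asks whether the difference in some metric is larger than chance would explain. Here’s a minimal sketch in Python with invented numbers, using a two-proportion z-test; a real publisher would of course run this on live traffic with proper experiment infrastructure.

```python
# A minimal sketch of an A/B test: did variant B's abstract page get a
# higher click-through rate than variant A's, beyond what chance explains?
# The counts are invented for illustration.
from math import sqrt, erf

def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
    return z, p_value

z, p = two_proportion_z(clicks_a=120, views_a=5000, clicks_b=165, views_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")  # a small p suggests the difference is real
```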

I expect few scientific publishers will believe and act on predictions of disruption. One common response to such predictions is the appealing game of comparison: “but we’re better than blogs / wikis / PLoS One / …!” These statements are currently true, at least when judged according to the conventional values of scientific publishing. But they’re as irrelevant as the equally true analogous statements were for newspapers. It’s also easy to vent standard immune responses: “but what about peer review”, “what about quality control”, “how will scientists know what to read”. These questions express important values, but to get hung up on them suggests a lack of imagination much like Andrew Rosenthal’s defense of the New York Times editorial page. (I sometimes wonder how many journal editors still use Yahoo!’s human-curated topic directory instead of Google?) In conversations with editors I repeatedly encounter the same pattern: “But idea X won’t work / shouldn’t be allowed / is bad because of Y.” Well, okay. So what? If you’re right, you’ll be intellectually vindicated, and can take a bow. If you’re wrong, your company may not exist in ten years. Whether you’re right or not is not the point. When new technologies are being developed, the organizations that win are those that aggressively take risks, put visionary technologists in key decision-making positions, attain a deep organizational mastery of the relevant technologies, and, in most cases, make a lot of mistakes. Being wrong is a feature, not a bug, if it helps you evolve a model that works: you start out with an idea that’s just plain wrong, but that contains the seed of a better idea. You improve it, and you’re only somewhat wrong. You improve it again, and you end up the only game in town. Unfortunately, few scientific publishers are attempting to become technology-driven in this way. The only major examples I know of are Nature Publishing Group (with Nature.com) and the Public Library of Science. Many other publishers are experimenting with technology, but those experiments remain under the control of people whose core expertise is in other areas.

Opportunities

So far this essay has focused on the existing scientific publishers, and it’s been rather pessimistic. But of course that pessimism is just a tiny part of an exciting story about the opportunities we have to develop new ways of structuring and communicating scientific information. These opportunities can still be grasped by scientific publishers who are willing to let go and become technology-driven, even when that threatens to extinguish their old way of doing things. And, as we’ve seen, these opportunities are and will be grasped by bold entrepreneurs. Here’s a list of services I expect to see developed over the next few years. A few of these ideas are already under development, mostly by startups, but have yet to reach the quality level needed to become ubiquitous. The list could easily be continued ad nauseam – these are just a few of the more obvious things to do.

Personalized paper recommendations: Amazon.com has had this for books since the late 1990s. You go to the site and rate your favourite books. The system identifies people with similar taste, and automatically constructs a list of recommendations for you. This is not difficult to do: Amazon has published an early variant of its algorithm, and there’s an entire ecosystem of work, much of it public, stimulated by the Netflix Prize for movie recommendations. If you look in the original Google PageRank paper, you’ll discover that the paper describes a personalized version of PageRank, which can be used to build a personalized search and recommendation system. Google doesn’t actually use the personalized algorithm, because it’s far more computationally intensive than ordinary PageRank, and even for Google it’s hard to scale to tens of billions of webpages. But if all you’re trying to rank is (say) the physics literature – a few million papers – then it turns out that with a little ingenuity you can implement personalized PageRank on a small cluster of computers. It’s possible this can be used to build a system even better than Amazon or Netflix.
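To give a flavour of what’s involved, here’s a toy sketch of personalized PageRank in Python. The papers, citation links and interest weights below are invented for illustration; a real system would use the actual citation graph, handle papers with no outgoing citations more carefully, and distribute the computation across a cluster.

```python
# A minimal sketch of personalized PageRank on a toy citation graph.
# The "teleport" distribution is built from the reader's own ratings, so
# the resulting ranking is personalized rather than global.

def personalized_pagerank(links, preferences, damping=0.85, iterations=50):
    """links: dict mapping paper -> list of papers it cites.
    preferences: dict mapping paper -> the reader's interest weight."""
    papers = list(links)
    total = sum(preferences.get(p, 0.0) for p in papers) or 1.0
    teleport = {p: preferences.get(p, 0.0) / total for p in papers}
    rank = dict(teleport)
    for _ in range(iterations):
        # Restart with the personalized teleport vector, then spread rank
        # along citation links. Dangling papers are ignored for simplicity.
        new_rank = {p: (1 - damping) * teleport[p] for p in papers}
        for p in papers:
            cited = links[p]
            if not cited:
                continue
            share = damping * rank[p] / len(cited)
            for q in cited:
                new_rank[q] += share
        rank = new_rank
    return rank

# Invented example data, just to show the shape of the computation.
citations = {
    "quantum_error_correction": ["shor_factoring"],
    "shor_factoring": [],
    "topological_codes": ["quantum_error_correction", "shor_factoring"],
    "spin_glasses": [],
}
my_interests = {"topological_codes": 1.0}  # papers this reader rated highly

ranking = personalized_pagerank(citations, my_interests)
for paper, score in sorted(ranking.items(), key=lambda kv: -kv[1]):
    print(f"{paper}: {score:.3f}")
```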

A great search engine for science: ISI’s Web of Knowledge, Elsevier’s Scopus and Google Scholar are remarkable tools, but there’s still huge scope to extend and improve scientific search engines [6]. With a few exceptions, they don’t do even basic things like automatic spelling correction, good relevancy ranking of papers (preferably personalized), automated translation, or decent alerting services. They certainly don’t do more advanced things, like providing social features, or strong automated tools for data mining. Why not have a public API [7] so people can build their own applications to extract value out of the scientific literature? Imagine using techniques from machine learning to automatically identify underappreciated papers, or to identify emerging areas of study.
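As a toy illustration of the sort of thing such an API would enable, here’s a crude sketch in Python that flags papers which are widely downloaded but rarely cited as candidates for a closer look. The records, field names and thresholds are all invented; a real system would pull usage and citation data from the publisher’s API and apply far more sophisticated models.

```python
# A hedged sketch of one way an open literature API might be mined:
# flag papers whose readership is high but whose citation count lags,
# as candidate "underappreciated" papers. The records below are invented.

records = [
    {"title": "Paper A", "downloads": 5200, "citations": 310},
    {"title": "Paper B", "downloads": 4800, "citations": 12},
    {"title": "Paper C", "downloads": 150, "citations": 9},
]

def underappreciated(records, min_downloads=1000, max_ratio=0.01):
    """Return papers that are widely read but rarely cited, by a crude ratio test."""
    flagged = []
    for r in records:
        ratio = r["citations"] / r["downloads"]
        if r["downloads"] >= min_downloads and ratio <= max_ratio:
            flagged.append(r["title"])
    return flagged

print(underappreciated(records))  # ['Paper B']
```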

High-quality tools for real-time collaboration by scientists: Look at services like the collaborative editor Etherpad, which lets multiple people edit a document, in real time, through the browser. They’re even developing a feature allowing you to play back the editing process. Or the similar service from Google, Google Docs, which also offers shared spreadsheets and presentations. Look at social version control systems like Git and GitHub. Or visualization tools which let you track different people’s contributions. These are just a few of hundreds of general-purpose collaborative tools that are light-years beyond what scientists use. They’re not widely adopted by scientists yet, in part for superficial reasons: they don’t integrate with things like LaTeX and standard bibliographical tools. Yet achieving that kind of integration is trivial compared with the problems these tools do solve. Looking beyond, services like Google Wave may be a platform for startups to build a suite of collaboration clients that every scientist in the world will eventually use.

Scientific blogging and wiki platforms: With the exception of Nature Publishing Group, why aren’t the scientific publishers developing high-quality scientific blogging and wiki platforms? It would be easy to build upon the open source WordPress platform, for example, setting up a hosting service that makes it easy for scientists to set up a blog, and adds important features not present in a standard WordPress installation, like reliable signing of posts, timestamping, human-readable URLs, and support for multiple post versions, with the ability to see (and cite) a full revision history. A commenter-identity system could be created that enabled filtering and aggregation of comments. Perhaps most importantly, blog posts could be made fully citable.
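To sketch what reliable signing, timestamping and versioning might look like, here’s a toy Python example in which each revision of a post is hashed together with its predecessor, giving a tamper-evident revision history that could be cited by hash. Real infrastructure would of course use genuine digital signatures and a trusted timestamping service; the post content and author below are invented.

```python
# A toy sketch of a tamper-evident revision history for a blog post:
# each revision records a hash of its content plus the previous revision's
# hash, so any later alteration of an old revision is detectable. A real
# platform would add proper digital signatures and a timestamp authority.
import hashlib
import json
import time

def add_revision(history, author, content):
    prev_hash = history[-1]["hash"] if history else ""
    record = {
        "author": author,
        "content": content,
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "prev_hash": prev_hash,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    history.append(record)
    return record

history = []
add_revision(history, "alice", "First draft of the post.")
add_revision(history, "alice", "First draft, with a corrected proof sketch.")

# A citation could then reference a specific revision by its hash:
print(history[-1]["hash"][:12])
```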

On a related note, publishers could also help preserve some of the important work now being done on scientific blogs and wikis. Projects like Tim Gowers’ Polymath Project are an important part of the scientific record, but where is the record of that work going to be stored in 10 or 20 years’ time? The US Library of Congress has taken the initiative in preserving law blogs. Someone needs to step up and do the same for science blogs.

The data web: Where are the services making it as simple and easy for scientists to publish data as it is to publish a journal paper or start a blog? A few scientific publishers are taking steps in this direction. But it’s not enough to just dump data on the web. It needs to be organized and searchable, so people can find and use it. The data needs to be linked, as the utility of data sets grows in proportion to the connections between them. It needs to be citable. And there needs to be simple, easy-to-use infrastructure and expertise to extract value from that data. On every single one of these issues, publishers are at risk of being leapfrogged by companies like Metaweb, who are building platforms for the data web.
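As a rough sketch of the minimum such a service might record to make a data set findable, linked and citable, here’s a toy metadata record in Python. The identifiers, field names and URLs are invented placeholders, not real standards; a real service would build on community schemas and persistent identifiers such as DOIs.

```python
# A hedged sketch of a minimal "data web" record: a stable identifier to
# cite, links that connect the data set to related records, and enough
# description to make it searchable. All values here are invented.
dataset_record = {
    "identifier": "example:climate/station-42/2009",  # stable, citable ID
    "title": "Daily temperature readings, station 42",
    "creators": ["A. Researcher"],
    "license": "public-domain",
    "format": "csv",
    "links": {
        "derived_from": ["example:climate/raw-feed/2009"],
        "described_in": ["doi:10.xxxx/placeholder"],   # placeholder, not a real DOI
    },
    "access_url": "https://data.example.org/station-42/2009.csv",
}

def cite(record):
    """A crude machine-generated citation string for the data set."""
    return f"{', '.join(record['creators'])}. {record['title']}. {record['identifier']}."

print(cite(dataset_record))
```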

Why many services will fail: There have been many unsuccessful attempts to implement services like those I’ve just described. I’ve had journal editors explain to me that this shows there is no need for such services. I think in many cases there’s a much simpler explanation: poor execution [8]. Development projects are often led by senior editors or senior scientists whose hands-on technical knowledge is minimal, and whose day-to-day involvement is sporadic. Implementation is instead delegated to IT underlings with little power. It should surprise no one that the results are often mediocre. Developing high-quality web services requires deep knowledge and drive. The people who succeed at doing it are usually brilliant and deeply technically knowledgeable. Yet it’s surprisingly common to find projects being led by senior scientists or senior editors whose main claim to “expertise” is that they wrote a few programs as a grad student or postdoc, and who now think they can get a high-quality result with minimal extra technical knowledge. That’s not what it means to be technology-driven.

Conclusion: I’ve presented a pessimistic view of the future of current scientific publishers. Yet I hope it’s also clear that there are enormous opportunities to innovate, for those willing to master new technologies, and to experiment boldly with new ways of doing things. The result will be a great wave of innovation that not only changes how scientific discoveries are communicated, but also accelerates how they are made.

Notes

[1] We’ll focus on blogs to make the discussion concrete, but in fact many new forms of media are contributing to the newspapers’ decline, including news sites like Digg and MetaFilter, analysis sites like Stratfor, and many others. When I write “blogs” in what follows I’m usually referring to this larger class of disruptive new media, not literally to conventional blogs, per se.

[2] In a way, it’s ironic that I use the New York Times as an example. Although the New York Times is certainly going to have a lot of trouble over the next five years, in the long run I think they are one of the newspapers most likely to survive: they produce high-quality original content, show strong signs of becoming technology driven, and are experimenting boldly with alternate sources of content. But they need to survive the great newspaper die-off that’s coming over the next five or so years.

[3] In an earlier version of this essay I used the figure 1,000 dollars. That was sloppy – it’s certainly too high. The actual figure will certainly vary quite a lot from paper to paper, but for a major newspaper in a big city I think on the order of 200-300 dollars is a reasonable estimate, when all costs are factored in.

[4] I’ll use the term “companies” to include for-profit and not-for-profit organizations, as well as other organizational forms. Note that the physics preprint arXiv is arguably the most successful publisher in physics, yet is neither a conventional for-profit nor a conventional not-for-profit organization.

[5] This drop in production and distribution costs is directly related to the current move toward open access publication of scientific papers. This movement is one of the first visible symptoms of the disruption of scientific publishing. Much more can and has been said about the impact of open access on publishing; rather than review that material, I refer you to the blog “Open Access News”, and in particular to Peter Suber’s overview of open access.

[6] In the first version of this essay I wrote that the existing services were “mediocre”. That’s wrong, and unfair: they’re very useful services. But there’s a lot of scope for improvement.

[7] After posting this essay, Christina Pikas pointed out that Web of Science and Scopus do have APIs. That’s my mistake, and something I didn’t know.

[8] There are also services where the primary problem is cultural barriers. But for the ideas I’ve described cultural barriers are only a small part of the problem.

Acknowledgments: Thanks to Jen Dodd and Ilya Grigorik for many enlightening discussions.

About this essay: This essay is based on a colloquium given June 11, 2009, at the American Physical Society Editorial Offices. Many thanks to the people at the APS for being great hosts, and for many stimulating conversations.

Further reading:

Some of the ideas explored in this essay are developed at greater length in my book Reinventing Discovery: The New Era of Networked Science.

You can subscribe to my blog here.

My account of how industries fail was influenced by and complements Clayton Christensen’s book “The Innovator’s Dilemma”. Three of my favourite blogs about the future of scientific communication are “Science in the Open”, “Open Access News” and “Common Knowledge”. Of course, there are many more excellent sources of information on this topic. A good source aggregating these many sources is the Science 2.0 room on FriendFeed.

Update on the polymath project

A few brief comments on the first iteration of the polymath project, Tim Gowers’ ongoing experiment in collaborative mathematics:

  • The project is remarkably active, with nearly 300 substantive mathematical comments in just the first week. It shows few signs of slowing down.
  • It’s perhaps not (yet) a “massively” collaborative project, but many mathematicians are contributing – a quick pass over the comments suggests that so far 14 or so people have made substantive mathematical contributions, and it seems likely that number will rise further. Unsurprisingly, that number already rises considerably if you include people who have made comments on the collaborative process.
  • Regardless of the outcome of the project, I expect that many beginning research students in mathematics will find this a great resource for understanding what research is about. It’s a way of seeing research mathematicians as they work – trying ideas out, making occasional errors, backtracking, and so on. I suspect many students will find this incredibly enlightening. To pick just one example of why this may be, my experience is that many beginning students assume that the key to research success lies in having great leaps of insight to solve difficult problems. The discussion shows something quite different: you see excellent mathematicians following up every little lead, trying out many different approaches to problems, seeing many, many ideas fail, and gradually aggregating small insights, as a bigger picture only very slowly emerges.
  • The discussion so far has been courteous and professional in the highest degree. I suspect such courteous and professional behaviour greatly increases the chances of success in such a collaboration. I’m reminded of the famous Hardy-Littlewood rules for collaboration. Tim Gowers’ rules of collaboration have something of the same flavour.
  • One might say that this courtesy and professionalism is only to be expected, given the many professional mathematicians participating. Unfortunately, it’s not difficult to find excellent blogs run by professional scientists where the comment sections are notably less courteous and professional. I’ll omit examples.
  • Initially, I wasn’t so sure about the idea of using the linear medium of blog comments to run such a project. It seemed restrictive to use anything less than a multi-threaded forum, if forum software could be found that was geared towards mathematics. (Something like Google Groups would be good, but it doesn’t provide any way to display mathematics, so far as I’m aware.) The linear format has worked much better than I thought it would. Although at times it makes the discussion difficult to follow, the linear format has the benefit of preventing the conversation (and the collaborative community) from fracturing too much. This may be something to think about for future projects.
  • Many large-scale collaborative projects make it easy for late entrants to make a contribution. For example, in the Kasparov versus the World chess game, new participants could enter late in the game and come up to speed quickly. This was in part because of the nature of chess (only the current board matters, not past positions), but it was also partially because of the public analysis tree maintained for much of the game by Irina Krush. This acted as a key reference point for World Team decisions, and summarized much of the then-current best thinking about the game. In a similar way, many open source projects encourage late entry, with new participants able to jump in after looking at the existing code base (analogous to the state of the chess board), and the project wiki (analogous to the analysis tree). As the polymath project continues, I hope similar points of entry will enable outsiders to follow what is happening, and to contribute, without necessarily having to follow the entire discussion to that point.

Open notebook quantum information

Tobias Osborne has decided to take the plunge, becoming (so far as I know) the first person explicitly taking an open notebook approach to quantum information and related areas. He has three posts up; all three concern quantum analogues to Boolean formulae.

The Logic of Collective Action

It is a curious fact that one of the seminal works on open culture and open science was published in 1965 (2nd edition 1971), several decades before the modern open culture and open science movements began in earnest. Mancur Olson’s book “The Logic of Collective Action” is a classic of economics and political science, a classic that contains much of interest for people interested in open science.

At the heart of Olson’s book is a very simple question: “How can collective goods be provided to a group?” Here, a “collective good” is something that all participants in the group desire (though possibly unevenly), and that, by its nature, is inherently shared between all the members of the group.

For example, airlines may collectively desire a cut in airport taxes, since such a cut would benefit all airlines. Supermarkets may collectively desire a rise in the market price of bread; such a rise would be, to them, a collective good, since it would be by its nature shared. Most of the world’s countries desire a stable climate, even if they are not necessarily willing to individually take the action necessary to ensure a stable climate. Music-lovers desire a free and legal online version of the Beatles’ musical repertoire. Scientists desire shared access to scientific data, e.g., from the Sloan Digital Sky Survey or the Allen Brain Atlas.

What Olson shows in the book is that although all parties in a group may strongly desire and benefit from a particular collective good (e.g., a stable climate), under many circumstances they will not take individual action to achieve that collective good. In particular, they often find it in their individual best interest to act against their collective interest. The book has a penetrating analysis of what conditions can cause individual and collective interests to be aligned, and what causes them to be out of alignment.

The notes in the present essay are much more fragmented than my standard essays. Rather than a single thesis, or a few interwoven themes, these are more in the manner of personal working notes, broken up into separate fragments, each one exploring some idea presented by Olson, and explaining how (if at all) I see it relating to open science. I hope they’ll be of interest to others who are interested in open science. I’m very open to discussion, but please do note that what I present here is a greatly abbreviated version (and my own interpretation) of what is merely part of what Olson wrote, omitting many important caveats that he discusses in detail; for the serious open scientist, I strongly recommend reading Olson’s book, as well as some of the related literature.

Why individuals may not act to obtain a collective good: Consider a situation in which many companies are all producing some type of widget, with each company’s product essentially indistinguishable from that being produced by the other companies. Obviously, the entire group of companies would benefit from a rise in the market price of the widget; such a rise would be for them a collective good. One way that price could rise would be for the supply of the widget to be restricted. Despite this fact, it is very unlikely that any single company will act on their own to restrict their supply of widgets, for their restriction of supply is likely to have a substantial negative impact on their individual profit, but a negligible impact on the market price.

This analysis is surprisingly general. As a small player in a big pond, why voluntarily act to provide a collective good, when your slice of any benefit will be quite small (e.g., due to an infinitesimal rise in prices), but the cost to you is quite large? A farmer who voluntarily restricted output to cause a rise in the price of farm products (a collective good for farmers) would be thought a loon by their farming peers, because of (not despite) their altruistic behaviour. Open scientists will recognize a familiar problem: a scientist who voluntarily shares their best ideas and data (making them a collective good for scientists) in a medium that is not yet regarded as scientifically meritorious does not do their individual prospects any good. One of the major questions of open science is how to obtain this collective good.

Small groups and big players: Olson points out that the analysis of the last two paragraphs fails to hold in the case of small groups, or in any situation where there are one or more “big players”. To see this, let’s return to the case of a restriction in supply leading to a rise in market price. Suppose a very large company decides to restrict supply of a good, perhaps causing a drop in supply of 1 percent. Suppose that the market responds with a 4 percent rise in price. Provided the company has greater than one quarter market share, the result will actually be an increase in profitability for the company. That is, in this case the company’s individual interest and the collective interest are aligned, and so the collective interest can be achieved through voluntary action on the part of the company.
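To make the arithmetic behind that claim concrete, here’s a quick check in Python, using revenue as a rough stand-in for profitability and ignoring costs:

```python
# Quick check of the claim above: a firm with market share `share` cuts
# supply by 1% of the whole market, and the price rises 4%. Its revenue
# (in units of total market revenue before the cut) goes from `share` to
# (share - 0.01) * 1.04, which is a gain exactly when share > 0.26 --
# that is, when the firm holds slightly more than a quarter of the market.
def revenue_change(share, supply_cut=0.01, price_rise=0.04):
    before = share
    after = (share - supply_cut) * (1 + price_rise)
    return after - before

for share in (0.20, 0.26, 0.30, 0.50):
    print(f"market share {share:.2f}: revenue change {revenue_change(share):+.4f}")
```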

This argument obviously holds only if one actor is sufficiently large that the benefit they reap from the collective good is sufficient, on its own, to justify their action. Furthermore, the fact that the large company takes this action by no means ensures that smaller companies will engage in the same action on behalf of the collective good, although the smaller companies will certainly be happy to reap the benefit of the larger company’s actions; Olson speaks, for this reason, of an “exploitation of the great by the small”. Indeed, notice that the impact of this strategy is to cause the market share of the large company to shrink slightly, moving them closer to a world in which their individual benefit from collective action no longer justifies voluntary action on their part. (This shrinkage in market share also acts as a disincentive for them to act initially, despite the fact that in the short run profits will rise; this is a complication I won’t consider here.)

A closely related example may be seen in open source software. Many large companies – perhaps most famously, IBM and Sun – invest enormous quantities of money in open source software. Why do they provide this collective good for programmers and (sometimes) consumers? The answer is not as simple as the answer given in the last paragraph, because open source software is not a pure collective good. Many companies (including IBM and Sun) have developed significant revenue streams associated with open source, and they may benefit in other ways – community goodwill, and the disruption to the business models of competitors (e.g., Microsoft). Nonetheless, it seems likely that at least part of the reason they pour resources into open source is that purchasing tens of thousands of Windows licenses each year costs a company like IBM millions or tens of millions of dollars. At that scale, they can benefit substantially by instead putting that money to work making Linux better, and then using Linux for their operating system needs; the salient point is that because of IBM’s scale, it’s a large enough sum of money that they can expect to significantly improve Linux.

There is a similarity to some of the patterns seen in open data. Many open data projects are very large projects. I would go so far as to speculate that a quite disproportionate fraction of open data projects are very large projects – out of at most hundreds (more likely dozens) of projects funded at the one hundred million dollar plus level, I can think offhand of several that have open data; I’d be shocked if a similar percentage of “small science” experiments have open data policies.

Why is this the case? A partial explanation may be as follows. Imagine you are heading a big multi-institution collaboration that’s trying to get a one hundred million dollar experiment funded. You estimate that adopting an open data policy will increase your chances by three percent – i.e., it’s worth about 3 million dollars to your project. (I doubt many people really think quite this way, but in practice it probably comes to the same thing.) Now, making the data publicly available will increase the chances of outsiders “scooping” members of the collaboration. But the chance of this happening for any single member of the collaboration is rather small, especially if there is a brief embargo period before data is publicly released. By contrast, for a small experiment run in a single lab, the benefits of open data are much smaller, but the costs are comparable.

This can be slotted into a more sophisticated three-part analysis. First, the person running the collaboration often isn’t concerned about being scooped themselves. This isn’t always true, but it is often true, for the leader or leaders of such projects often become more invested in the big picture than they are in making individual discoveries. They will instead tend to view any discovery from data produced by the project as a victory for the project, regardless of who actually makes the discovery. To the extent that the leadership is unconcerned about being scooped, they therefore have every incentive to go for open data. Second, someone who wants to join the collaboration may have reservations about an open data policy, but may also feel that it is worth giving up exclusive rights over data in exchange for a more limited type of exclusive access to a much richer data set. Third, as I argued in the previous paragraph, the trade-offs involved in open data are in any case more favourable for large collaborations than they are in small experiments.

Olson’s analysis suggests asking whether it might be easier to transition to a more open scientific culture in small, relatively close-knit research communities. If a community has only a dozen or so active research groups, might a few of those groups decide to “go open”, and then perhaps convince their peers to do so as well? With passionate, persuasive and generous leadership maybe this would be possible.

When is collective action possible? Roughly speaking, Olson identifies the following possibilities:

  • When it is made compulsory. This is the case in many trade unions, with Government taxes, and so on.
  • When social pressure is brought to bear. This is usually more effective in small groups that are already bound by a common interest. With suitable skills, it can also have an impact in larger groups, but this is usually much harder to achieve.
  • When it is in people’s own best interests, and so occurs voluntarily. Olson argues that this mostly occurs in small groups, and that there is a tendency for “exploitation of the great by the small”. More generally, he argues that in a voluntary situation, while some collective action may take place, the level is usually distinctly suboptimal.
  • When people are offered some other individual incentive. Olson offers many examples: one of the more amusing was the report that some trade unions spend more than ten percent of their budget on Christmas parties, simply to convince their members that membership is worthwhile.

Many of these ideas will already be familiar in the context of open science. Compulsion can be used to force people to share openly, as in the NIH public access policy. Alternatively, by providing ways of measuring scientific contributions made in the open, it is possible to incentivize researchers to take a more open approach. This has contributed to the success of the preprint arXiv, with citation services such as Citebase making it straightforward to measure the impact a preprint is having.

This use of incentives means that the provision of open data (and other open knowledge) can gradually change from being a pure collective good to being a blend of a collective and a non-collective good. It becomes non-collective in the sense that the individual sharing the data derives some additional (unshared) benefit due to the act of sharing.

A similar transition occurred early in the history of science. As I have described elsewhere, early scientists such as Galileo, Hooke and Newton often went to great lengths to avoid sharing their scientific discoveries with others. They preferred to hoard their discoveries, and continue working in secret. The reason, of course, was that at the time shared results were close to a pure collective good; there was little individual incentive to share. With the introduction of the journal system, and the gradual professionalization of science, this began to change, with individuals having an incentive to share. Of course, that change only occurred very gradually, over a period of many decades. Nowadays, we take the link between publication and career success for granted, but that was something early journal editors (and others) had to fight for.

Similarly, online media are today going through a grey period. For example, a few years back, blogging was in many ways quite a disreputable activity for a scientist, fine for a hobby, but certainly not seen as a way of making a serious scientific contribution. It’s still a long way from being mainstream, but I think there are many signs that it’s becoming more accepted. As this process continues, online open science will shift from being a pure collective good to being a blend of a collective and non-collective good. As Olson suggests, this is a good way to thrive!

So, what use are networked tools for science? I’m occasionally asked: “If networked tools are so good for science, why haven’t we seen more aggressive adoption of those tools by scientists? Surely that shows that we’ve already hit the limits of what can be done, with email, Skype, and electronic journals?” Underlying this question is a presumption, the presumption that if the internet really has the potential to be as powerful a tool for science as I and others claim, then surely we scientists would have gotten together already to achieve it. More generally, it’s easy to presume that if a group of people (e.g., scientists) have a common goal (advancing science), then they will act together to achieve that goal. What’s important about Olson’s work is that it comprehensively shows the flaws in this argument. A group of people may all benefit greatly from some collective action, yet be unable to act together to achieve it. Olson shows that far from being unusual, this is in many ways to be expected.

Further reading

I’m writing a book about “The Future of Science”; this post is part of a series where I try out ideas from the book in an open forum. A summary of many of the themes in the book is available in this essay. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. I’ll email you to let you know in advance of publication. I will not use your email address for any other purpose! You can subscribe to my blog here.

Doing science online

This post is the text for an invited after-dinner talk about doing science online, given at the banquet for the Quantum Information Processing 2009 conference, held in Santa Fe, New Mexico, January 12-16, 2009.

Good evening.

Let me start with a few questions. How many people here tonight know what a blog is?

How many people read blogs, say once every week or so, or more often?

How many people actually run a blog themselves, or have contributed to one?

How many people read blogs, but won’t admit it in polite company?

Let me show you an example of a blog. It’s a blog called What’s New, run by UCLA mathematician Terence Tao. Tao, as many of you are probably aware, is a Fields Medal-winning mathematician. He’s known for solving many important mathematical problems, but is perhaps best known as the co-discoverer of the Green-Tao theorem, which proved the existence of arbitrarily long arithmetic progressions of primes.

Tao is also a prolific blogger, writing, for example, 118 blog posts in 2008. Popular stereotypes to the contrary, he’s not just sharing cat pictures with his mathematician buddies. Instead, his blog is a firehose of mathematical information and insight. To understand how valuable Tao’s blog is, let’s look at an example post, about the Navier-Stokes equations. As many of you know, these are the standard equations used by physicists to describe the behaviour of fluids, i.e., inside these equations is a way of understanding an entire state of matter.

The Navier-Stokes equations are notoriously difficult to understand. People such as Feynman, Landau, and Kolmogorov struggled for years attempting to understand their implications, mostly without much success. One of the Clay Millennium Prize problems is to prove the existence of a global smooth solution to the Navier-Stokes equations, for reasonable initial data.

Now, this isn’t a talk about the Navier-Stokes equations, and there’s far too much in Terry Tao’s blog post for me to do it justice! But I do want to describe some of what the post contains, just to give you the flavour of what’s possible in the blog medium.

Tao begins his post with a brief statement explaining what the Clay Millennium Problem asks. He shares the interesting tidbit that in two spatial dimensions the solution to the problem is known(!), and asks why it’s so much harder in three dimensions. He tells us that the standard answer is turbulence, and explains what that means, but then says that he has a different way of thinking about the problem, in terms of what he calls supercriticality. I can’t do his explanation justice here, but very roughly, he’s looking for invariants which can be used to control the behaviour of solutions to the equations at different length scales. He points out that all the known invariants give weaker and weaker control at short length scales. What this means is that the invariants give us a lot of control over solutions at long length scales, where things look quite regular, but little control at short length scales, where you see the chaotic variation characteristic of turbulence. He then surveys all the known approaches to proving global existence results for nonlinear partial differential equations – he says there are just three broad approaches – and points out that supercriticality is a pretty severe obstruction if you want to use one of these approaches.

The post has loads more in it, so let me speed this up. He describes the known invariants for the equations, and what they can be used to prove. He surveys and critiques existing attempts on the problem. He makes six suggestions for ways of attacking the problem, including one which may be interesting to some of the people in this audience: he suggests that pseudorandomness, as studied by computer scientists, may be connected to the chaotic, almost random behaviour that is seen in the solutions of the Navier-Stokes equations.

The post is filled to the brim with clever perspective, insightful observations, ideas, and so on. It’s like having a chat with a top-notch mathematician, who has thought deeply about the Navier-Stokes problem, and who is willingly sharing their best thinking with you.

Following the post, there are 89 comments. Many of the comments are from well-known professional mathematicians, people like Greg Kuperberg, Nets Katz, and Gil Kalai. They bat the ideas in Tao’s post backwards and forwards, throwing in new insights and ideas of their own. It spawned posts on other mathematical blogs, where the conversation continued.

That’s just one post. Terry Tao has hundreds of other posts, on topics like Perelman’s proof of the Poincare conjecture, quantum chaos, and gauge theory. Many posts contain remarkable insights, often related to open research problems, and they frequently stimulate wide-ranging and informative conversations in the comments.

That’s just one blogger. There are, of course, many other top-notch mathematician bloggers. Cambridge’s Tim Gowers, another Fields Medallist, also runs a blog. Like Tao’s blog, it’s filled with interesting mathematical insights and conversation, on topics like how to use Zorn’s lemma, dimension arguments in combinatorics, and a thought-provoking post on what makes some mathematics particularly deep.

Alain Connes, another Fields Medallist, is also a blogger. He only posts occasionally, but when he does his posts are filled with interesting mathematical tidbits. For example, I greatly enjoyed this post, where he talks about his dream of solving one of the deepest problems in mathematics – the problem of proving the Riemann Hypothesis – using non-commutative geometry, a field Connes played a major role in inventing.

Berkeley’s Richard Borcherds, another Fields Medallist, is also a blogger, although he is perhaps better described as an ex-blogger, as he hasn’t updated in about a year.

I’ve picked on Fields Medallists, in part because at least four of the 42 living Fields Medallists have blogs. But there are also many other excellent mathematical blogs, including blogs from people closely connected to the quantum information community, like Scott Aaronson, Dave Bacon, Gil Kalai, and many others.

Let me make a few observations about blogging as a medium.

It’s informal.

It’s rapid-fire.

Many of the best blog posts contain material that could not easily be published in a conventional way: small, striking insights, or perhaps general thoughts on how to approach a problem. These are the kinds of ideas that may be too small or incomplete to be published, but which often contain the seed of later progress.

You can think of blogs as a way of scaling up scientific conversation, so that conversations can become widely distributed in both time and space. Instead of just a few people listening as Terry Tao muses aloud in the hall or the seminar room about the Navier-Stokes equations, why not have a few thousand talented people listen in? Why not enable the most insightful to contribute their insights back?

You can also think of blogs as a way of making scientific conversation searchable. If you type “Navier-Stokes problem” into Google, the third hit is Terry Tao’s blog post about it. That means future mathematicians can easily benefit from his insight, and that of his commenters.

You might object that the most important papers about the Navier-Stokes problem should show up first in the search. There is some truth to this, but it's not quite right. Rather, insofar as Google is doing its job well, the ranking should reflect the importance and significance of the respective hits, regardless of whether those hits are papers, blog posts, or some other form. If you look at it this way, it's not so surprising that Terry Tao's blog post is near the top. As all of us know, when you're working on a problem, a good conversation with an insightful colleague may be worth as much as (and sometimes more than) reading the classic papers. Furthermore, as search engines become better personalized, the search results will better reflect your personal needs; in a search utopia, if Terry Tao's blog post is what you most need to see, it'll come up first, while if someone else's paper on the Navier-Stokes problem is what you most need to see, then that will come up first.

I've started this talk by discussing blogs because they are familiar to most people. But ideas about doing science in the open, online, have been developed far more systematically by people who are explicitly doing open notebook science. People such as Garrett Lisi are using mathematical wikis to develop their thinking online; Garrett has referred to the site as "my brain online". People such as chemists Jean-Claude Bradley and Cameron Neylon are doing experiments in the open, immediately posting their results for all to see. They're developing ideas like lab equipment that posts data in real time, posting data in formats that are machine-readable, enabling data mining, automated inference, and other services.

Stepping back, what tools like blogs, open notebooks and their descendants enable is filtered access to new sources of information, and to new conversation. The net result is a restructuring of expert attention. This is important because expert attention is the ultimate scarce resource in scientific research, and the more efficiently it can be allocated, the faster science can progress.

How many times have you been obstructed in your research by the need to prove or disprove a small result that is a little outside your core expertise, and so would take you days or weeks, but which you know, with certainty, the right person could resolve in minutes, if only you knew who that person was, and could easily get their attention? This may sound like a fantasy, but if you've worked on the right open source software projects, you'll know that this is exactly what happens in those projects – discussion forums for open source projects often have a constant flow of messages posing what seem like tough problems; quite commonly, someone with a great comparative advantage quickly posts a clever way to solve the problem.

If new online tools offer us the opportunity to restructure expert attention, then how exactly might it be restructured? One of the things we’ve learnt from economics is that markets can be remarkably effective ways of efficiently allocating scarce resources. I’ll talk now about an interesting market in expert attention that has been set up by a company named InnoCentive.

To explain InnoCentive, let me start with an example involving an Indian not-for-profit called the ASSET India Foundation. ASSET helps at-risk girls escape the Indian sex industry, by training them in technology. To do this, they’ve set up training centres in several large cities across India. They’ve received many requests to set up training centres in smaller towns, but many of those towns don’t have the electricity needed to power technologies like the wireless routers that ASSET uses in its training centers.

On the other side of the world, in the town of Waltham, just outside Boston, is the company InnoCentive. InnoCentive is, as I said, an online market in expert attention. It enables companies like Eli Lilly and Procter & Gamble to pose "Challenges" over the internet, scientific research problems they'd like solved, with a prize for a solution, often many thousands of dollars. Anyone in the world can download a detailed description of the Challenge, and attempt to win the prize. More than 160,000 people from 175 countries have signed up for the site, and prizes for more than 200 Challenges have been awarded.

What does InnoCentive have to do with ASSET India? Well, ASSET got in touch with the Rockefeller Foundation, and explained their desire for a low-cost solar-powered wireless router. Rockefeller put up 20,000 dollars in prize money to post an InnoCentive Challenge to design a suitable wireless router. The Challenge was posted for two months at InnoCentive. 400 people downloaded the Challenge, and 27 people submitted solutions. The prize was awarded to a 31-year-old Texan software engineer named Zacary Brown, who delivered exactly the kind of design that ASSET was looking for; a prototype is now being built by engineering students at the University of Arizona.

Let's come back to the big picture. These new forms of contribution – blogs, wikis, online markets and so forth – might sound wonderful, but you might reasonably ask whether they are a distraction from the real business of doing science. Should you blog, as a young postdoc trying to build up a career, rather than writing papers? Should you contribute to Wikipedia, as a young Assistant Professor, when you could be writing grants instead? Crucially, why would you share ideas in the manner of open notebook science, when other people might build on your ideas, maybe publishing papers on the subjects you're investigating, but without properly giving you credit?

In the short term, these are all important questions. But I think a lot of insight into these questions can be obtained by thinking first of the long run.

At the beginning of the 17th century, Galileo Galilei constructed the first astronomical telescope, looked up at the sky, and turned his new instrument to Saturn. He saw, for the first time in human history, Saturn's astonishing rings. Did he share this remarkable discovery with the rest of the world? He did not, for at the time that kind of sharing of scientific discovery was unimaginable. Instead, he announced his discovery by sending a letter to Kepler and several other early scientists, containing a Latin anagram, "smaismrmilmepoetaleumibunenugttauiras". When unscrambled this may be translated, roughly, as "I have discovered Saturn three-formed". The reason Galileo announced his discovery in this way was so that he could establish priority, should anyone after him see the rings, while avoiding revealing the discovery.

Galileo could not imagine a world in which it made sense for him to freely share a discovery like the rings of Saturn, rather than hoarding it for himself. Certainly, he couldn’t share the discovery in a journal article, for the journal system was not invented until more than 20 years after Galileo died. Even then, journals took decades to establish themselves as a legitimate means of sharing scientific discoveries, and many early scientists looked upon journals with some suspicion. The parallel to the suspicion many scientists have of online media today is striking.

Think of all the knowledge we have, which we do not share. Theorists hoard clever observations and questions, little insights which might one day mature into a full-fledged paper. Entirely understandably, we hoard those insights against that day, doling them out only to trusted friends and close colleagues. Experimentalists hoard data; computational scientists hoard code. Most scientists, like Galileo, can't conceive of a world in which it makes sense to share all that information, in which sharing information on blogs, wikis, and their descendants is viewed as being (potentially, at least) an important contribution to science.

Over the short term, things will only change slowly. We are collectively very invested in the current system. But over the long run, a massive change is, in my opinion, inevitable. The advantages of change are simply too great.

There's a story, almost certainly apocryphal, that the physicist Michael Faraday was approached after a lecture by Queen Victoria, and asked to justify his research on electricity. Faraday supposedly replied "Of what use is a newborn baby?"

Blogs, wikis, open notebooks, InnoCentive and the like aren’t the end of online innovation. They’re just the beginning. The coming years and decades will see far more powerful tools developed. We really will enormously scale up scientific conversation; we will scale up scientific collaboration; we will, in fact, change the entire architecture of expert attention, developing entirely new ways of navigating data, making connections and inferences from data, and making connections between people.

When we look back at the second half of the 17th century, it's obvious that one of the great changes of the time was the invention of modern science. When historians look back at the early part of the twenty-first century, they will also see several major changes. I know many of you in this room believe that one of those changes will be related to the sustainability of how humans live on this planet. But I think there are at least two other major historical changes. The first is the fact that this is the time in history when the world's information is being transformed from an inert, passive, widely separated state into a single, unified, active system that can make connections, that brings that information alive. The world's information is waking up.

The second of those changes, closely related to the first, is that we are going to change the way scientists work; we are going to change the way scientists share information; we are going to change the way expert attention itself is allocated, developing new methods for connecting people, for organizing people, for leveraging people’s skills. They will be redirected, organized, and amplified. The result will speed up the rate at which discoveries are made, not in one small corner of science, but across all of science.

Quantum information and computation is a wonderful field. I was touched and surprised by the invitation to speak tonight. I have, I think, never felt more honoured in my professional life. But, I trust you can understand when I say that I am also tremendously excited by the opportunities that lie ahead in doing science online.

Further reading

I’m writing a book about “The Future of Science”; this post is part of a series where I try out ideas from the book in an open forum. A summary of many of the themes in the book is available in this essay. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. I’ll email you to let you know in advance of publication. I will not use your email address for any other purpose! You can subscribe to my blog here.

When can the long tail be leveraged?

In 2006, Chris Anderson, the editor-in-chief of Wired magazine, wrote a bestselling book about an idea he called the long tail. The long tail is nicely illustrated by the bookselling business. Until recently, the conventional wisdom in bookselling was to stock only bestsellers. But internet bookstores such as Amazon.com take a different approach, stocking everything in print. According to Anderson, about a quarter of Amazon’s sales come from the long tail of books outside the top 100,000 bestselling titles (see here for the original research). While books in the long tail don’t individually sell many copies, they greatly outnumber the bestsellers, and so what they lack in individual sales they make up in total sales volume.

The long tail attracted attention because it suggested a new business model, selling into the long tail. Companies like Amazon, Netflix, and Lulu have built businesses doing just that. It also attracted attention because it suggested that online collaborations like Wikipedia and Linux might be benefitting greatly from the long tail of people who contribute just a little.

The problem if you're building a business or online collaboration is that it can be difficult to tell whether participation is dominated by the long tail or not. Take a look at these two graphs:

[Two graphs: an idealized plot of Amazon's book sales versus sales rank, and an idealized plot of the number of Wikipedia edits versus contributor rank.]

The first graph is an idealized graph of Amazon’s book sales versus the sales rank, [tex]r[/tex], of the book. The second graph is an idealized graph of the number of edits made by the [tex]r[/tex]th most prolific contributor to Wikipedia. Superficially, the two graphs look similar, and it’s tempting to conclude that both graphs have a long tail. In fact, the two have radically different behaviour. In this post I’ll describe a general-purpose test that shows that Amazon.com makes it (just!) into the long tail regime, but in Wikipedia contributions from the short head dominate. Furthermore, this difference isn’t just an accident, but is a result of design decisions governing how people find information and make contributions.

Let’s get into more detail about the specifics of the Amazon and Wikipedia cases, before turning to the big picture. The first graph above shows the function

[tex]a / r^{0.871},[/tex]

where [tex]a[/tex] is a constant of proportionality, and [tex]r[/tex] is the rank of the book. The exponent is chosen to be [tex]0.871[/tex] because, as of 2003, that makes the function a pretty close approximation to the number of copies Amazon sells of the book with sales rank [tex]r[/tex]. For our analysis, it doesn't much matter what the value of [tex]a[/tex] is, so we won't worry about pinning it down. All the important stuff is contained in the [tex]r^{0.871}[/tex] in the denominator.

The second graph shows the function

[tex]a / r^{1.7}.[/tex]

As with the Amazon sales formula, the Wikipedia edit formula isn’t exact, but rather is an approximation. I extracted the formula from a blog post written by a researcher studying Wikipedia at the Xerox PARC Augmented Cognition Center. I mention this because they don’t actually determine the exponent 1.7 themselves – I backed it out from one of their graphs. Note that, as for the Amazon formula, [tex]a[/tex] is a constant of proportionality whose exact value doesn’t matter. There’s no reason the values of the Wikipedia [tex]a[/tex] and the Amazon [tex]a[/tex] should be the same; I’m using the same letter in both formulas simply to avoid a profusion of different letters.

(A little parenthetical warning: figuring out power law exponents is a surprisingly subtle problem. It’s possible that my estimate of the exponent in the last paragraph may be off. See, e.g., this paper for a discussion of some of the subtleties, and references to the literature. If someone with access to the raw data wants to do a proper analysis, I’d be interested to know the results. In any case, we’ll see that the correct value for the exponent would need to be wildly different from my estimate before it could make any difference to the qualitative conclusions we’ll reach.)

Now suppose the total number of different books Amazon stocks in their bookstore is [tex]N[/tex]. We’ll show a bit later that the total number of books sold is given approximately by:

[tex]7.75 \times a \times N^{0.129}.[/tex]

The important point in this formula is that as [tex]N[/tex] increases the total number of books sold grows fairly rapidly. Double [tex]N[/tex] and you get a nearly ten percent increase in total sales. There’s a big benefit to being in the business of the long tail of books.
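
As a quick check of that claim: doubling [tex]N[/tex] multiplies the total by a factor of [tex]2^{0.129} \approx 1.09[/tex], an increase of a bit more than nine percent.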

Let’s move to the second graph, the number of Wikipedia edits. If the total number of editors is [tex]N[/tex], then we’ll show below that the total number of edits made is approximately

[tex]2.05 \times a - O\left( \frac{a}{N^{0.7}} \right).[/tex]

The important point here is that, in contrast to the Amazon example, as [tex]N[/tex] increases it makes little difference to the total number of edits made. In Wikipedia, the total number of edits is dominated by the short head of editors who contribute a great deal.

A general rule to decide whether the long tail or the short head dominates

Let’s generalize the above discussion. We’ll find a simple general rule that can be used to determine whether the long tail or the short head dominates. Suppose the pattern of participation is governed by a power law distribution, with the general form

[tex]\frac{a}{r^b},[/tex]

where [tex]a[/tex] and [tex]b[/tex] are both constants. Both the Amazon and Wikipedia data can be described in this way, and it turns out that many other phenomena are described similarly – if you want to dig into this, I recommend the review papers on power laws by Mark Newman and Michael Mitzenmacher.

Let’s also suppose the total number of “participants” is [tex]N[/tex], where I use the term participants loosely – it might mean the total number of books on sale, the total number of contributors to Wikipedia, or whatever is appropriate to the situation. Our interest will be in summing the contributions of all participants.

When [tex]b < 1[/tex], the sum over all values of [tex]r[/tex] is approximately

[tex]\frac{a N^{1-b}}{1-b}.[/tex]

Thus, this case is tail-dominated, with the sum continuing to grow reasonably rapidly as [tex]N[/tex] grows. As we saw earlier, this is the case for Amazon's book sales, so Amazon really is a case where the long tail is in operation.

When [tex]b = 1[/tex], the total over all values of [tex]r[/tex] is approximately

[tex]a \log N.[/tex]

This also grows as [tex]N[/tex] grows, but extremely slowly. It's really an edge case between tail-dominated and head-dominated.

Finally, when [tex]b > 1[/tex], the total over all values of [tex]r[/tex] is approximately

[tex]a\zeta(b)-O\left(\frac{a}{N^{b-1}}\right),[/tex]

where [tex]\zeta(b)[/tex] is just a constant (actually, the Riemann zeta function, evaluated at [tex]b[/tex]), and the size of the corrections is of order [tex]a/N^{b-1}[/tex]. It follows that for large [tex]N[/tex] this approaches a constant value, and increasing the value of [tex]N[/tex] has little effect, i.e., this case is head-dominated. So, for example, it means that the great majority of edits to Wikipedia really are made by a small handful of dedicated contributors.
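
Where do these three approximations come from? Here's a quick sketch. The sum [tex]\sum_{r=1}^N a/r^b[/tex] is well approximated by the integral [tex]\int_1^N a\,dr/r^b[/tex]. For [tex]b < 1[/tex] the integral is [tex]a(N^{1-b}-1)/(1-b)[/tex], which for large [tex]N[/tex] is approximately [tex]a N^{1-b}/(1-b)[/tex]; putting in [tex]b = 0.871[/tex] gives the factor [tex]1/0.129 \approx 7.75[/tex] in the Amazon formula above. For [tex]b = 1[/tex] the integral is [tex]a \log N[/tex]. For [tex]b > 1[/tex] the infinite sum converges to [tex]a \zeta(b)[/tex], and the tail beyond [tex]N[/tex] that we omit is of order [tex]a N^{1-b}/(b-1)[/tex], which is where the correction term comes from.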

There is a caveat to all this discussion, which is that in the real world power laws are usually just an approximation. For many real world cases, the power law breaks down at the end of the tail, and at the very head of the distribution. The practical implication is that the quantitative values predicted by the above formula may be somewhat off. In practice, though, I don’t think this caveat much matters. Provided the great bulk of the distribution is governed by a power law, this analysis gives insight into whether it’s dominated by the head or by the tail.
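
To make the head-versus-tail contrast concrete, here is a minimal numerical sketch (in Python; the constant of proportionality [tex]a[/tex] is set to one, since it only rescales the totals) that sums the power law directly for the two exponents used above:

    # Minimal numerical sketch: total participation sum(1/r^b) for r = 1..N,
    # with the constant of proportionality a set to 1 (it only rescales totals).
    # b = 0.871 is the Amazon-like (tail-dominated) exponent, b = 1.7 the
    # Wikipedia-like (head-dominated) exponent.

    def total(N, b):
        """Brute-force sum of 1/r^b for r = 1..N."""
        return sum(r ** -b for r in range(1, N + 1))

    for b in (0.871, 1.7):
        print(f"b = {b}")
        for N in (10**3, 10**4, 10**5, 10**6):
            print(f"  N = {N:>9,}  total = {total(N, b):8.2f}")

For [tex]b = 0.871[/tex] the totals keep growing appreciably even at [tex]N = 10^6[/tex], while for [tex]b = 1.7[/tex] they flatten out just above 2, close to [tex]\zeta(1.7) \approx 2.05[/tex] – exactly the head-dominated behaviour described above.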

Implications

If you’re developing a long tail business or collaboration, you need to make sure the exponent [tex]b[/tex] in the power law is less than one. The smaller the exponent, the better off you’ll be.

How can you make the exponent as small as possible? In particular, how can you make sure it's smaller than the magic value of one? To understand the answer to this question, we need to understand what actually determines the value of the exponent. There are some nice simple mathematical models explaining how power laws emerge, and in particular how the power law exponent emerges. At some point in the future I'd like to come back and discuss those in detail, and what implications they have for site architecture. This post is already long enough, though, so let me just make three simple comments.

First, focus on developing recommendation and search systems which spread attention out, rather than concentrating it in the short head of what’s already popular. This is difficult to do without sacrificing quality, but there’s some interesting academic work now being done on such recommendation systems – see, for example, some of the work described in this recent blog post by Daniel Lemire.

Second, in collaborative projects, ensure a low barrier to entry for newcomers. One problem Wikipedia faces is a small minority of established Wikipedians who are hostile to new editors. It’s not common, but it is there. This drives newcomers away, and so concentrates edits within the group of established editors, effectively increasing the exponent in the power law.

Third, the essence of these and other similar recommendations is that they are systematic efforts to spread attention and contribution out, not one-off efforts toward developing a long tail of sales or contributions. The problem with one-off efforts is that they do nothing to change the systematic architectural factors which actually determine the exponent in the power law, and it is that exponent which is the critical factor.


The role of open licensing in open science

The open science movement encourages scientists to make scientific information freely available online, so other scientists may reuse and build upon that information. Open science has many striking similarities to the open culture movement, developed by people like Lawrence Lessig and Richard Stallman. Both movements share the idea that powerful creative forces are unleashed when creative artifacts are freely shared in a creative commons, enabling other people to build upon and extend those artifacts. The artifact in question might be a set of text documents, like Wikipedia; it might be open source software, like Linux; or open scientific data, like the data from the Sloan Digital Sky Survey, used by services such as Galaxy Zoo. In each case, open information sharing enables creative acts not conceived by the originators of the information content.

The advocates of open culture have developed a set of open content licenses, essentially a legal framework, based on copyright law, which strongly encourages and in some cases forces the open sharing of information. This open licensing strategy has been very successful in strengthening the creative commons, and so moving open culture forward.

When talking to some open science advocates, I hear a great deal of interest and enthusiasm for open licenses for science. This enthusiasm seems prompted in part by the success of open licenses in promoting open culture. I think this is great – with a few minor caveats, I’m a proponent of open licenses for science – but the focus on open licenses sometimes bothers me. It seems to me that while open licenses are important for open science, they are by no means as critical as they are to open culture; open access is just the beginning of open science, not the end. This post discusses to what extent open licenses can be expected to play a role in open scientific culture.

Open licenses and open culture

Let me review the ideas behind the licensing used in the open culture movement. If you’re familiar with the open culture movement, you’ll have heard this all before; if you haven’t, hopefully it’s a useful introduction. In any case, it’s worth getting all this fixed in our heads before addressing the connection to open science.

The obvious thing for advocates of open culture to do is to get to work building a healthy public domain: writing software, producing movies, writing books and so on, releasing all that material into the public domain, and encouraging others to build upon those works. They could then use a moral suasion argument to encourage others to contribute back to the public domain.

The problem is that many people and organizations don’t find this kind of moral suasion very compelling. Companies take products from the public domain, build upon them, and then, for perfectly understandable reasons, fiercely protect the intellectual property they produce. Disney was happy to make use of the old tale of Cinderella, but they take a distinctly dim view of people taking their Cinderella movie and remixing it.

People like Richard Stallman and Lawrence Lessig figured out how to add legal teeth to the moral suasion argument. Instead of relying on goodwill to get people to contribute back to the creative commons, they invented a new type of licensing that compels people to contribute back. There's now a whole bunch of such open licenses – the various varieties of the GNU General Public License (GPL), Creative Commons licenses, and many others – with various technical differences between them. But there's a basic idea of viral licensing that's common to many (though not all) of the open licenses. This is the idea that anyone who extends a product released under such a license must release the extension under the same terms. Using such an open license is thus a lot like putting material into the public domain, in that both result in content being available in the creative commons, but the viral open licenses differ from the public domain in compelling people to contribute back into the creative commons.

The consequences of this compulsion are interesting. In the early days of open licensing, the creative commons grew slowly. As the amount of content with an open license grew, though, things began to change. This has been most obvious in software development, which was where viral open licenses first took hold. Over time it became more tempting for software developers to start development with an existing open source product. Why develop a new product from scratch, when you can start with an existing codebase? This means that you can’t use the most obvious business model – limit distribution to executable files, and charge for them – but many profitable open source companies have shown that alternate business models are possible. The result is that as time has gone on, even the most resolutely closed source companies (e.g., Microsoft) have found it difficult to avoid infection by open source. The result has been a gradually accelerating expansion of the creative commons, an expansion that has enabled extraordinary creativity.

Open licenses and open science

I’m not sure what role licensing will play in open science, but I do think there are some clear signs that it’s not going to be as central a role as it’s played in open culture.

The first reason for thinking this is that a massive experiment in open licensing has already been tried within science. By law, works produced by US Federal Government employees are, with some caveats, automatically put into the public domain. Every time I've signed a "Copyright Transfer" agreement with an academic journal, there's always been in the fine print a clause excluding US Government employees from having to transfer copyright. You can't give away what you don't own.

This policy has greatly enriched the creative commons. And it’s led to enormous innovation – for example, I’ve seen quite a few mapping services that build upon US Government data, presumably simply because that data is in the public domain. But in the scientific realm I don’t get the impression that this is doing all that much to promote the growth of the same culture of mass collaboration as open licenses are enabling.

(A similar discussion can be had about open access journals. The discussion there is more complex, though, because (a) many of the journals have only been open access for a few years, and (b) the way work is licensed varies a lot from journal to journal. That's why I've focused on the US Government.)

The second reason for questioning the centrality of open licenses is the observation that the main barriers to remixing and extension of scientific content aren’t legal barriers. They are, instead, cultural barriers. If someone copies my work, as a scientist, I don’t sue them. If I were to do that, it’s in any case doubtful that the courts would do more than slap the violator on the wrist – it’s not as though they’ll directly make money. Instead, there’s a strong cultural prohibition against such copying, expressed through widely-held community norms about plagiarism and acceptable forms of attribution. If someone copies my work, the right way to deal with it is to inform their colleagues, their superiors, and so on – in short, to deal with it by cultural rather than legal means.

That’s not to say there isn’t a legal issue here. But it’s a legal issue for publishers, not individual scientists. Many journal publishers have business models which are vulnerable to systematic large-scale attempts to duplicate their content. Someone could, for example, set up a “Pirate Bay” for scientific journal articles, making the world’s scientific articles freely available. That’s something those journals have to worry about, for legitimate short-term business reasons, and copyright law provides them with some form of protection and redress.

My own opinion is that over the long run, it's likely that the publishers will move to open access business models, and that will be a good thing for open science. I might be wrong about that; I can imagine a world in which that doesn't happen, yet certain varieties of open science still flourish. Regardless of what you think about the future of journals, the larger point is that the legal issues around openness are only a small part of a much larger set of issues, issues which are mostly cultural. The key to moving to a more open scientific system is changing scientists' hearts and minds about the value and desirability of more openly sharing information, not reforming the legal rights under which they publish content.

So, what's the right approach to licensing? John Wilbanks has argued, persuasively in my opinion, that data should be held in the public domain. I've sometimes wondered if this argument shouldn't be extended beyond data, to all forms of scientific content, including papers, provided (and this is a big "provided") the publisher's business interests can be met in a way that adequately serves all parties. After all, if the scientific community is primarily a reputation economy, built around cultural norms, then why not simply remove the complication of copyright from the fray?

Now, I should say that this is speculation on my part, and my thinking is incomplete on this set of issues. I’m most interested to hear what others have to say! I’m especially interested in efforts to craft open research licenses, like the license Victoria Stodden has been developing. But I must admit that it’s not yet clear to me why, exactly, we need such licenses, or what interests they serve.


How Are the Mighty Fallen

Joshua Gans writes to point out a paper he wrote with George Shepherd, “How Are the Mighty Fallen: Rejected Classic Articles by Leading Economists” (link to pdf), about the experience some leading economists have had with peer review:

We asked over 140 leading economists, including all living winners of the Nobel Prize and John Bates Clark Medal, to describe instances in which journals rejected their papers. We hit a nerve. More than 60 percent responded, many with several blistering pages. Paul Krugman expressed the tone of many letters: “Thanks for the opportunity to let off a bit of steam.”

The paper is extremely readable (and entertaining), if you have any interest in peer review. Among other tidbits: an extraordinary list of rejected papers, many of them among the classics of economics; the estimate from Krugman that 60% of his papers are rejected on first try; the remarkable story of George Akerlof’s Nobel Prize-Winning paper “The Market for Lemons”, rejected by three separate journals before being published; the two rejections of the famous Black-Scholes options pricing paper, also Nobel Prize-Winning; Krugman’s comment that “I am having a terrible time with my current work on economic geography: referees tell me that it’s obvious, it’s wrong, and anyway they said it years ago.” There’s much more.

Addendum: Joshua also pointed me to a retrospective on the article (pdf here), which makes for interesting followup reading.

Three myths about scientific peer review

What’s the future of scientific peer review? The way science is communicated is currently changing rapidly, leading to speculation that the peer review system itself might change. For example, the wildly successful physics preprint arXiv is only very lightly moderated, which has led many people to wonder if the peer review process might perhaps die out, or otherwise change beyond recognition.

I’m currently finishing up a post on the future of peer review, which I’ll post in the near future. Before I get to that, though, I want to debunk three widely-believed myths about peer review, myths which can derail sensible discussion of the future of peer review.

A brief terminological note before I get to the myths: the term “peer review” can mean many different things in science. In this post, I restrict my focus to the anonymous peer review system scientific journals use to decide whether to accept or reject scientific papers.

Myth number 1: Scientists have always used peer review

The myth that scientists adopted peer review broadly and early in the history of science is surprisingly widely believed, despite being false. It’s true that peer review has been used for a long time – a process recognizably similar to the modern system was in use as early as 1731, in the Royal Society of Edinburgh’s Medical Essays and Observations (ref). But in most scientific journals, peer review wasn’t routine until the middle of the twentieth century, a fact documented in historical papers by Burnham, Kronick, and Spier.

Let me give a few examples to illustrate the point.

As a first example, we’ll start with the career of Albert Einstein, who wasn’t just an outstanding scientist, but was also a prolific scientist, publishing more than 300 journal articles between 1901 and 1955. Many of Einstein’s most ground-breaking papers appeared in his “miracle year” of 1905, when he introduced new ways of understanding space, time, energy, momentum, light, and the structure of matter. Not bad for someone unable to secure an academic position, and working as a patent clerk in the Swiss patent office.

How many of Einstein’s 300 plus papers were peer reviewed? According to the physicist and historian of science Daniel Kennefick, it may well be that only a single paper of Einstein’s was ever subject to peer review. That was a paper about gravitational waves, jointly authored with Nathan Rosen, and submitted to the journal Physical Review in 1936. The Physical Review had at that time recently introduced a peer review system. It wasn’t always used, but when the editor wanted a second opinion on a submission, he would send it out for review. The Einstein-Rosen paper was sent out for review, and came back with a (correct, as it turned out) negative report. Einstein’s indignant reply to the editor is amusing to modern scientific sensibilities, and suggests someone quite unfamiliar with peer review:

Dear Sir,

We (Mr. Rosen and I) had sent you our manuscript for publication and had not authorized you to show it to specialists before it is printed. I see no reason to address the in any case erroneous comments of your anonymous expert. On the basis of this incident I prefer to publish the paper elsewhere.

Respectfully,

P.S. Mr. Rosen, who has left for the Soviet Union, has authorized me to represent him in this matter.

As a second example, consider the use of peer review at the journal Nature. The prestige associated with publishing in Nature is, of course, considerable, and so competition to get published there is tough. According to Nature’s website, only 8 percent of submissions are accepted, and the rest are rejected. Given this, you might suppose that Nature has routinely used peer review for a long time as a way of filtering submissions. In fact, a formal peer review system wasn’t introduced by Nature until 1967. Prior to that, some papers were refereed, but some weren’t, and many of Nature’s most famous papers were not refereed. Instead, it was up to editorial judgement to determine which papers should be published, and which papers should be rejected.

This was a common practice in the days before peer review became widespread: decisions about what to publish and what to reject were usually made by journal editors, often acting largely on their own. These decisions were often made rapidly, with papers appearing days or weeks after submission, after a cursory review by the editor. Rejection rates at most journals were low, with only obviously inappropriate or unsound material being rejected; indeed, for some Society journals, Society members even asserted a “right” to publication, which occasionally caused friction with unhappy editors (ref).

What caused the change to the modern system of near-ubiquitous peer review? There were three main factors. The first was the increasing specialization of science (ref). As science became more specialized in the early 20th century, editors gradually found it harder to make informed decisions about what was worth publishing, even by the relatively relaxed standards common in many journals at the time.

The second factor in the move to peer review was the enormous increase in the number of scientific papers being published (ref). In the 1800s and early 1900s, journals often had too few submissions. Journal editors would actively round up submissions to make sure their journals remained active. The role of many editorial boards was to make sure enough papers were being submitted; if the journal came up short, members of the editorial board would be asked to submit papers themselves. As late as 1938, the editor-in-chief of the prestigious journal Science relied on personal solicitations for most articles (ref).

The twentieth century saw a massive increase in the number of scientists, a much easier process for writing papers, due to technologies such as typewriters, photocopiers, and computers, and a gradually increasing emphasis on publication in decisions about jobs, tenure, grants and prizes. These factors greatly increased the number of papers being written, and added pressure for filtering mechanisms, such as peer review.

The third factor in the move to peer review (ref) was the introduction of technologies for copying papers. It’s just plain editorially difficult to implement peer review if you can’t easily make copies of papers. The first step along this road was the introduction of typewriters and carbon paper in the 1890s, followed by the commercial introduction of photocopiers in 1959. Both technologies made peer review much easier to implement.

Nowadays, of course, the single biggest factor preserving the peer review system is probably social inertia: in most fields of science, a journal that’s not peer-reviewed isn’t regarded as serious, and so new journals invariably promote the fact that they are peer reviewed. But it wasn’t always that way.

Myth number 2: Peer review is reliable

Update: Bill Hooker has pointed out that I’m using a very strong sense of “reliable” in this section, holding peer review to the standard that it nearly always picks up errors, is a very accurate gauge of quality, and rarely suppresses innovation. If you adopt a more relaxed notion of reliability, as many but not all scientists and members of the general public do, then I’d certainly back off describing this as a myth. As an approximate filter that eliminates or improves many papers, peer review may indeed function well.

Every scientist has a story (or ten) about how they were poorly treated by peer review – the important paper that was unfairly rejected, or the silly editor who ignored their sage advice as a referee. Despite this, many strongly presume that the system works “pretty well”, overall.

There’s not much systematic evidence for that presumption. In 2002 Jefferson et al (ref) surveyed published studies of biomedical peer review. After an extensive search, they found just 19 studies which made some attempt to eliminate obvious confounding factors. Of those, just two addressed the impact of peer review on quality, and just one addressed the impact of peer review on validity; most of the rest of the studies were concerned with questions like the effect of double-blind reviewing. Furthermore, for the three studies that addressed quality and validity, Jefferson et al concluded that there were other problems with the studies which meant the results were of limited general interest; as they put it, “Editorial peer review, although widely used, is largely untested and its effects are uncertain”.

In short, at least in biomedicine, there's not much we know for sure about the reliability of peer review. My searches of the literature suggest that we don't know much more in other areas of science. If anything, biomedicine seems to be unusually well served, in large part because several biomedical journals (perhaps most notably the Journal of the American Medical Association) have over the last 20 years put a lot of effort into building a community of people studying the effects of peer review; Jefferson et al's study is one of the outcomes from that effort.

In the absence of compelling systematic studies, is there anything we can say about the reliability of peer review?

The question of reliability should, in my opinion, really be broken up into three questions. First, does peer review help verify the validity of scientific studies; second, does peer review help us filter scientific studies, making the higher quality ones easier to find, because they get into the “best” journals, i.e., the ones with the most stringent peer review; third, to what extent does peer review suppress innovation?

As regards validity and quality, you don’t have to look far to find striking examples suggesting that peer review is at best partially reliable as a check of validity and a filter of quality.

Consider the story of the German physicist Jan Hendrik Schoen. In 2000 and 2001 Schoen made an amazing series of breakthroughs in organic superconductivity, publishing his 2001 work at a rate of one paper every 8 days, many in prestigious journals such as Nature, Science, and the Physical Review. Eventually, it all seemed a bit too good to be true, and other researchers in his community began to ask questions. His work was investigated, and much of it found to be fraudulent. Nature retracted seven papers by Schoen; Science retracted eight papers; and the Physical Review retracted six. What’s truly breathtaking about this case is the scale of it: it’s not that a few referees failed to pick up on the fraud, but rather that the refereeing system at several of the top journals systematically failed to detect the fraud. Furthermore, what ultimately brought Schoen down was not the anonymous peer review system used by journals, but rather investigation by his broader community of peers.

You might object to using this as an example on the grounds that the Schoen case involved deliberate scientific fraud, and the refereeing system isn't intended to catch fraud so much as it is to catch mistakes. I think that's a pretty weak objection – it can be a thin line between honest mistakes and deliberate fraud – but it's not entirely without merit. As a second example, consider an experiment conducted by the editors of the British Medical Journal (ref). They inserted eight deliberate errors into a paper already accepted for publication, and sent the paper to 420 potential reviewers. 221 responded, catching on average only two of the errors. None of the reviewers caught more than five of the errors, and 16 percent caught no errors at all.

None of these examples is conclusive. But they do suggest that the refereeing system is far from perfect as a means of checking validity or filtering the quality of scientific papers.

What about the suppression of innovation? Every scientist knows of major discoveries that ran into trouble with peer review. David Horrobin has a remarkable paper (ref) where he documents some of the discoveries almost suppressed by peer review; as he points out, he can’t list the discoveries that were in fact suppressed by peer review, because we don’t know what those were. His list makes horrifying reading. Here’s just a few instances that I find striking, drawn in part from his list. Note that I’m restricting myself to suppression of papers by peer review; I believe peer review of grants and job applications probably has a much greater effect in suppressing innovation.

  • George Zweig’s paper announcing the discovery of quarks, one of the fundamental building blocks of matter, was rejected by Physical Review Letters. It was eventually issued as a CERN report.
  • Berson and Yalow’s work on radioimmunoassay, which led to a Nobel Prize, was rejected by both Science and the Journal of Clinical Investigation. It was eventually published in the Journal of Clinical Investigation.
  • Krebs’ work on the citric acid cycle, which led to a Nobel Prize, was rejected by Nature. It was published in Experientia.
  • Wiesner’s paper introducing quantum cryptography was initially rejected, finally appearing well over a decade after it was written.

To sum up: there is very little reliable evidence about the effect of peer review available from systematic studies; peer review is at best an imperfect filter for validity and quality; and peer review sometimes has a chilling effect, suppressing important scientific discoveries.

At this point I expect most readers will have concluded that I don’t much like the current peer review system. Actually, that’s not true, a point that will become evident in my post about the future of peer review. There’s a great deal that’s good about the current peer review system, and that’s worth preserving. However, I do believe that many people, both scientists and non-scientists, have a falsely exalted view of how well the current peer review system functions. What I’m trying to do in this post is to establish a more realistic view, and that means understanding some of the faults of the current system.

Myth number 3: Peer review is the way we determine what's right and wrong in science

By now, it should be clear that the peer review system must play only a partial role in determining what scientists think of as established science. There's no sign, for example, that the lack of peer review in the 19th and early 20th century meant that scientists then were more confused than now about what results should be regarded as well established, and what should not. Nor does it appear that the unreliability of the peer review process leaves us in any great danger of collectively coming to believe, over the long run, things that are false.

In practice, of course, nearly all scientists understand that peer review is only part of a much more complex process by which we evaluate and refine scientific knowledge, gradually coming to (provisionally) accept some findings as well established, and discarding the rest. So, in that sense, this third myth isn’t one that’s widely believed within the scientific community. But in many scientists’ shorthand accounts of how science progresses, peer review is given a falsely exaggerated role, and this is reflected in the understanding many people in the general public have of how science works. Many times I’ve had non-scientists mention to me that a paper has been “peer-reviewed!”, as though that somehow establishes that it is correct, or high quality. I’ve encountered this, for example, in some very good journalists, and it’s a concern, for peer review is only a small part of a much more complex and much more reliable system by which we determine what scientific discoveries are worth taking further, and what should be discarded.


The economics of scientific collaboration

What economics can tell us about scientific collaboration

In this and several future posts I’m going to discuss what economics can tell us about scientific collaboration.

This may sound like a strange topic. Why should economics tell us anything interesting about scientific collaboration? Most discussions of economics are couched in terms of money, interest rates, prices, and so on. While these are relevant to science in a shallow, who’s-paying-for-this-lab-space kind of way, it’s not obvious we can learn anything deep about scientific collaboration by thinking in economic terms.

At a deeper level, though, economics is about understanding how human beings behave when one or more resources are scarce. How are those resources allocated? Are there more efficient ways they might be allocated? What tradeoffs are incurred?

There is a fundamental scarce resource in science, one whose allocation largely determines how science progresses. That scarce resource is expert attention. Who pays attention to what problems? How long do they spend on those problems? What institutional structures determine the answers to those questions? In short, what determines the architecture of scientific attention?

We can learn interesting things by thinking about these questions using ideas from economics. In this post I pull apart the way scientific collaboration works, and put it back together again within the conceptual framework economists use to understand free trade, using concepts like comparative advantage, opportunity cost, and markets. The reason I'm doing this is that the way we structure scientific attention is currently changing rapidly (by historical standards), as networked tools like wikis, blogs, twitter, email, online databases and friendfeed change the architecture of scientific attention. Understanding these changes is in part an economic problem, and the point of this post is to begin developing an economic perspective.

Comparative advantage, opportunity cost, and the benefits of free trade

Scientific collaboration can be viewed as a type of trade in expert attention. I can, for example, trade some of my skill as a theoretical physicist for someone else’s skills as a computational physicist, enabling us to jointly write a paper neither of us could have written alone.

To understand this collaboration-as-trade perspective, let's review some ideas about trade in the context where trade is most often discussed, namely, free trade of goods. We'll start with a beautiful simplified model of free trade, a model that goes back to a famous 1817 book "On the Principles of Political Economy and Taxation", by the economist David Ricardo. Like many useful models, it leaves out a lot that's relevant to the real world, but it does capture an essential element of the world, and we can learn a great deal by thinking about the model. In particular, the model demonstrates vividly why all parties involved in free trade can benefit, and is one of the main reasons most economists strongly support free trade.

(A small digression: there’s a family connection in this post, since David Ricardo was my Great-Great-Great-Great-Great-Uncle.)

Here’s the model. Imagine there are just two people in the world, Alice the farmer, and Bob the car technician. Alice is good at producing potatoes, but not much good at assembling cars, while Bob is good at assembling cars, but not so good at producing potatoes. Pretty obviously, both Alice and Bob can benefit if Alice concentrates on producing potatoes, Bob concentrates on assembling cars, and they then trade potatoes for cars. While this is intuitively clear, it’s worth making precise with more concrete details. Let’s suppose the effort required for Alice to assemble a car is equal to the effort she requires to produce 20 tonnes of potatoes. Put another way, each car she assembles has an opportunity cost of 20 tonnes of potatoes, since that’s how much assembling a car will cost her in lost production of potatoes. Similarly, suppose the effort for Bob to assemble a car is equal to the effort he requires to produce 2 tonnes of potatoes. That is, each car has an opportunity cost for Bob of just 2 tonnes of potatoes.

In this situation, Bob has a comparative advantage over Alice in the production of cars, because Bob’s opportunity cost for producing cars is less than Alice’s. Equivalently, Alice has a comparative advantage over Bob in the production of potatoes, for her opportunity cost to produce a tonne of potatoes is 1/20th of a car, which is less than Bob’s opportunity cost of half a car.

Suppose Alice and Bob each concentrate in the areas where they have a comparative advantage, i.e., Alice concentrates on producing potatoes, and Bob concentrates on building cars. They then trade potatoes for cars. Both Alice and Bob benefit if the rate at which they trade is greater than 2 and less than 20 tonnes of potatoes per car, because they both will end up with more cars and potatoes than either could have produced on their own. Furthermore, the greater the comparative advantage, the more both parties benefit. Put another way, the more people specialize, the more possible benefit there is in free trade.

It’s worth stressing that the key factor determing the benefits of trade is comparative advantage, not Alice and Bob’s absolute abilities. It might be, for example, that Bob is a layabout who’s lousy both at assembling cars and producing potatoes. Perhaps he’s only capable of assembling one car (or producing 2 tonnes of potatoes) for every ten days of labour. Alice, despite being a farmer, might actually be better than layabout-Bob at assembling cars, producing one car (or twenty tonnes of potatoes) for every 5 days of labour. Even though Alice has an absolute advantage in producing both cars and potatoes, she and Bob are both better off if they concentrate on the areas where they have a comparative advantage, and then trade. Although this example is contrived, it has many implications in the real world. For example, differences in education and infrastructure mean that people in different countries often have enormous differences in their absolute ability to produce goods. Despite this, people in both countries may still benefit from trade if they all concentrate on areas where they have a comparative advantage.
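As a sanity check on the layabout-Bob example, here is a small sketch using the labour figures just given. The trading rate of 10 tonnes per car and the target bundle (one car plus ten tonnes of potatoes for each person) are assumptions chosen only for illustration, and the function names are mine.

    # Days of labour each person needs per unit of output (from the example above).
    DAYS_PER_CAR = {"Alice": 5.0, "Bob": 10.0}
    DAYS_PER_TONNE = {"Alice": 5.0 / 20, "Bob": 10.0 / 2}   # 0.25 and 5 days per tonne

    RATE = 10.0   # assumed trading rate: tonnes of potatoes per car

    def days_self_sufficient(person, cars, tonnes):
        """Days of labour to produce the whole bundle on one's own."""
        return cars * DAYS_PER_CAR[person] + tonnes * DAYS_PER_TONNE[person]

    def days_specialise_and_trade(person, cars, tonnes):
        """Days of labour to obtain the same bundle by producing only the good
        in which the person has a comparative advantage, then trading."""
        if person == "Alice":   # grows potatoes, buys cars at RATE tonnes each
            return (tonnes + cars * RATE) * DAYS_PER_TONNE[person]
        else:                   # builds cars, buys potatoes at 1/RATE cars per tonne
            return (cars + tonnes / RATE) * DAYS_PER_CAR[person]

    for person in ("Alice", "Bob"):
        print(person,
              days_self_sufficient(person, cars=1, tonnes=10),
              days_specialise_and_trade(person, cars=1, tonnes=10))
    # Alice: 7.5 days alone vs 5.0 days via trade
    # Bob:  60.0 days alone vs 20.0 days via trade

Alice is better at both goods in absolute terms, yet both she and Bob obtain the bundle with fewer days of labour by specializing and trading.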

This is all pretty simple, but it’s not universally understood. Much anti-business rhetoric assumes a zero-sum world in which evil captains of industry exploit the defenseless poor, i.e., if one person benefits from a transaction, the other person must lose. Very often, that’s a bad assumption. Good businesspeople look for transactions where both parties benefit; wouldn’t you prefer doing business with enthusiastic trading partners, rather than people who feel coerced or exploited? Of course, sometimes unethical businesspeople do coerce their trading partners, and sometimes trade between two parties can damage a third – environmental issues like pollution often have this nature. But Ricardo’s model is a good starting point to understand how free trade can work to the benefit of all parties.

Markets as a mechanism for aggregating information about comparative advantage

One question not answered in Ricardo’s model is how the trading rate is set. At what rate between 2 and 20 tonnes of potatoes per car should Alice and Bob trade? There are many possible ways to set the rate. In our society, the standard way is to use money as a medium of exchange, with markets determining the price of the relevant goods.

Let’s suppose Alice and Bob participate in such a market, and that the market price is 10,000 dollars per car, and 1,000 dollars per tonne of potatoes. The market thus provides a mechanism by which Alice and Bob can effectively trade cars for potatoes at a rate of one car for ten tonnes of potatoes. This is within the range where it is beneficial for both of them to trade, and so both may enter the market.

What if, instead, the market price was 5,000 dollars for a car, and 5,000 dollars for a tonne of potatoes? Then the effective trading rate is one car for one tonne of potatoes. Bob will be worse off if he sells cars in this market: each car he sells fetches only one tonne’s worth of potatoes, while building it cost him the effort of growing two tonnes. The result is that Bob will withdraw from the car market, reducing the supply of cars. That will nudge the market price of cars up a little, although one producer’s exit probably won’t move the price enough for Bob to re-enter. But if enough people withdraw, then the price of cars will go up a lot, and it will make sense for Bob to re-enter.
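Here is a minimal sketch of that entry-or-exit logic, using the opportunity costs from Ricardo’s example. Treating each person as simply choosing between “sell my specialty at the going prices” or “stay out of that market” is, of course, a simplification, and the function names are mine.

    def effective_rate(price_car, price_potato):
        """Tonnes of potatoes one car is effectively worth at the going prices."""
        return price_car / price_potato

    def bob_sells_cars(price_car, price_potato, oc_bob=2.0):
        # Bob sells cars only if a car fetches more potatoes than it costs him to forgo.
        return effective_rate(price_car, price_potato) > oc_bob

    def alice_sells_potatoes(price_car, price_potato, oc_alice=20.0):
        # Alice sells potatoes only if buying a car costs fewer potatoes
        # than assembling one herself would.
        return effective_rate(price_car, price_potato) < oc_alice

    print(effective_rate(10_000, 1_000),
          bob_sells_cars(10_000, 1_000), alice_sells_potatoes(10_000, 1_000))
    # 10.0 True True   -> both enter the market
    print(effective_rate(5_000, 5_000),
          bob_sells_cars(5_000, 5_000), alice_sells_potatoes(5_000, 5_000))
    # 1.0 False True   -> Bob withdraws from the car market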

Money and markets thus serve several purposes. First, the market determines the price of different goods, and thus effectively sets exchange rates between different goods.

Second, the market price automatically aggregates information about comparative advantage, because the people who enter the market are those with a comparative advantage large enough that they can benefit from being in the market. People with a smaller comparative advantage have no reason to do so.

Third, while it’s possible to set up a barter market without the use of money, it’s obviously a great deal more efficient to use money as an intermediary, since for each type of good in the market, we need only keep track of a single price, rather than exchange rates with all the other types of good.

In fact, digressing briefly, it’s possible to prove that in an efficient barter market, an effective currency does emerge. By efficient, I mean that it’s not possible to increase your holdings by conducting a series of trades in immediate succession, e.g., by trading one ox for two cows, the two cows for one horse, and then the horse for two oxen. If this kind of trade is impossible, then it’s possible to just fix on one type of good – say, cows – as the effective unit of commerce, like the dollar, and peg all trades to that unit. From there it’s a small step to forgo the cows, introducing an abstract entity (i.e., money) to replace them. Furthermore, it’s reasonable to argue that you’d expect efficiency in this kind of market; if the market was inefficient in the way described above, then you’d expect one of the intermediaries in the transaction to realize it, and raise their prices, and so smooth away the inefficiency.
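The “efficiency” condition here is just the absence of profitable round trips. A minimal sketch of the check, using the goods and rates from the ox-and-cows example (the representation of rates as units received per unit given, and the helper name, are my own choices):

    # rates[(give, get)] = units of `get` received per unit of `give`
    rates = {
        ("ox", "cow"): 2.0,
        ("cow", "horse"): 0.5,
        ("horse", "ox"): 2.0,
    }

    def round_trip_gain(rates, cycle):
        """Multiply the exchange rates around a cycle of goods. A product
        greater than 1 means the trades increase your holdings, i.e., the
        barter market is inefficient in the sense described above."""
        gain = 1.0
        for give, get in zip(cycle, cycle[1:] + cycle[:1]):
            gain *= rates[(give, get)]
        return gain

    print(round_trip_gain(rates, ["ox", "cow", "horse"]))   # 2.0: one ox becomes two oxen

In an efficient market every such round trip multiplies out to exactly 1, which is what makes it possible to peg every good to a single unit, such as cows or dollars.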

It’s remarkable how effective the market is at aggregating information about comparative advantage in this way. It lets us all take advantage of the combined efforts of millions of individuals, most doing tasks for which they have a considerable comparative advantage. Think about the number of people involved in producing a laptop computer. Tens or hundreds of thousands of people participated directly in designing and producing the components in that laptop; most of those people had considerable (in some cases, enormous) comparative advantage in the skills they contributed. When you buy a laptop, your few hundred dollars buys you the accumulated wisdom from a design history of billions of hours, stretching all the way back to the earliest computers. Beyond that, hundreds of millions of people contribute capital (e.g., via retirement funds) used to build infrastructure like chip foundries. Chances are that anywhere between a few dollars and a few hundred dollars from your retirement fund was invested in the chip foundry that produced the processor for the computer that I’m typing these words on. We’re both benefiting right now from differences in comparative advantage.

By providing a way of identifying and exploiting comparative advantage, markets encourage people to specialize, creating even greater disparities in comparative advantage, and thus producing more mutual benefit. The better the market operates, the stronger this feedback effect becomes. Although it’s currently fashionable to bash markets (and economists), in fact many technologies we take for granted – cars, airliners, computers, telecommunications – would be nearly impossible without the modern market infrastructure.

Comparative advantage and scientific collaboration

Let’s construct a simple model of scientific collaboration inspired by Ricardo’s model of free trade. The model is, of course, a great oversimplification of how collaboration works; the point isn’t to capture the reality of collaboration exactly, but rather to illuminate some elements.

We’ll imagine two people, Alice and Bob, a physicist and a chemist, respectively. Alice is working on a problem in physics, but as she works an unanticipated problem arises, let’s say in chemistry. Let’s suppose for the sake of argument that the problem requires 100 hours of straightforward physics to solve, and 10 hours of straightforward chemistry. (The real numbers in most significant scientific problems are probably larger, but these numbers make the discussion below a little easier to read.) Unfortunately, Alice isn’t much of a chemist, and it would take her 200 hours to do the chemistry part of the problem, mostly spent bumbling around learning the required material. Alternatively, if Bob got involved in the project, he could solve the chemistry problem in just 10 hours.

There are two scenarios here. In the first, Alice does all the work; it takes her 300 hours, and she gets all the credit for the resulting paper. In the second, Alice does 100 hours of work, Bob does 10 hours of work, and they split the credit. Let’s say Alice ends up as first author on a paper describing the work and Bob as second author, and let’s further say that Alice gets two thirds of the credit as a result, and Bob one third.

Per hour worked, Alice is much better off in the collaborative scenario, getting two thirds of the reward for only one third of the effort. Bob is probably also better off, although the reason is more subtle: if Bob entered the collaboration freely, then it was presumably because Bob felt this was the best use of his time. This is not always the case – if Bob works for Alice he may have to do the work (or find another job), even though he’d do better science if he concentrated on other projects. This is a case where the trade is not completely free, but rather there is coercion. We’ll assume, though, that no coercion is involved, and that both parties benefit.

Let’s fill the model out a little more. Imagine that Bob’s alternative to collaboration is to go off and write straight-up chemistry papers on his own, taking 110 hours per paper and getting full credit for each. He is still better off working with Alice, for he gets one third of the credit for only 10 hours’ worth of work. Both Alice and Bob benefit, just as in Ricardo’s model.

Another similarity to Ricardo’s model is that it is comparative, not absolute, advantage which is critical. Let’s imagine Bob is actually a beginning chemistry student, and takes 100 hours to complete the work Alice needs done. He’s still better off working with Alice than working on his own, for on his own it would take 1100 hours to write a chemistry paper. Furthermore, Alice is still better off working with Bob than on her own, for the time she saves on doing chemistry is time she can put to work doing physics.
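Putting numbers on these scenarios makes the comparisons easy to check. A minimal sketch, with credit measured as a fraction of one paper and the splits and hours being the ones assumed above:

    def credit_per_hour(credit, hours):
        return credit / hours

    scenarios = {
        "Alice, solo":                credit_per_hour(1.0, 300),    # ~0.0033
        "Alice, collaborating":       credit_per_hour(2 / 3, 100),  # ~0.0067
        "Bob, collaborating":         credit_per_hour(1 / 3, 10),   # ~0.0333
        "Bob, solo chemistry":        credit_per_hour(1.0, 110),    # ~0.0091
        "Student Bob, collaborating": credit_per_hour(1 / 3, 100),  # ~0.0033
        "Student Bob, solo":          credit_per_hour(1.0, 1100),   # ~0.0009
    }

    for name, value in scenarios.items():
        print(f"{name}: {value:.4f} credit/hour")

In each pairing the collaborative rate beats the solo rate, which is the sense in which Alice, Bob, and even student Bob all come out ahead.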

As an illustration of these ideas in a different context, consider the way many supervisors work with students and postdocs. The supervisors suggest problems, reading materials, likely angles of attack, and so on – all areas in which their experience gives them an enormous absolute advantage, and a sizable comparative advantage. The students do the detailed work in the lab. Many supervisors will have an absolute advantage in such lab work, but it is likely to be much smaller, and so the student likely has a comparative advantage in doing such work. Any time the supervisor spends on detailed lab work carries an opportunity cost: time lost suggesting problems, reading materials, and the like to another student.

An important difference between this model and Ricardo’s lies in the way we define the benefit to the parties involved. In Ricardo’s model, the benefit is entirely intrinsic: Alice and Bob both want cars and potatoes. In the scientific case, the parties have no intrinsic desire for “expert attention”. Rather, the benefit lies in the reputational credit derived from publications. This difference complicates the analysis of when collaboration is worthwhile. Instead of a simple trading rate, one must consider the way in which reputational credit is split. It is the ratio of this split to the opportunity cost that determines when it makes sense to collaborate. If Alice got 95 percent of the credit and Bob only 5 percent, it would obviously not be in Bob’s interest to collaborate. In a future post, I’ll address this more fully, as well as many other aspects of this model.
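One way to make that criterion explicit: a collaborator should join when their share of the credit, per hour contributed, beats the credit per hour of their best alternative. A minimal sketch, using Bob’s numbers from above and the hypothetical 95/5 split (the function name is mine):

    def worth_collaborating(credit_share, hours_contributed, alternative_credit_per_hour):
        """True when the collaborator earns credit faster inside the
        collaboration than in their best alternative use of the time."""
        return credit_share / hours_contributed > alternative_credit_per_hour

    bob_alternative = 1.0 / 110   # solo chemistry papers: full credit per 110 hours

    print(worth_collaborating(1 / 3, 10, bob_alternative))   # True:  a third of the credit for 10 hours
    print(worth_collaborating(0.05, 10, bob_alternative))    # False: 5 percent of the credit isn't enough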

For now, let me simply point out the relative lack of mechanisms science has for aggregating information about comparative advantage. Mostly, we do it by word of mouth and personal connection, the same way our ancestors traded goods, and so we don’t get the advantages that come from modern markets.

There are good reasons it’s difficult to set up efficient collaboration markets in expert attention. Creative work often poses highly specialized one-off problems, quite unlike the commodities traded in most markets. Until recently, markets in such specialized goods were relatively uncommon and rather limited, even in the realm of physical goods. That has changed, with online markets such as eBay showing that it is possible to set up highly specialized markets, provided suitable search and reputational tools are in place.

To the extent such collaboration markets do exist in science, they still operate very inefficiently compared with markets for trade in goods. There are considerable trust barriers that inhibit trading relationships from being set up. There is no medium of exchange (cf. the posts by Shirley Wu and Cameron Neylon on this topic). The end result is that mechanisms for identifying and aggregating comparative advantage are downright primitive compared with markets for physical goods.

Perhaps the best existing examples of collaboration markets occur in the open source programming community. No single model is used throughout that community, but for many open source projects the basic model is to set up one or more online fora (email discussion lists, wikis, bug-tracking software, etcetera) which are used to co-ordinate activity. The fora are used to advertise problems people are having, such as bugs they’d like fixed, or features they’d like added. People then volunteer to solve those problems, with self-selection ensuring that work is most often done by people with a considerable comparative advantage. These fora thus act as a simple mechanism for aggregating information about comparative advantage. While that mechanism is primitive compared with modern markets, the success of open source is impressive, and such mechanisms for aggregating information about comparative advantage in expert attention will no doubt improve.

Let me conclude with a question that’s still puzzling me. As I mentioned before, markets have creative power: without them, it’s unlikely that sophisticated goods like laptops and aeroplanes could exist. I’d like to better understand whether more efficient collaboration markets can cause a similar shift in what scientific problems can be solved. Might scientific problems now regarded as out of reach become accessible with more effective ways of structuring scientific attention?