There is no single future for scientific journals

A question I sometimes hear, and find odd, is “What’s the future of scientific journals?” Often – not always, but often – underlying the question is a presumption that there is a single future for journals. The point of view seems to be that we’ve had journals in the past, and now we have this interesting new medium – the internet – so the big question is how journals are going to evolve, or (if slightly more ambitious) what we’re going to replace them with.

This seems to me a peculiar point of view. The origin of the point of view seems to be the fact that paper is a static, relatively inflexible medium. There’s only a limited number of things you can do with paper and a printing press, so scientific publishing to date has ended up concentrated in just a few forms (journals, monographs, textbooks, and a few others). This monolithic character leads to a presumption that scientific communication will continue to evolve in a monolithic way.

The problem with this point of view is that computers and the network are extraordinarily flexible. If you believe AI enthusiasts, computers will eventually end up smarter than us, along pretty much every axis. Imagine a medium that’s smarter, more flexible, and faster than us. What could it be used to do? Of course, the dreams of the AI enthusiasts are quite some ways off. But even now, the internet is an extraordinarily flexible medium. Paper can’t even begin to compare: we’re talking about a single medium that supports World of Warcraft, Intellipedia (collaborative data sharing for spooks), and flash mobs for pillow fighters. We’re not going to have a single future for scientific journals; asking what THE scientific journal of the future will be makes no more sense than asking a programmer what THE program of the future will be. What we will have instead is an increasing number of ways of sharing scientific information, and, in many cases, of doing science. We’re seeing signs of this fragmentation already, from video journals to slide sharing services to all sorts of databases.

There will, of course, be some concentration in particular formats and platforms. Network effects in science are strong – we don’t make discoveries alone, we make them as part of a larger culture of discovery! – and this will drive the broad adoption of shared platforms (and, for that matter, of open standards). But there’s no reason at all to think that there will be just a single platform or standard, not when there’s so much to be gained from multiple approaches.

I should make it clear that I think journals will play a role in all of this. There’s a great deal to be said for having a narrative to explain a new discovery. But we should expect a gradual proliferation in formats and platforms, and (inevitably) conventional journal articles will recede to being just one of many ways new science is communicated. If that doesn’t happen, then we’re failing to take proper advantage of this new medium. Taking advantage of it is what I think successful scientific publishers will do in the future. They’ll be the ones who create the platforms and standards scientists use to communicate science, and, in many cases, to actually do science. But scientific journals don’t have a single future.

Links

Konrad Forstner has a very interesting talk on what he sees as the future of scientific communication.

Nature runs a terrific blog, Nascent, which has frequent discussions of the future of science and scientific communication. Most scientific publishers have their head in the sand about the web. Nature, however, is innovating and experimenting in really interesting ways.

A few more: The Coming Revolution in Scholarly Communications & Cyberinfrastructure, an open access collection of articles by people such as Paul Ginsparg (of arxiv.org), Timo Hannay (Nature), Tony Hey (Microsoft), and many others.

An interesting report by Jon Udell on the use of the web for scientific collaboration. It’s a bit dated in some ways, but in other ways remains very fresh.

Kevin Kelly (founding editor of Wired) speculating on the future of science.

The Django Book, which is a nice example of a book (now published, I believe) that was developed in a very open style, with a web-based commenting system used to provide feedback to the authors as the book was written. I thought about doing something similar with my current book, but concluded that I don’t write in a linear enough style to make it feasible.

An article on open source science from the Harvard Business School.

Fullcodepress, a 24-hour event that’s happening in Sydney as I write. It’s a very cool collaborative project, where two teams are competing to build a fully functional website for a non-profit in 24 hours. Similar in concept to the Startup Weekends that are now springing up all over the place. What, exactly, can a group of human beings achieve when they come together and co-operate really intensively for 24 or 48 hours? Surprisingly much, seems to be the answer.

A thoughtful essay on the problems associated with all the social data people are now putting on the web. Starts from the (common) observation that it would be a lot more useful if it were more publicly available rather than locked up in places like Flickr, Amazon, Facebook, etc, and then makes many insightful observations about how to move to a more open system.

How to read a blog. This is a riff on one of my all-time favourite books, How to read a book, by Mortimer Adler.

Micropublication and open source research

This is an extract from my (very early) draft book on the way the internet is changing how science is done.

I would like to legitimize a new kind of proof: `The Incomplete Proof’. The reason that the output of mathematicians is so meager is that we only publish that tiny part of our work that ended up in complete success. The rest goes to the recycling bin. […] Why did [the great mathematician] Paul Cohen stop publishing at the age of 30? My guess is that he was trying, and probably still is, to prove [the Riemann Hypothesis]. I would love to be able to see his `failed attempts’. […] So here is my revolutionary proposal. Publish all your (good) thoughts and ideas, regardless of whether they are finished or not.
Doron Zeilberger

Imagine you are reading a research article. You notice a minor typo in the article, which you quickly fix, using a wiki-like editing system to create a new temporary “branch” of the article – i.e., a copy of the article, but with some modifications that you’ve made. The original authors of the article are notified of the branch, and one quickly contacts you to thank you for the fix. The default version of the article is now updated to point to your branch, and your name is automatically added to a list of people who have contributed to the article, as part of a complete version history of the article. This latter information is also collected by an aggregator which generates statistics about contributions, statistics which you can put into your curriculum vitae, grant applications, and so on.

Later on while reading, you notice a more serious ambiguity, an explanation that could be interpreted in several inconsistent ways. After some time, you figure out which explanation the authors intend, and prepare a corrected version of the article in a temporary branch. Once again, the original authors are notified. Soon, one contacts you with some queries about your fix, pointing out some subtleties that you’d failed to appreciate. After a bit of back and forth, you revise your branch further, until both you and the author agree that the result is an improvement on both the original article and on your first attempt at a branch. The author-approved default version of the article is updated to point to the improved version, and you are recognized appropriately for your contribution.

Still later, you notice a serious error in the article – maybe a flaw in the logic, or a serious error of omission material to the argument – which you don’t immediately see how to fix. You prepare a temporary branch of the article, but this time, rather than correcting the error, you insert a warning explaining the existence and the nature of the error, and how you think it affects the conclusions of the article.

Once again, the original authors are notified of your branch. This time they aren’t so pleased with your modifications. Even after multiple back and forth exchanges, and some further revisions on your part, they disagree with your assessment that there is an error. Despite this, you remain convinced that they are missing your point.

Believing that the situation is not readily resolvable, you create a more permanent branch of the article. Now there are two branches of the article visible to the public, with slightly differing version histories. Of course, these version histories are publicly accessible, and so who contributed what is a matter of public record, and there is no danger that there will be any ambiguity about the origins of the new material, nor about the origin of the disagreement between the two branches.

Initially, most readers look only at the original branch of the article, but a few look at yours as well. Favourable commentary and a gradual relative increase in traffic to your branch (made suitably visible to potential readers) encourage still more people to read your version preferentially. Your branch gradually becomes more highly visible, while the original fades. Someone else fixes the error you noticed, leading to your branch being replaced by a still further improved version, and still more traffic. After some months, reality sets in and the original authors come around to your point of view, removing their original branch entirely, leaving just the new improved version of the article. Alternatively, perhaps the original authors, alarmed by their diminution, decide to strike back with a revised version of their article, explaining in detail why you are wrong.

These stories illustrate a few uses of micropublication and open source research. These are simple ideas for research publication, but ones that have big consequences. The idea of micropublication is to enable publication in smaller increments and more diverse formats than in the standard scientific research paper. The idea of open source research is to open up the licensing model of scientific publication, providing more flexible ways in which prior work can be modified and re-used, while ensuring that all contributions are fully recognized and acknowledged.
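To make the branching stories above a little more concrete, here is a minimal sketch of how an article with branches, a default version, and a contribution record might be modelled. Everything in it is an assumption made for illustration: the class names (`Revision`, `Branch`, `Article`), the method names, and the idea of counting one contribution per revision are all hypothetical, and a real system would more likely sit on top of an existing version control tool.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Revision:
    author: str
    text: str
    note: str


@dataclass
class Branch:
    name: str
    revisions: list = field(default_factory=list)

    def head(self):
        # The most recent revision on this branch.
        return self.revisions[-1]


class Article:
    """Hypothetical model: an article is a set of named branches."""

    def __init__(self, author, text):
        self.branches = {"main": Branch("main", [Revision(author, text, "original")])}
        self.default = "main"

    def fork(self, source, new_name):
        # A new branch starts with a copy of the source branch's history.
        copied = list(self.branches[source].revisions)
        self.branches[new_name] = Branch(new_name, copied)

    def commit(self, branch, author, text, note):
        self.branches[branch].revisions.append(Revision(author, text, note))

    def accept(self, branch):
        # The original authors approve a branch, so it becomes the default.
        self.default = branch

    def contributions(self):
        # One count per revision on the default branch, keyed by author.
        return Counter(r.author for r in self.branches[self.default].revisions)


# The typo-fix story: a reader forks, fixes, and the authors accept.
article = Article("A. Author", "Teh result follows.")
article.fork("main", "typo-fix")
article.commit("typo-fix", "R. Reader", "The result follows.", "fix typo")
article.accept("typo-fix")
stats = article.contributions()
```

The disputed-error story corresponds to simply never calling `accept`: both branches stay in `article.branches`, each with its own public history.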

Let’s examine a few more potential applications of micropublication and open source research.

Imagine you are reading an article about the principles of population control. As you read, you realize that you can develop a simulator which illustrates in a vivid visual form one of the main principles described in the article, and provides a sandbox for readers to play with and better understand that principle. After dropping a (favourably received) note to the authors, and a little work, you’ve put together a nice simulation. After a bit of back and forth with the authors, a link to your simulation is now integrated into the article. Anyone reading the article can now click on the relevant equation and will immediately see your simulation (and, if they like, the source code). A few months later, someone takes up your source code and develops the simulation further, improving the reader experience still further.

Imagine reading Einstein’s original articles on special relativity, and being able to link directly to simulations (or, even better, fully-fledged computer games) that vividly demonstrate the effects of length contraction, time dilation, and so on. In mathematical disciplines, this kind of content enhancement might even be done semi-automatically. The tools could gradually integrate the ability to make inferences and connections – “The automated reasoning software has discovered a simplification of Equation 3; would you like to view the simplification now?”

Similar types of content enhancement could, of course, be used in all disciplines. Graphs, videos, explanations, commentary, background material, data sets, source code, experimental procedures, links to Wikipedia, links to other related papers, links to related pedagogical materials, talks, media releases – all these and more could be integrated more thoroughly into research publishing. Furthermore, rather than being second-class add-ons to “real” research publications, a well-designed citation and archival system would ensure that all these forms have the status of first-class research publications, raising their stature, and helping ensure that people put more effort into adding value in these ways.

Another use for open source research is more pedagogical in flavour. Imagine you are a student assigned to rewrite Einstein’s article on general relativity in the language of modern differential geometry. Think of the excitement of working with the master’s original text, fully inhabiting it, and then improving it still further! Of course, such an assignment is technologically possible even now. However, academia has strong cultural inhibitions against making such modifications to original research articles. I will argue that with properly authenticated archival systems these issues could be addressed, the inhibitions could be removed, and a world of new possibilities opened up.

Having discussed micropublication and open source research in concrete terms, let’s now describe them in more abstract terms, and briefly discuss some of the problems that must be overcome if they are to become viable modes of publication. More detailed resolutions to these problems will be discussed in a later post.

Micropublication does three things. First, it decreases the size of the smallest publishable unit of research. Second, it broadens the class of objects considered as first-class publishable objects so that it includes not just papers, but also items such as data, computer code, simulations, commentary, and so on. Third, it eliminates the barrier of peer review, a point we’ll come back to shortly. The consequence is to greatly reduce the friction slowing down the progress of the research community, by lowering the barriers to publication. Although promising, this lowering of the barriers to publication also creates three problems that must be addressed if the research community is to adopt the concept of micropublication.

The first problem is providing appropriate recognition for people’s contributions. This can be achieved through appropriate archival and citation systems, and is described in detail in a later post.

The second problem is quality assurance. The current convention in science is to filter content before publishing it through a system of peer review. In principle, this ensures that only the best research gets published in the top journals. While this system has substantial failures in practice, on the whole it has improved our access to high-quality research. To ensure similar quality, micropublication must use a publish-then-filter model which enables the highest quality research to be accurately identified. We will discuss the development of such filtering systems in a later post. Note, however, that publish-then-filter already works surprisingly well on the web, due to tools such as Google, which is capable of picking out high value webpages. Such filtering systems are far from perfect, of course, and there are serious obstacles to be overcome if this is to be a successful model.
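As one illustration of what publish-then-filter could look like, here is a small sketch that scores already-published items by the links pointing at them, in the spirit of link-based web ranking. This is an assumption chosen for illustration, not the filtering system discussed in later posts: the function name `rank`, the damping value, and the example items are all hypothetical.

```python
def rank(links, damping=0.85, iterations=50):
    """Score items by incoming links, using a simple power iteration.

    `links` maps each item to the list of items it cites or links to.
    """
    items = set(links) | {t for targets in links.values() for t in targets}
    n = len(items)
    score = {item: 1.0 / n for item in items}
    for _ in range(iterations):
        new = {item: (1.0 - damping) / n for item in items}
        for src in items:
            targets = links.get(src, [])
            if targets:
                # Split this item's score evenly among the items it links to.
                share = damping * score[src] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # An item that links nowhere spreads its score over everything.
                share = damping * score[src] / n
                for item in items:
                    new[item] += share
        score = new
    return score


# Hypothetical micropublications: two notes and a comment all cite a dataset.
links = {
    "note_a": ["dataset"],
    "note_b": ["dataset"],
    "comment": ["note_a", "dataset"],
    "dataset": [],
}
scores = rank(links)
best = max(scores, key=scores.get)
```

Nothing is rejected before publication here; the filter only reorders what has already been published, which is the essential difference from refereeing.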

The third problem is providing tools to organize and search through the mass of publication data. This is, in some sense, the flip side of the quality assurance problem, since it is also about organizing information in meaningful and useful ways, and there is considerable overlap in how these tools must work. Once again, we will discuss the development of these tools in a later post.

Open source research opens up the licensing model used in research publication so that people may make more creative reuse of existing work, and thus speed the process of research. It removes the cumbersome quote-and-cite licensing model in current use in science. That model makes sense if one is publishing on paper, but is not necessary in electronic publication. Instead, it is replaced by a trustworthy authenticated archive of publication data which allows one to see an entire version history of a document, so that we can see who contributed what and when. This will allow people to rapidly improve, extend and enhance other people’s work, in all the ways described above.
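One way such an archive could make its version history tamper-evident is to chain each entry to the previous one with a cryptographic hash, so that any later alteration of who contributed what, or when, is detectable. The sketch below assumes that approach purely for illustration; the function names and record fields are hypothetical, and it does not attempt real authentication of authors (digital signatures, for instance), which a trustworthy archive would also need.

```python
import hashlib
import json


def entry_hash(body):
    # Hash a canonical JSON encoding of the entry's contents.
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append(history, author, text, timestamp):
    # Each entry records the hash of the one before it.
    prev = history[-1]["hash"] if history else ""
    body = {"author": author, "text": text, "timestamp": timestamp, "prev": prev}
    history.append({**body, "hash": entry_hash(body)})


def verify(history):
    # Recompute every hash and check that the chain is unbroken.
    prev = ""
    for record in history:
        body = {k: v for k, v in record.items() if k != "hash"}
        if body["prev"] != prev or entry_hash(body) != record["hash"]:
            return False
        prev = record["hash"]
    return True


history = []
append(history, "A. Author", "Original text.", "2007-08-01T09:00:00Z")
append(history, "R. Reader", "Corrected text.", "2007-08-02T14:30:00Z")
intact = verify(history)

# Rewriting the record of who wrote the first version breaks the chain.
history[0]["author"] = "Someone Else"
still_valid = verify(history)
```

With a record like this, reuse can be informal without credit becoming ambiguous: the question of who wrote which version is settled by the archive rather than by quotation marks.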

Academics have something of a horror of the informal re-use that I may appear to be advocating. The reason is that the principal currency of research is attention and reputation, not (directly) money. In such a system, not properly citing sources is taken very seriously; even very illustrious researchers have fallen from grace over accusations of plagiarism. For these reasons, it is necessary to design the archival system carefully to ensure that one can gain the benefits of a more informal licensing model, while still adequately recognizing people’s contributions.

Overarching and unifying all these problems is one main problem, the problem of migration, i.e., convincing researchers that it is in their best interest to move to the new system. How can this possibly be achieved? The most obvious implementations of micropublication and open source research will require researchers to give up their participation in the standard recognition system of science — the existing journal system. Such a requirement will undoubtedly result in the migration failing. Fortunately, I believe it is possible to find a migratory path which integrates and extends the standard recognition system of science in such a way that researchers have only positive incentives to make the migration. This path does not start with a single jump to micropublication and open source research, but rather involves a staged migration, with each stage integrating support for legacy systems such as citation and peer review, but also building on new systems that can take the place of the legacy systems, and which are better suited for the eventual goals of micropublication and open source research. This process is quite flexible, but involves many separate ideas, which will be described in subsequent posts.

The Future of Science

How is the web going to impact science?

At present, the impact of the web on science has mostly been to make access to existing information easier, using tools such as online journals and databases such as the ISI Web of Knowledge and Google Scholar. There have also been some interesting attempts at developing other forms of tools, although so far as I am aware none of them have gained a lot of traction with the wider scientific community. (There are signs of exceptions to this rule on the horizon, especially some of the tools being developed by Timo Hannay’s team at Nature.)

The contrast with the internet at large is striking. Ebay, Google, Wikipedia, Facebook, Flickr and many others are new types of institution enabling entirely new forms of co-operation. Furthermore, the rate of innovation in creating such new institutions is enormous, and these examples only scratch the surface of what will soon be possible.

Over the past few months I’ve drafted a short book on how I think science will change over the next few years as a result of the web. Although I’m still revising and extending the book, over the next few weeks I’ll be posting self-contained excerpts here that I think might be of some interest. Thoughtful feedback, argument, and suggestions are very welcome!

A few of the things I discuss in the book and will post about here include:

  • Micropublication: Allowing immediate publication in small incremental steps, both of conventional text, and in more diverse media formats (e.g. commentary, code, data, simulations, explanations, suggestions, criticism and correction). All are to be treated as first-class, fully citable publications, creating an incentive for people to contribute far more rapidly and in a wider range of ways than is presently the case.
  • Open source research: Using version control systems to open up scientific publications so they can be extended, modified, reused, refactored and recombined by other users, all the while preserving a coherent and citable record of who did what, and when.
  • The future of peer review: The present quality assurance system relies on refereeing as a filtering system, prior to publication. Can we move to a system where the filtering is done after publication?
  • Collaboration markets: How can we fully leverage individual expertise? Most researchers spend much of their time reinventing the wheel, or doing tasks at which they have relatively little comparative advantage. Can we provide mechanisms to easily outsource work like this?
  • Legacy systems and migration: Why is it that the scientific community has been so slow to innovate on the internet? Many of the ideas above no doubt look like pipedreams. Nonetheless, I believe that by carefully considering and integrating with today’s legacy incentive systems (citation, peer review, and journal publication), it will be possible to construct a migration path that incentivizes scientists to make the jump to new tools for doing research.