Scientific communication in the 21st century

By guest blogger Peter Rohde

In the last year the number of papers I have fully read can easily be counted on my hands. For the most part I only read abstracts. Why is this? Because for most academic works I’m not especially interested in the details of calculations or the nitty-gritty fine points of results. Those are things I’ll refer back to if and when I need them. For the most part I’m only interested in understanding what has been done, what approaches were used to obtain the results, and what the remaining unanswered questions are. Typically these things can be characterized much more compactly than via a full scientific paper.

Aside from reading abstracts I gain much of my knowledge by speaking to people. This is a particularly useful way of learning for two reasons. Firstly, it is efficient, unlike verbose papers, and secondly it is interactive. If a particular point is not clear to me, I can grill the other person for more detail. So, for the most part, verbose scientific papers are far less useful to me than their abstracts or talking to other people. Both of these points agree with the suggestions made in Robin Blume-Kohout’s contribution to this blog, where he advocates the “choose your own adventure” or hierarchically structured model. Evidently, speaking to other people is an example of this model: we prefer the terse over the verbose, with elaborations only when required. In such a structure, as I would envisage it, the abstract would be the root node of a tree. It would summarize the paper in a condensed but completely self-contained way, a micro-publication in very compact form. Each of the components in the abstract could be folded out to reveal further underlying details. This way the content is tailored to every reader. It means that I can continue doing what I normally do, reading only abstracts, with the bonus that if a particular aspect of the abstract is of interest to me, I can delve into it a little further without having to read the entire paper.
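A minimal sketch of the tree described above, assuming each node holds a terse summary and a list of fold-out elaborations. The names `Section`, `render` and `expanded` are hypothetical and chosen only for illustration. They do not come from any existing publishing tool.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Section:
    """One node of a paper: a terse summary plus optional elaborations."""
    summary: str
    details: List["Section"] = field(default_factory=list)


def render(node: Section, expanded: Set[str], depth: int = 0) -> List[str]:
    """Return the lines a reader sees, unfolding only the nodes they chose.

    `expanded` holds the summaries the reader has opened (assumed UI state).
    """
    lines = ["  " * depth + node.summary]
    if node.summary in expanded:
        for child in node.details:
            lines.extend(render(child, expanded, depth + 1))
    return lines


# The abstract is the root node; everything else hangs below it.
ABSTRACT = "Abstract: we obtain result X using method Y."
paper = Section(
    ABSTRACT,
    [
        Section("Method Y in brief.", [Section("Full derivation of Y.")]),
        Section("Open questions."),
    ],
)

# A reader who only wants the abstract sees a single line.
print("\n".join(render(paper, set())))
# A reader who opens the abstract and the method sees those branches only.
print("\n".join(render(paper, {ABSTRACT, "Method Y in brief."})))
```

With nothing expanded, only the abstract is shown. Each further level appears only when the reader asks for it, which is the tailoring to every reader that the paragraph describes.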

This type of scientific communication is really only suited to online publication. Indeed, electronic media provide a plethora of new ways to structure and modularize information. Despite this, scientific publication has been stuck in a time warp in which the archaic form of publication has been preserved. Essentially, present-day electronic publications are structured and organized in exactly the same way as printed publications were 50 years ago, the only difference being that an LCD replaces paper. This is a sad misuse of resources.

Almost every other aspect of e-society has adopted, to some extent, the ideas advocated here and by Robin. Wikipedia is the obvious, and perhaps most sophisticated, example of this. Here every point in every article cross-references other articles, creating a highly modularized and hierarchical structure. There are also less obvious examples. These days I never purchase newspapers. It’s not an issue of saving money, it’s an issue of structural design. If I go to any major online news source, I’m presented with a very elegantly structured, hyperlinked front page. At the top of the page are all the headlines, each with a single-line summary. Below this are divisions for international news, politics, technology, science, etc., each with their own headlines and single-line summaries. In principle I could read just the front page and have a pretty good idea of what’s going on in the world, and if I want more detail I can follow the links. This is much more efficient than the style adopted by many conventional newspapers of having one main story on the front page in addition to a few other headlines crammed at the bottom of the page, and all the rest jammed into separate pullouts.

Another area where the e-world is a step ahead of the paper world is in creating awareness of content. In present-day scientific communication, awareness of articles is created via two primary means. The first is by speaking with fellow scientists who draw our attention to articles that interested them. The second is by stumbling across things by oneself, for example by reading the daily arXiv feeds. The trouble is that nowadays there is so much throughput that it becomes increasingly difficult to keep track of it all. A good analogy is the internet itself. Clearly the amount of material becoming available online is too large to manage by oneself. So to increase awareness of things that are of general interest, sites such as Slashdot, reddit and Digg have emerged. All these sites use some voting mechanism to create a list of pages that are of most interest to the online community. I think we are rapidly reaching the point where coping with the massive quantity of scientific communication will necessitate these kinds of approaches.

Another example of awareness creation, which is perhaps more suited to scientific publication, is that of recommendation systems. Some well-known examples of recommendation systems are Amazon, iTunes, StumbleUpon and Last.fm. Here users’ preferences for pages/books/music are tracked, but not with the intention of creating a popularity list. Instead the preferences are hidden and only used internally by the service provider, who cross-correlates your preferences with other users’ to suggest pages/books/music that might be of interest to you. This approach to discovering material is clearly much more effective than trawling through the immense amount of material out there on my own. Instead I can exploit the fact that others have done it for me.
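The cross-correlation idea can be sketched as a simple user-based filter. This is an illustration under stated assumptions, not how any of the services named above actually work: similarity between two readers is taken to be the Jaccard overlap of the papers they liked, and the function names, reader names and paper labels are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two sets of liked papers, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(likes: Dict[str, Set[str]], user: str, top_n: int = 3) -> List[str]:
    """Suggest papers liked by readers whose tastes overlap with `user`'s."""
    mine = likes[user]
    scores: Dict[str, float] = defaultdict(float)
    for other, theirs in likes.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0.0:
            continue  # no shared taste, so this reader tells us nothing
        for paper in theirs - mine:
            # Weight each unseen paper by how similar its fan is to `user`.
            scores[paper] += similarity
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [paper for paper, _ in ranked[:top_n]]


# Made-up preference data, kept private to the service in the scheme above.
likes = {
    "alice": {"paper_a", "paper_b", "paper_c"},
    "bob": {"paper_a", "paper_b", "paper_d"},
    "carol": {"paper_e"},
}

print(recommend(likes, "alice"))
```

Here alice is pointed at the one paper bob liked that she has not yet seen, while carol, who shares nothing with either reader, gets no suggestions. No popularity list is ever published. The preferences are only used internally to rank suggestions.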

In summary, I have made two points. First, the structure of present-day scientific communication is inherently archaic. It replaces paper with LCD while taking little advantage of the abundance of possibilities for structuring information. Second, the sheer magnitude of scientific communication necessitates new means for creating awareness of material, using, for example, recommendation systems. While it’s very easy for me to sit here and bawl criticism at the current system, it’s not so straightforward to actually effect a transition to a different model. One route would be to convince a major publisher to adopt some of the aforementioned suggestions, and hope that it’s a success. The other would be to set up a new system (e.g. a wiki or the like) and convince a group of reputable scientists to transition to that system. In either case, success would require a certain critical mass.

6 comments

  1. Great article. All the advantages you cite for e-publishing over paper are very true, but paper has three key advantages (for the moment):

    1) Contrast. Reading papers on, well, paper, is easy on the eyes.

    2) Annotate-ability. Writing notes wherever I want on paper is easy. Doing the same on PDFs is hard, especially if I want to write equations.

    3) Works on the plane. Paper doesn’t require an internet connection to be readable.

    1 and 2 will eventually change as tablet PC technology improves, but for now, they’re big stumbling blocks. The best compromise I can think of is to allow people to “customize” their version of the paper by collapsing explanations they don’t need and expanding details they want to read. Once the paper is electronically customized, it can be printed if further reading is desired. Of course, I would only need this for the (small) subset of papers that I actually read in detail, but those papers are the ones I care about most. I have one or two papers I’ve actually had to re-print because the original copies got so worn and covered with scribbled notes that they became hard to read.

  2. Peter, I think you underestimate the amount of work required to put papers into this form. Wikipedia works because there are a lot of people who spend a lot of time on it. And even so, much of Wikipedia is really awful.

    Why don’t you try writing a paper in this way? (Or rewriting a paper you already have?) Everything you need is out there. And it would make your suggestions much more concrete.

  3. Hi James,

    Presently you are right, it’s difficult to write papers in this way. I think the main reason for this is that the required software tools are not in place. Before the Wikipedia was in place one could have made the same criticism of highly modularized, hyperlinked articles. But once the Wiki software was in place and tailored for this particular application it became quite straightforward to do this. I think the same ought to apply to the suggestion I made.

  4. Let me just throw out a couple of comments:

    (1) Peter identifies Wikipedia as a good example of new models for publication. Open source software, considered as a publication / collaboration process, also suggests some really interesting new ideas. A recent development is distributed version control systems such as git and Mercurial (Linux is now using git, which was developed by Linus Torvalds) which are potentially far more useful for large-scale collaboration than old-fashioned version control systems like those used in Wikipedia.

    It’d be very interesting to mash up a wiki like MediaWiki (or something more lightweight) with something like git. It would probably only take a few days to get something workable if you picked the right projects to mash.

    (2) Scholarpedia (http://scholarpedia.org) is an interesting example of a new tool that is getting input from some very eminent scientists, along the lines advocated by Peter in his last paragraph.

  5. You don’t actually need to mash anything up. There is already wiki software that uses version control software (including subversion, mercurial, git, and others) for storage/version control. Ikiwiki is that wiki software. There are things there that could be improved, but it seems to have a healthy community around it.
