August 2007 – Michael Nielsen

How to write consistently boring scientific literature

How to write consistently boring scientific literature, by Kaj Sand-Jensen

Although scientists typically insist that their research is very exciting and adventurous when they talk to laymen and prospective students, the allure of this enthusiasm is too often lost in the predictable, stilted structure and language of their scientific publications. I present here, a top-10 list of recommendations for how to write consistently boring scientific publications. I then discuss why we should and how we could make these contributions more accessible and exciting.

Sadly, this is hidden behind a publisher pay wall. I particularly enjoyed the opening quote:

“Hell – is sitting on a hot stone reading your own scientific publications”
– Erik Ursin, fish biologist

Non-abelian money

What would happen if we replaced the current monetary system, which is based on an abelian group [*] by a non-abelian currency system?

[*] If someone gives you x dollars, then y dollars, the result is the same as if you were given y dollars first, then x.

I’ve been puzzling about this for a few years. It raises lots of big questions. How would markets function differently? Might this lead to more efficient allocation of resources, at least in some instances? (At the very least, it’d completely change our notion of what it means to be wealthy!) Might new forms of co-operation emerge? How would results in game theory change if we could use non-abelian payoffs?

More generally, it seems like this sort of idea might be used to look at all of economics through an interesting lens.

A nice toy model in this vein is to work with the group of 2 by 2 invertible matrices, with the group operation being matrix multiplication. By taking matrix logarithms, it can be shown that this model is a generalization of the current monetary system.
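
To make the toy model concrete, here is a minimal Python sketch. It rests on one assumption that is not spelled out above: an ordinary amount of x dollars is represented by the scalar matrix exp(x) times the identity, so that multiplying matrices adds amounts and the matrix logarithm recovers x. The helper names (`matmul`, `dollars`, `log_of_scalar`) and the two example "payments" are hypothetical, chosen only for illustration.

```python
import math

def matmul(p, q):
    """Multiply two 2x2 matrices given as nested tuples ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def dollars(x):
    """Embed an ordinary amount x as the scalar matrix exp(x) * I (assumed embedding)."""
    s = math.exp(x)
    return ((s, 0.0), (0.0, s))

def log_of_scalar(m):
    """Recover x from a scalar matrix exp(x) * I; its matrix logarithm is x * I."""
    return math.log(m[0][0])

# Ordinary money: receiving 3 then 4 dollars multiplies the two scalar
# matrices, and the logarithm of the product is the usual sum.
combined = matmul(dollars(3.0), dollars(4.0))
total = log_of_scalar(combined)

# Two invertible, non-scalar "payments" (hypothetical examples).
shear = ((1.0, 1.0), (0.0, 1.0))
swap = ((0.0, 1.0), (1.0, 0.0))

# The result depends on the order in which the payments arrive.
shear_then_swap = matmul(shear, swap)
swap_then_shear = matmul(swap, shear)
```

Scalar matrices commute with every other matrix, which is one way to see ordinary money sitting inside the model as an abelian special case. The shear and swap matrices do not commute, so the order in which they are received changes the resulting "balance".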

Electronic implementation of non-abelian money would be a snap. The social implementation might be a bit tougher, however – convincing people that their net wealth should be a matrix would be a tough sell, at least initially. Still, if non-abelian money changed some key results from economics, then in some niches it may be advantageous to make the switch, and possible to convince people that this is a good idea.

(It should, of course, be noted that there are in practice already many effects which make money act in a somewhat non-abelian fashion, e.g., inflation. From the point of view of this post, these are kludges: I’m talking about changing the underlying abstraction to a new one.)

The standard negative referee report

“The work reported in this paper is obvious, and wrong. Besides, I did it all 5 years ago, anyway.”

(I heard this from my PhD supervisor, Carl Caves, about 10 years ago. At the time, I thought it was funny…)

Kasparov versus the World

It is the greatest game in the history of chess. The sheer number of ideas, the complexity, and the contribution it has made to chess make it the most important game ever played.
– Garry Kasparov (World Chess Champion) in a Reuters interview conducted during his 1999 game against the World

In 1999, world chess champion Garry Kasparov, widely acknowledged as the greatest player in the history of the game, agreed to participate in a chess match sponsored by Microsoft, playing against “the World”. One move was to be made each 24 hours, with the World’s move being decided by a vote; anyone at all was allowed to vote on the World Team’s next move.

The game was staggering. After 62 moves of innovative chess, in which the balance of the game changed several times, the World Team finally resigned. Kasparov revealed that during the game he often couldn’t tell who was winning and who was losing, and that it wasn’t until after the 51st move that the balance swung decisively in his favour. After the game, Kasparov wrote an entire book about it. He claimed to have expended more energy on this one game than on any other in his career, including world championship games.

What is particularly amazing is that although the World Team had input from some very strong players, none were as strong as Kasparov himself, and the average quality was vastly below Kasparov’s level. Yet, collectively, the World Team produced a game far stronger than one might have expected from any of the individuals contributing, indeed, one of the strongest games ever played in history. Not only did they play Kasparov at his best, but much of the deliberation about World Team strategy and tactics was public, and so accessible to Kasparov, an advantage he used extensively. Imagine that not only are you playing Garry Kasparov at his best, but that you also have to explain in detail to Kasparov all the thinking that goes into your moves!

How was this remarkable feat achieved?

It is worth noting that another “Grandmaster versus the world” game was played prior to this game, in which Grandmaster and former world champion Anatoly Karpov crushed the World Team. However, Kasparov versus the World used a very different system to co-ordinate the World Team’s efforts. Partially through design, and partially through good luck, this system enabled the World Team to co-ordinate their efforts far better than in the earlier game.

The basic idea used was that anyone in the world could register a vote for their preferred next move. The move taken was whichever garnered the most votes. Microsoft did not release detailed statistics, but claimed that on a typical move more than 5000 people voted. Furthermore, votes came from people at all levels of chess excellence, from chess grandmasters to rank amateurs. On one move, Microsoft reported that 2.4 percent of the votes were cast for moves that were not merely bad, but actually illegal! On other occasions moves regarded as obviously bad by experts obtained up to 10 percent of the vote. Over the course of the match, approximately 50,000 individuals from more than 75 countries participated in the voting.
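
The voting rule just described is a plain plurality count. Here is a minimal sketch in Python, using made-up ballots. The function names and the tie-break (the move encountered first wins) are assumptions of this sketch, since Microsoft did not release details of how votes were tallied.

```python
from collections import Counter

def winning_move(votes):
    """Return the move with the most votes (plurality).

    Ties go to the move that appeared first in the ballots; that tie-break
    is an assumption of this sketch, not something the source describes.
    """
    tally = Counter(votes)
    move, _count = tally.most_common(1)[0]
    return move

def vote_share(votes, move):
    """Fraction of all ballots cast for the given move."""
    return votes.count(move) / len(votes)

# Illustrative ballots only; these are not data from the actual game.
ballots = ["Qe2", "Qe2", "Nc3", "Qe2", "Nc3", "h4"]
chosen = winning_move(ballots)
fringe_share = vote_share(ballots, "h4")
```

A plurality rule never needs a majority, so a well co-ordinated minority can decide the move if everyone else's votes are scattered.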

Critical to the experiment were several co-ordinating devices that enabled the World Team to act more coherently.

An official game forum was set up by Microsoft so that people on the World Team could discuss and co-ordinate their ideas.

Microsoft appointed four official advisors to the World Team. These were outstanding teenage chess players, including two ranked as grandmasters, all amongst the best of their age in the world, although all were of substantially lower caliber than Kasparov. These four advisors agreed to provide advice to the World Team, and to make public recommendations on what move to take next.

In addition to these formal avenues of advice, as the game progressed various groups around the world began to offer their own commentary and advice. Particularly influential, although not always heeded, was the GM school, a strong Russian chess club containing several grandmasters.

Most of these experts ignored the discussion taking place on the game forum, and made no attempt to engage with the vast majority of people making up the World Team, i.e., the people whose votes would actually decide the World’s moves.

However, one of the World Team’s advisors did make an effort to engage the World Team. This was an extraordinary young chess player named Irina Krush. Fifteen years old, Krush had recently become the US Women’s chess champion. Although not as highly rated as two of the other World Team advisors, or as some of the grandmasters offering advice to the World Team, Krush was certainly in the international elite of junior chess players.

Unlike her expert peers, Krush focused considerable time and attention on the World Team’s game forum. Shrugging off flames and personal insults, she worked to extract the best ideas and analysis from the forum, as well as building up a network of strong chess-playing correspondents, including some of the grandmasters now offering advice.

Simultaneously, Krush built a publicly accessible analysis tree, showing possible moves and countermoves, and containing the best arguments and refutations for different lines of play, both from the game forum, and from her correspondence with others, including the GM school. This analysis tree enabled the World Team to focus its attention much more effectively, and served as a reference point for discussion, for further analysis, and for voting.

As the game went on, Krush’s role on the World Team gradually became more and more pivotal, despite the fact that according to their relative rankings, Kasparov would ordinarily have beaten Krush easily, unless he made a major blunder.

Part of the reason for this was the quality of Krush’s play. On move 10, Krush suggested a completely novel move that Kasparov called “A great move, an important contribution to chess”, and which all expert analysts agree blew the game wide open, taking it into uncharted chess territory. This raised her standing with the World Team, and helped her assume a coordinating role. Between moves 10 and 50 Krush’s recommended move was always played by the World Team, even when it disagreed with the recommendations of the other three advisors to the World Team, or with influential commentators such as the GM school.

As a result, some people have commented that the game was really Kasparov versus Krush, and Kasparov himself has claimed that he was really playing Smart Chess, Krush’s management team. Krush has repudiated this point of view, commenting on how important many other people’s input was to her recommendations. It seems likely that a more accurate picture is that Krush was at the center of the co-ordination effort for the World Team, and so had a better sense of the best overall recommendation made by the members of the World Team. Other, ostensibly stronger players weren’t as aware of all these different points of view, and so didn’t make as good decisions about what move to make next.

Krush’s coordinating role brought the best ideas of all contributors into a single coherent whole, weeding out bad moves from the good. As the game went on, much stronger players began to channel their ideas through her, including one of the strongest players from the GM school, Alexander Khalifman. The result was that the World Team emerged stronger than any individual player, indeed, arguably stronger than any player in history with the exception of Kasparov at his absolute peak, and with the advantage of being able to see the World “thinking” out loud as they deliberated the best course of action.

Kasparov versus the World is a fascinating case study in the power of collective collaboration. Most encouragingly for us, Kasparov versus the World provides convincing evidence that large groups of people acting in concert can solve creative problems well beyond the reach of any of them alone.

More practically, Kasparov versus the World suggests the value of providing centralized repositories of information which can serve as reference points for decision making and for the allocation of effort. Krush’s analysis tree was critical to the co-ordination of the World Team. It prevented duplication of effort on the part of the World Team, who didn’t have to chase down lines of play known to be poor, and acted as a reference point for discussion, for further analysis, and for voting.

Finally, Kasparov versus the World suggests the value of facilitators who act to channel community opinion. These people must have the respect of the community, but they need not be the strongest individual contributor. If such facilitators are flexible and responsive (without being submissive), they can co-ordinate and focus community opinion, and so build a whole stronger than any of its parts.

Anton Zeilinger

Anton Zeilinger now has a blog.

Links

Konrad Forstner has a very interesting talk on what he sees as the future of scientific communication.

Nature runs a terrific blog, Nascent, which has frequent discussions of the future of science and scientific communication. Most scientific publishers have their head in the sand about the web. Nature, however, is innovating and experimenting in really interesting ways.

A few more: The Coming Revolution in Scholarly Communications & Cyberinfrastructure, an open access collection of articles by people such as Paul Ginsparg (of arxiv.org), Timo Hannay (Nature), Tony Hey (Microsoft), and many others.

An interesting report by Jon Udell on the use of the web for scientific collaboration. It’s a bit dated in some ways, but in other ways remains very fresh.

Kevin Kelly (founding editor of Wired) speculating on the future of science.

The Django Book, which is a nice example of a book (now published, I believe) that was developed in a very open style, with a web-based commenting system used to provide feedback to the authors as the book was written. I thought about doing something similar with my current book, but concluded that I don’t write in a linear enough style to make it feasible.

An article on open source science from the Harvard Business School.

Fullcodepress, a 24-hour event that’s happening in Sydney as I write. It’s a very cool collaborative project, where two teams are competing to build a fully functional website for a non-profit in 24 hours. Similar in concept to the Startup Weekends that are now springing up all over the place. What, exactly, can a group of human beings achieve when they come together and co-operate really intensively for 24 or 48 hours? Surprisingly much, seems to be the answer.

A thoughtful essay on the problems associated with all the social data people are now putting on the web. Starts from the (common) observation that it would be a lot more useful if it were more publicly available rather than locked up in places like Flickr, Amazon, Facebook, etc, and then makes many insightful observations about how to move to a more open system.

How to read a blog. This is a riff on one of my all-time favourite books, How to read a book, by Mortimer Adler.

Micropublication and open source research

This is an extract from my (very early) draft book on the way the internet is changing how science is done.

I would like to legitimize a new kind of proof: `The Incomplete Proof’. The reason that the output of mathematicians is so meager is that we only publish that tiny part of our work that ended up in complete success. The rest goes to the recycling bin. […] Why did [the great mathematician] Paul Cohen stop publishing at the age of 30? My guess is that he was trying, and probably still is, to prove [the Riemann Hypothesis]. I would love to be able to see his `failed attempts’. […] So here is my revolutionary proposal. Publish all your (good) thoughts and ideas, regardless of whether they are finished or not.
– Doron Zeilberger

Imagine you are reading a research article. You notice a minor typo in the article, which you quickly fix, using a wiki-like editing system to create a new temporary “branch” of the article – i.e., a copy of the article, but with some modifications that you’ve made. The original authors of the article are notified of the branch, and one quickly contacts you to thank you for the fix. The default version of the article is now updated to point to your branch, and your name is automatically added to a list of people who have contributed to the article, as part of a complete version history of the article. This latter information is also collected by an aggregator which generates statistics about contributions, statistics which you can put into your curriculum vitae, grant applications, and so on.

Later on while reading, you notice a more serious ambiguity, an explanation that could be interpreted in several inconsistent ways. After some time, you figure out which explanation the authors intend, and prepare a corrected version of the article in a temporary branch. Once again, the original authors are notified. Soon, one contacts you with some queries about your fix, pointing out some subtleties that you’d failed to appreciate. After a bit of back and forth, you revise your branch further, until both you and the author agree that the result is an improvement on both the original article and on your first attempt at a branch. The author-approved default version of the article is updated to point to the improved version, and you are recognized appropriately for your contribution.

Still later, you notice a serious error in the article – maybe a flaw in the logic, or a serious error of omission material to the argument – which you don’t immediately see how to fix. You prepare a temporary branch of the article, but this time, rather than correcting the error, you insert a warning explaining the existence and the nature of the error, and how you think it affects the conclusions of the article.

Once again, the original authors are notified of your branch. This time they aren’t so pleased with your modifications. Even after multiple back and forth exchanges, and some further revisions on your part, they disagree with your assessment that there is an error. Despite this, you remain convinced that they are missing your point.

Believing that the situation is not readily resolvable, you create a more permanent branch of the article. Now there are two branches of the article visible to the public, with slightly differing version histories. Of course, these version histories are publicly accessible, and so who contributed what is a matter of public record, and there is no danger that there will be any ambiguity about the origins of the new material, nor about the origin of the disagreement between the two branches.

Initially, most readers look only at the original branch of the article, but a few look at yours as well. Favourable commentary and a gradual relative increase in traffic to your branch (made suitably visible to potential readers) encourages still more people to read your version preferentially. Your branch gradually becomes more highly visible, while the original fades. Someone else fixes the error you noticed, leading to your branch being replaced by a still further improved version, and still more traffic. After some months, reality sets in and the original authors come around to your point of view, removing their original branch entirely, leaving just the new improved version of the article. Alternately, perhaps the original authors, alarmed by their diminution, decide to strike back with a revised version of their article, explaining in detail why you are wrong.

These stories illustrate a few uses of micropublication and open source research. These are simple ideas for research publication, but ones that have big consequences. The idea of micropublication is to enable publication in smaller increments and more diverse formats than in the standard scientific research paper. The idea of open source research is to open up the licensing model of scientific publication, providing more flexible ways in which prior work can be modified and re-used, while ensuring that all contributions are fully recognized and acknowledged.
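
The branching workflow in these stories can be sketched as a small data model. What follows is a minimal Python illustration under stated assumptions, not a design for a real archive: the class and method names (`Version`, `Article`, `branch`, `accept`) are hypothetical, every revision simply keeps a pointer to its parent, and authentication and notification are left out entirely.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class Version:
    """One immutable revision of an article (hypothetical model)."""
    text: str
    author: str
    note: str
    parent: Optional["Version"] = None

    def history(self):
        """Return all revisions from the original up to this one."""
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))

    def contributors(self):
        """Return each contributor once, in order of first contribution."""
        seen = []
        for version in self.history():
            if version.author not in seen:
                seen.append(version.author)
        return seen

@dataclass
class Article:
    """An article as a set of named branches, one of which is the default."""
    branches: dict = field(default_factory=dict)
    default: str = "original"

    def branch(self, name, base, text, author, note):
        """Create a branch whose head extends the head of the branch `base`."""
        self.branches[name] = Version(text, author, note, parent=self.branches[base])
        return self.branches[name]

    def accept(self, name):
        """Point the default version at an existing branch."""
        if name not in self.branches:
            raise KeyError(name)
        self.default = name

    def current(self):
        """Return the head revision of the default branch."""
        return self.branches[self.default]

# The typo-fix story: a reader branches, and the authors accept the branch.
article = Article(branches={"original": Version("Teh result follows.", "authors", "first version")})
article.branch("typo-fix", "original", "The result follows.", "reader", "fix typo")
article.accept("typo-fix")
credited = article.current().contributors()
```

Because revisions are never overwritten, the original text stays reachable after a branch is accepted, and the list of contributors falls out of the version history rather than being maintained by hand. A disputed branch, as in the later story, would simply remain in `branches` alongside the default.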

Let’s examine a few more potential applications of micropublication and open source research.

Imagine you are reading an article about the principles of population control. As you read, you realize that you can develop a simulator which illustrates in a vivid visual form one of the main principles described in the article, and provides a sandbox for readers to play with and better understand that principle. After dropping a (favourably received) note to the authors, and a little work, you’ve put together a nice simulation. After a bit of back and forth with the authors, a link to your simulation is now integrated into the article. Anyone reading the article can now click on the relevant equation and will immediately see your simulation (and, if they like, the source code). A few months later, someone takes up your source code and develops the simulation further, improving the reader experience still further.

Imagine reading Einstein’s original articles on special relativity, and being able to link directly to simulations (or, even better, fully-fledged computer games) that vividly demonstrate the effects of length contraction, time dilation, and so on. In mathematical disciplines, this kind of content enhancement might even be done semi-automatically. The tools could gradually integrate the ability to make inferences and connections – “The automated reasoning software has discovered a simplification of Equation 3; would you like to view the simplification now?”

Similar types of content enhancement could, of course, be used in all disciplines. Graphs, videos, explanations, commentary, background material, data sets, source code, experimental procedures, links to wikipedia, links to other related papers, links to related pedagogical materials, talks, media releases – all these and more could be integrated more thoroughly into research publishing. Furthermore, rather than being second-class add-ons to “real” research publications, a well-designed citation and archival system would ensure that all these forms have the status of first-class research publications, raising their stature, and helping ensure that people put more effort into adding value in these ways.

Another use for open source research is more pedagogical in flavour. Imagine you are a student assigned to rewrite Einstein’s article on general relativity in the language of modern differential geometry. Think of the excitement of working with the master’s original text, fully inhabiting it, and then improving it still further! Of course, such an assignment is technologically possible even now. However, academia has strong cultural inhibitions against making such modifications to original research articles. I will argue that with properly authenticated archival systems these issues could be addressed, the inhibitions could be removed, and a world of new possibilities opened up.

Having discussed micropublication and open source research in concrete terms, let’s now describe them in more abstract terms, and briefly discuss some of the problems that must be overcome if they are to become viable modes of publication. More detailed resolutions to these problems will be discussed in a later post.

Micropublication does three things. First, it decreases the size of the smallest publishable unit of research. Second, it broadens the class of objects considered as first-class publishable objects so that it includes not just papers, but also items such as data, computer code, simulations, commentary, and so on. Third, it eliminates the barrier of peer review, a point we’ll come back to shortly. The consequence is to greatly reduce the friction slowing down the progress of the research community, by lowering the barriers to publication. Although promising, this lowering of the barriers to publication also creates three problems that must be addressed if the research community is to adopt the concept of micropublication.

The first problem is providing appropriate recognition for people’s contributions. This can be achieved through appropriate archival and citation systems, and is described in detail in a later post.

The second problem is quality assurance. The current convention in science is to filter content before publishing it through a system of peer review. In principle, this ensures that only the best research gets published in the top journals. While this system has substantial failures in practice, on the whole it has improved our access to high-quality research. To ensure similar quality, micropublication must use a publish-then-filter model which enables the highest quality research to be accurately identified. We will discuss the development of such filtering systems in a later post. Note, however, that publish-then-filter already works surprisingly well on the web, due to tools such as Google, which is capable of picking out high value webpages. Such filtering systems are far from perfect, of course, and there are serious obstacles to be overcome if this is to be a successful model.

The third problem is providing tools to organize and search through the mass of publication data. This is, in some sense, the flip side of the quality assurance problem, since it is also about organizing information in meaningful and useful ways, and there is considerable overlap in how these tools must work. Once again, we will discuss the development of these tools in a later post.

Open source research opens up the licensing model used in research publication so that people may make more creative reuse of existing work, and thus speed the process of research. It removes the cumbersome quote-and-cite licensing model in current use in science. That model makes sense if one is publishing on paper, but is not necessary in electronic publication. Instead, it is replaced by a trustworthy authenticated archive of publication data which allows one to see an entire version history of a document, so that we can see who contributed what and when. This will allow people to rapidly improve, extend and enhance other people’s work, in all the ways described above.

Academics have something of a horror of the informal re-use that I may appear to be advocating. The reason is that the principal currency of research is attention and reputation, not (directly) money. In such a system, not properly citing sources is taken very seriously; even very illustrious researchers have fallen from grace over accusations of plagiarism. For these reasons, it is necessary to design the archival system carefully to ensure that one can gain the benefits of a more informal licensing model, while still adequately recognizing people’s contributions.

Overarching and unifying all these problems is one main problem, the problem of migration, i.e., convincing researchers that it is in their best interest to move to the new system. How can this possibly be achieved? The most obvious implementations of micropublication and open source research will require researchers to give up their participation in the standard recognition system of science — the existing journal system. Such a requirement will undoubtedly result in the migration failing. Fortunately, I believe it is possible to find a migratory path which integrates and extends the standard recognition system of science in such a way that researchers have only positive incentives to make the migration. This path does not start with a single jump to micropublication and open source research, but rather involves a staged migration, with each stage integrating support for legacy systems such as citation and peer review, but also building on new systems that can take the place of the legacy systems, and which are better suited for the eventual goals of micropublication and open source research. This process is quite flexible, but involves many separate ideas, which will be described in subsequent posts.

Incentives

In the comments, Franklin writes on the subject of open source research:

On the other side of the coin, what would be the incentives for contributing to other people’s research?

This is an excellent question. Generalizing, any proposed change to how people do research, collaborate, or publish necessarily must face the question: what are the incentives to participate in the change? One must find a migration path which provides positive incentives at each step of the way, or else the migration is doomed to failure. I am proposing very significant changes to how research is done, and so the incentives along the migration path necessarily require considerable thought. Addressing these issues systematically is one of the main reasons I’ve written a book.

What I’m imbibing

Commenter Martin points to the January 2007 issue of Physics World, which contains a lot of very interesting information about Web 2.0 and Science. In a similar vein, Corie Lok has some thoughtful recent reflections on getting scientists to adopt new tools for research. Finally, let me mention Jon Udell’s interview with Lewis Shepherd, talking about the US Defense Intelligence Agency’s use of wikis, blogs, Intellipedia, and many other interesting things. Some of the challenges he faced in bringing such social tools to Defense are similar to the problems in bringing them to science.

On a completely different topic, let me mention a fantastic presentation about green technology given earlier this year by John Doerr at the TED conference. I’ve been working my way through all the online TED talks, many of which are really good. While I’m at it, I may as well plug the Long Now talks, which is also a great series, with talks by people like Danny Hillis, John Baez, Stewart Brand, Jimmy Wales and many others.

More on funding

Chad Orzel has some thoughtful comments on my earlier questions about research funding. Here’s a few excerpts and some further thoughts:

… a good deal of the image problems that science in general has at the moment can be traced to a failure to grapple more directly with issues of funding and the justification of funding… In the latter half of the 20th century, we probably worked out the quantum details of 1000 times as many physical systems as in the first half, but that sort of thing feels a little like stamp collecting– adding one new element to a mixture and then re-measuring the band structure of the resulting solid doesn’t really seem to be on the same level as, say, the Schrödinger equation, but I’m at a loss for how to quantify the difference… The more important question, though, is should we really expect or demand that learning be proportional to funding?

This really gets to the nub of it. In research, as in so many other things, funding may hit a point of diminishing returns beyond which what we learn becomes more and more marginal. However, it is by no means obvious where the threshold is beyond which society as a whole would be better off allocating its resources to other more worthy causes.

And what, exactly, do we as a society expect to get out of fundamental research?

For years, the argument has been based on technology– that fundamental research is necessary to understand how to build the technologies of the future, and put a flying car in every garage. This has worked well for a long time, and it’s still true in a lot of fields, but I think it’s starting to break down in the really big-ticket areas. You can make a decent case that, say, a major neutron diffraction facility will provide materials science information that will allow better understanding of high-temperature superconductors, and make life better for everyone. It’s a little harder to make that case for the Higgs boson, and you’re sort of left with the Tang and Velcro argument– that working on making the next generation of whopping huge accelerators will lead to spin-off technologies that benefit large numbers of people. It’s not clear to me that this is a winning argument– we’ve gotten some nice things out of CERN, the Web among them, but I don’t know that the return on investment really justifies the expense.

The spinoff argument also has the problem that it’s hard to argue that these things wouldn’t have happened anyway. No disrespect to Tim Berners-Lee’s wonderful work, but it’s hard to believe that if he hadn’t started the web, some MIT student in a dorm room wouldn’t have done so shortly thereafter.

Of course, it’s not like I have a sure-fire argument. Like most scientists, I think that research is inherently worth funding– it’s practically axiomatic. Science is, at a fundamental level, what sets us apart from other animals. We don’t just accept the world around us as inscrutable and unchangeable, we poke at it until we figure out how it works, and we use that knowledge to our advantage. No matter what poets and musicians say, it’s science that makes us human, and that’s worth a few bucks to keep going. And if it takes millions or billions of dollars, well, we’re a wealthy society, and we can afford it.

We really ought to have a better argument than that, though.

As for the appropriate level of funding, I’m not sure I have a concrete number in mind. If we’ve got half a trillion to piss away on misguided military adventures, though, I think we can throw a few billion to the sciences without demanding anything particular in return.

One could attempt to frame this in purely economic terms: what’s the optimal rate at which to invest in research in order to maximize utility, under reasonable assumptions? This framing misses some of the other social benefits that Chad alludes to – all other things being equal, I’d rather live in a world where we understand general relativity, just because – but has the benefit of being at least passably well posed. I don’t know a lot about their conclusions, but I believe this kind of question has recently come under a lot of scrutiny from economists like Paul Romer, under the name endogenous growth theory.
