Three myths about scientific peer review

What’s the future of scientific peer review? The way science is communicated is currently changing rapidly, leading to speculation that the peer review system itself might change. For example, the wildly successful physics preprint server arXiv is only very lightly moderated, which has led many people to wonder if the peer review process might die out, or otherwise change beyond recognition.

I’m currently finishing up a post on the future of peer review, which I’ll publish in the near future. Before I get to that, though, I want to debunk three widely believed myths about peer review, myths which can derail sensible discussion of its future.

A brief terminological note before I get to the myths: the term “peer review” can mean many different things in science. In this post, I restrict my focus to the anonymous peer review system scientific journals use to decide whether to accept or reject scientific papers.

Myth number 1: Scientists have always used peer review

The myth that scientists adopted peer review broadly and early in the history of science is surprisingly widely believed, despite being false. It’s true that peer review has been used for a long time – a process recognizably similar to the modern system was in use as early as 1731, in the Royal Society of Edinburgh’s Medical Essays and Observations (ref). But in most scientific journals, peer review wasn’t routine until the middle of the twentieth century, a fact documented in historical papers by Burnham, Kronick, and Spier.

Let me give a few examples to illustrate the point.

As a first example, we’ll start with the career of Albert Einstein, who wasn’t just an outstanding scientist, but also a prolific one, publishing more than 300 journal articles between 1901 and 1955. Many of Einstein’s most ground-breaking papers appeared in his “miracle year” of 1905, when he introduced new ways of understanding space, time, energy, momentum, light, and the structure of matter. Not bad for someone unable to secure an academic position, and working as a clerk in the Swiss patent office.

How many of Einstein’s 300-plus papers were peer reviewed? According to the physicist and historian of science Daniel Kennefick, it may well be that only a single paper of Einstein’s was ever subject to peer review. That was a paper about gravitational waves, jointly authored with Nathan Rosen, and submitted to the journal Physical Review in 1936. The Physical Review had at that time recently introduced a peer review system. It wasn’t always used, but when the editor wanted a second opinion on a submission, he would send it out for review. The Einstein-Rosen paper was sent out for review, and came back with a (correct, as it turned out) negative report. Einstein’s indignant reply to the editor is amusing to modern scientific sensibilities, and suggests someone quite unfamiliar with peer review:

Dear Sir,

We (Mr. Rosen and I) had sent you our manuscript for publication and had not authorized you to show it to specialists before it is printed. I see no reason to address the in any case erroneous comments of your anonymous expert. On the basis of this incident I prefer to publish the paper elsewhere.

Respectfully,

P.S. Mr. Rosen, who has left for the Soviet Union, has authorized me to represent him in this matter.

As a second example, consider the use of peer review at the journal Nature. The prestige associated with publishing in Nature is, of course, considerable, and so competition to get published there is tough. According to Nature’s website, only 8 percent of submissions are accepted. Given this, you might suppose that Nature has long used peer review routinely as a way of filtering submissions. In fact, Nature didn’t introduce a formal peer review system until 1967. Prior to that, some papers were refereed and some weren’t, and many of Nature’s most famous papers were not refereed. Instead, it was up to editorial judgement to determine which papers should be published, and which rejected.

This was a common practice in the days before peer review became widespread: decisions about what to publish and what to reject were usually made by journal editors, often acting largely on their own. These decisions were often made rapidly, with papers appearing days or weeks after submission, after a cursory review by the editor. Rejection rates at most journals were low, with only obviously inappropriate or unsound material being rejected; indeed, for some Society journals, Society members even asserted a “right” to publication, which occasionally caused friction with unhappy editors (ref).

What caused the change to the modern system of near-ubiquitous peer review? There were three main factors. The first was the increasing specialization of science (ref). As science became more specialized in the early 20th century, editors gradually found it harder to make informed decisions about what was worth publishing, even by the relatively relaxed standards common in many journals at the time.

The second factor in the move to peer review was the enormous increase in the number of scientific papers being published (ref). In the 1800s and early 1900s, journals often had too few submissions. Journal editors would actively round up submissions to make sure their journals remained active. The role of many editorial boards was to make sure enough papers were being submitted; if the journal came up short, members of the editorial board would be asked to submit papers themselves. As late as 1938, the editor-in-chief of the prestigious journal Science relied on personal solicitations for most articles (ref).

The twentieth century saw a massive increase in the number of scientists, a much easier process for writing papers, due to technologies such as typewriters, photocopiers, and computers, and a gradually increasing emphasis on publication in decisions about jobs, tenure, grants and prizes. These factors greatly increased the number of papers being written, and added pressure for filtering mechanisms, such as peer review.

The third factor in the move to peer review (ref) was the introduction of technologies for copying papers. It’s just plain editorially difficult to implement peer review if you can’t easily make copies of papers. The first step along this road was the introduction of typewriters and carbon paper in the 1890s, followed by the commercial introduction of photocopiers in 1959. Both technologies made peer review much easier to implement.

Nowadays, of course, the single biggest factor preserving the peer review system is probably social inertia: in most fields of science, a journal that’s not peer-reviewed isn’t regarded as serious, and so new journals invariably promote the fact that they are peer reviewed. But it wasn’t always that way.

Myth number 2: Peer review is reliable

Update: Bill Hooker has pointed out that I’m using a very strong sense of “reliable” in this section, holding peer review to the standard that it nearly always picks up errors, is a very accurate gauge of quality, and rarely suppresses innovation. If you adopt a more relaxed notion of reliability, as many but not all scientists and members of the general public do, then I’d certainly back off describing this as a myth. As an approximate filter that eliminates or improves many papers, peer review may indeed function well.

Every scientist has a story (or ten) about how they were poorly treated by peer review – the important paper that was unfairly rejected, or the silly editor who ignored their sage advice as a referee. Despite this, many strongly presume that the system works “pretty well”, overall.

There’s not much systematic evidence for that presumption. In 2002 Jefferson et al. (ref) surveyed published studies of biomedical peer review. After an extensive search, they found just 19 studies which made some attempt to eliminate obvious confounding factors. Of those, just two addressed the impact of peer review on quality, and just one addressed the impact of peer review on validity; most of the rest of the studies were concerned with questions like the effect of double-blind reviewing. Furthermore, for the three studies that addressed quality and validity, Jefferson et al. concluded that there were other problems with the studies which meant the results were of limited general interest; as they put it, “Editorial peer review, although widely used, is largely untested and its effects are uncertain”.

In short, at least in biomedicine, there’s not much we know for sure about the reliability of peer review. My searches of the literature suggest that we don’t know much more in other areas of science. If anything, biomedicine seems to be unusually well served, in large part because several biomedical journals (perhaps most notably the Journal of the American Medical Association) have over the last 20 years put a lot of effort into building a community of people studying the effects of peer review; Jefferson et al.’s study is one of the outcomes from that effort.

In the absence of compelling systematic studies, is there anything we can say about the reliability of peer review?

The question of reliability should, in my opinion, really be broken up into three questions. First, does peer review help verify the validity of scientific studies? Second, does peer review help us filter scientific studies, making the higher quality ones easier to find because they get into the “best” journals, i.e., the ones with the most stringent peer review? Third, to what extent does peer review suppress innovation?

As regards validity and quality, you don’t have to look far to find striking examples suggesting that peer review is at best partially reliable as a check of validity and a filter of quality.

Consider the story of the German physicist Jan Hendrik Schoen. In 2000 and 2001 Schoen made an amazing series of breakthroughs in organic superconductivity, publishing his 2001 work at a rate of one paper every 8 days, many in prestigious journals such as Nature, Science, and the Physical Review. Eventually, it all seemed a bit too good to be true, and other researchers in his community began to ask questions. His work was investigated, and much of it found to be fraudulent. Nature retracted seven papers by Schoen; Science retracted eight papers; and the Physical Review retracted six. What’s truly breathtaking about this case is the scale of it: it’s not that a few referees failed to pick up on the fraud, but rather that the refereeing system at several of the top journals systematically failed to detect the fraud. Furthermore, what ultimately brought Schoen down was not the anonymous peer review system used by journals, but rather investigation by his broader community of peers.

You might object to using this as an example on the grounds that the Schoen case involved deliberate scientific fraud, and the refereeing system isn’t intended to catch fraud so much as to catch mistakes. I think that’s a pretty weak objection – the line between honest mistakes and deliberate fraud can be thin – but it’s not entirely without merit. As a second example, consider an experiment conducted by the editors of the British Medical Journal (ref). They inserted eight deliberate errors into a paper already accepted for publication, and sent the paper to 420 potential reviewers. 221 responded, catching on average only two of the errors. None of the reviewers caught more than five of the errors, and 16 percent caught no errors at all.
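
To get a feel for what these numbers imply, here’s a rough back-of-envelope sketch in Python. It rests on an assumption the experiment doesn’t test – that referees act independently, each catching any given error with probability 2/8, the average catch rate above – so treat it as illustration, not analysis:

```python
# Back-of-envelope sketch. Illustrative assumption (not tested by the
# BMJ experiment): referees act independently, and each catches any
# given error with probability 2/8, the average catch rate reported.
p_catch = 2 / 8

for n_referees in (1, 2, 3):
    # Probability that every referee on the panel misses the error.
    p_missed = (1 - p_catch) ** n_referees
    print(f"{n_referees} referee(s): error slips through "
          f"with probability {p_missed:.2f}")
```

Even in this simplistic model, a panel of three referees lets a typical error through about 42 percent of the time; correlated blind spots among real referees would presumably make matters worse.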

None of these examples is conclusive. But they do suggest that the refereeing system is far from perfect as a means of checking validity or filtering the quality of scientific papers.

What about the suppression of innovation? Every scientist knows of major discoveries that ran into trouble with peer review. David Horrobin has a remarkable paper (ref) where he documents some of the discoveries almost suppressed by peer review; as he points out, he can’t list the discoveries that were in fact suppressed by peer review, because we don’t know what those were. His list makes horrifying reading. Here are just a few instances that I find striking, drawn in part from his list. Note that I’m restricting myself to suppression of papers by peer review; I believe peer review of grants and job applications probably has a much greater effect in suppressing innovation.

  • George Zweig’s paper announcing the discovery of quarks, one of the fundamental building blocks of matter, was rejected by Physical Review Letters. It was eventually issued as a CERN report.
  • Berson and Yalow’s work on radioimmunoassay, which led to a Nobel Prize, was rejected by both Science and the Journal of Clinical Investigation. It was eventually published in the Journal of Clinical Investigation.
  • Krebs’ work on the citric acid cycle, which led to a Nobel Prize, was rejected by Nature. It was published in Experientia.
  • Wiesner’s paper introducing quantum cryptography was initially rejected, finally appearing well over a decade after it was written.

To sum up: there is very little reliable evidence about the effect of peer review available from systematic studies; peer review is at best an imperfect filter for validity and quality; and peer review sometimes has a chilling effect, suppressing important scientific discoveries.

At this point I expect most readers will have concluded that I don’t much like the current peer review system. Actually, that’s not true, a point that will become evident in my post about the future of peer review. There’s a great deal that’s good about the current peer review system, and that’s worth preserving. However, I do believe that many people, both scientists and non-scientists, have a falsely exalted view of how well the current peer review system functions. What I’m trying to do in this post is to establish a more realistic view, and that means understanding some of the faults of the current system.

Myth number 3: Peer review is the way we determine what’s right and wrong in science

By now, it should be clear that the peer review system must play only a partial role in determining what scientists think of as established science. There’s no sign, for example, that the lack of peer review in the 19th and early 20th century meant that scientists then were more confused than now about what results should be regarded as well established, and what should not. Nor does it appear that the unreliability of the peer review process leaves us in any great danger of collectively coming to believe, over the long run, things that are false.

In practice, of course, nearly all scientists understand that peer review is only part of a much more complex process by which we evaluate and refine scientific knowledge, gradually coming to (provisionally) accept some findings as well established, and discarding the rest. So, in that sense, this third myth isn’t one that’s widely believed within the scientific community. But in many scientists’ shorthand accounts of how science progresses, peer review is given a falsely exaggerated role, and this is reflected in the understanding many people in the general public have of how science works. Many times I’ve had non-scientists mention to me that a paper has been “peer-reviewed!”, as though that somehow establishes that it is correct, or high quality. I’ve encountered this, for example, in some very good journalists, and it’s a concern, for peer review is only a small part of a much more complex and much more reliable system by which we determine what scientific discoveries are worth taking further, and what should be discarded.

Further reading

I’m writing a book about “The Future of Science”; this post is part of a series where I try out ideas from the book in an open forum. A summary of many of the themes in the book is available in this essay. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. I’ll email you to let you know in advance of publication. I will not use your email address for any other purpose! You can subscribe to my blog here.

Comments

  1. Here are two more examples of the realities of peer review, which amply illustrate your points.

    First, Gauss’ article Disquisitiones generales circa superficies curvas (which proved Gauss’ celebrated Theorema Egregium, i.e., the notion that curvature is intrinsic) was published in at least seven separate editions between 1825 and 1900: three times in Latin, twice in French, and twice in German (details here). Hey, it was a great theorem!

    Second, Heinrich Rohrer told me that his and Gerd Binnig’s Nobel Prize-winning work on scanning probe microscopy was turned down by the reviewers at Physical Review Letters. When I asked him why, he assumed a broadly comical accent and said “Because the reviewers are ee-dee-yoots at Physical Review Letters!”

    I have to say, however, that by and large peer review works amazingly well … in part because there is no review so bad that it can’t be used to help make an article better. This evolutionary process (although painful) causes better articles to emerge even from bad reviews. Every mathematician and scientist has had this experience.

    Also, in the long run, it doesn’t matter what the reviewers think, nearly so much as it matters what the readers think … and even more, what they do … and this audience (students especially) is tougher than any reviewer.

  2. I’m sympathetic to much of what you’re saying, but, on the other hand, I know that peer review has immensely improved many of my own papers.

    The Einstein-Rosen paper was sent out for review, and came back with a (correct, as it turned out) negative report. Einstein’s indignant reply to the editor is amusing to modern scientific sensibilities, and suggests someone quite unfamiliar with peer review:

    Your parenthetical remark suggests that you know the rest of this story:

    http://blogs.discovermagazine.com/cosmicvariance/2005/09/16/einstein-vs-physical-review/

    “After this incident, Einstein vowed never again to publish in Physical Review — and he didn’t. The Einstein-Rosen paper eventually appeared in the Journal of the Franklin Institute, but its conclusions were dramatically altered — the authors chose new coordinates, and showed that they had actually discovered a solution for cylindrical gravitational waves, now known as the “Einstein-Rosen metric.” It’s a little unclear how exactly Einstein changed his mind — whether it was of his own accord, through the influence of the referee’s report, or by talking to Robertson personally.”

    So it seems that Einstein might have benefited from peer review, although he clearly wouldn’t like to admit it.

  3. Economist George Akerlof wrote his most famous paper, “The Market for Lemons: Quality Uncertainty and the Market Mechanism” in 1967. The paper used the market for used cars as an example of asymmetric information – where a seller knows more about the goods he is selling than a buyer does. The first journal Akerlof tried rejected the paper on the grounds that it was trivial. The second also rejected it. The third rejection came with a referee’s comment that the paper was wrong in its reasoning – that if it were right, economics would be different.

    The paper was eventually published, Akerlof shared a Nobel Prize for economics in 2001 primarily on the basis of “The Market for Lemons”, and economics is now different.

    http://nobelprize.org/nobel_prizes/economics/articles/akerlof/index.html

  4. MikeM, that was a terrific link to George Akerlof. In particular, Akerlof describes an example of peer-review failure: “The economists of the time felt that it would violate their methodology to consider a problem, such as the role of asymmetric information, that was out of its traditional focus.”

    “Out of our traditional focus” is a very common reason for rejection, not only in academic publishing, but also in business, and politics (and if you think about it, even romance).

    I think it was Marvin Minsky, in Society of Mind, who pointed out how very necessary it is, that human cognition has strong censorship mechanisms, operating largely on the preconscious level, that reject ideas that don’t match preconceptions.

    This is no bad thing. But the paradoxical result is that it is (sometimes) more difficult for a good idea to find an audience than a mediocre one.

    For much the same reason, mediocre relationships often are easier to initiate than good ones … with the result that it’s all too easy to find yourself embracing dull ideas *and* dull romantic partners. 🙂

  5. Nice post, I’ll just toss in my two cents.

    Opinion 1.
    My gut feeling is that peer review is good at truncating the statistical tails of quality in science. It helps as a filter against pseudo-science (most of the time), but it can also filter out ideas that are too different from the accepted norms of science (some of the time). The former papers are dross; the latter tend to be brilliant in hindsight. There are far more of the former.

    Opinion 2.
    From my conversations with many scientists, quite a few have told me that peer review has helped with some of their manuscripts. I heard a lot of stories where people were aggrieved by the decision on occasion (particularly in interdisciplinary science, where there was a tendency for reviewers to comment negatively on the areas they were not expert in), but the general trend was one where it seemed to help.

    I put this down to the ‘second reading’ effect. Writing is hard, and communicating complex ideas in a written paper is no trivial feat, particularly when one is writing in a language that is not one’s native language. I would bet that having someone who understands the content read through your paper is more than likely to yield suggestions that could help improve it, be they style, structural or content related.

    There are more important issues at hand, however, such as the sociological implications for the dissemination of knowledge, the issue of whether the current system can continue to scale (probably not), of how to retain aspects of what is good, and many more questions. I’m looking forward to your post on the future of peer review.

  6. Great read.
    See this link for more anecdotal examples of “classic reviews”.
    http://th.informatik.uni-mannheim.de/People/lucks/reject.pdf

    Also, since you have posted previously about PageRank you might consider this as a good example of “misguided research”. Despite the theoretical and intuitive support of the PageRank concept, real-world experiments have shown that PR isn’t better than simple inlink counting. However, this has been a very popular topic in WebIR research.

    http://www.daniel-lemire.com/blog/archives/2007/11/28/is-pagerank-just-good-marketing/

  7. Sérgio Nunes’ post indirectly points to another unintended consequence of peer review, which comes about as follows (humor alert!).

    (1) The least expensive kind of faculty to hire are theorem-provers. (2) The most rigorous form of peer review is theorem-checking. (3) Hence, in any academic ecosystem regulated by peer review, the theorem-provers eventually become the dominant species of professor.

    Arguably, this trend has been wonderful news for mathematics, OK news for the physical sciences, not-so-good news for engineering … and it has been an utter disaster for economics. 🙂

    Interestingly, my experience has been that academic medicine is almost wholly immune to invasion by theorem-provers. The reason is that medicine is taught by Oslerian methods of immersive apprenticeship, and physicians are not allowed to accept only patients who are easy to cure; instead, they have to treat every patient who comes in the door—including patients for whom no cure (and often no useful diagnosis) is feasible.

    This creates in academic medicine a working environment in which heuristic reasoning from incomplete information works extremely well, but theorem-proving is seldom feasible.

    You wouldn’t want to be treated by a physician who had learnt medicine the way that economists learn economics; this gives reason to doubt the existence of “one size fits all” systems of academic peer review.

  8. “There’s no sign, for example, that the lack of peer review in the 19th and early 20th century meant that scientists then were more confused than now about what results should be regarded as well established, and what should not.”

    Hm, I somehow worry about this claim. I suspect you’re right, but I wonder if you know of any historical studies pointing in this direction?

  9. Let’s suppose we agree that peer review is poor as a filter for accepting and rejecting papers, but good as a way of improving papers before publication. This suggests the following policy: (1) Reject a paper without review if it is irrelevant or clearly incompetent. (2) Review the remaining papers. (3) All of the remaining papers will be published, under the condition that all of the reviewers’ suggestions are incorporated in some way in the final version. (4) The editor gets to decide whether the reviewers’ suggestions have been adequately addressed. (5) If the author is unable to revise the paper to the satisfaction of the editor, then publish the paper with the reviewers’ comments appended to the paper and perhaps some kind of warning note at the beginning of the paper.

  10. Peter,

    in reply to your comment 3, you might like to look at the Kennefick essay I link to in my post. It’s the original source for the material at Cosmic Variance.

    And, of course, I completely agree with you that peer review sometimes improves papers quite considerably!

  11. Ian – On your opinion 1, I agree with all that you say. It does raise the question of what to do about it. Do you create a system that accepts both the dross and the brilliant gems, on the grounds that they’re very hard to tell apart (institutionally, I mean)? Or do you throw both out at the refereeing stage? My preference is probably to do the former, and then try to build good filtering tools on top so that, for the most part, I don’t see the dross. The arXiv sort of takes this approach, although the filtering still isn’t very good.

  12. Sergio,

    Thanks for the link (and the kind words).

    On the PageRank-indegree work, I’ve only looked at one paper on it. The paper wasn’t very good – it seemed to be getting the PageRank results from the IE toolbar, which is a terrible approximation to PageRank.

    I did try an experiment myself a few weeks ago – built a model web where each page has an indegree with Pareto distribution. I found only a very coarse connection between PageRank and indegree. That’s not conclusive, of course – it’s not the real web – but it led me to believe that the link between the two is probably not all that strong.
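
    For the curious, here’s a minimal sketch of one way to set up that kind of experiment, assuming networkx and scipy are available; the graph size, Pareto shape, and uniform wiring rule below are illustrative choices, not necessarily the exact model I used:

    ```python
    import networkx as nx
    import numpy as np
    from scipy.stats import spearmanr

    rng = np.random.default_rng(0)
    n = 5000  # illustrative size; the real web is vastly larger

    # Draw a Pareto-distributed in-degree for each page (shape 2.0 is an
    # illustrative, roughly web-like choice), capped so sampling works.
    in_degrees = np.minimum(rng.pareto(2.0, n).astype(int) + 1, n - 1)

    # Wire the graph: each page receives links from distinct, uniformly
    # chosen source pages (again, just one possible wiring rule).
    G = nx.DiGraph()
    G.add_nodes_from(range(n))
    for target, k in enumerate(in_degrees):
        sources = rng.choice(n, size=int(k), replace=False)
        G.add_edges_from((int(s), target) for s in sources if int(s) != target)

    # Compare PageRank against realized in-degree by rank correlation.
    pr = nx.pagerank(G, alpha=0.85)
    scores = np.array([pr[i] for i in range(n)])
    degrees = np.array([G.in_degree(i) for i in range(n)])
    rho, _ = spearmanr(degrees, scores)
    print(f"Spearman correlation, in-degree vs PageRank: {rho:.2f}")
    ```

    How strong a correlation a model like this shows depends heavily on the degree distribution and the wiring rule, which is exactly why such experiments are only suggestive about the real web.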

    I followed the second link you provided. It was interesting, although I couldn’t find the paper linked from that page, but I don’t really find it plausible that Google does this as marketing. They’ve spent hundreds of millions of dollars on PageRank – more than $300 million to acquire the license from Stanford alone. Aside from geeks and SEO people I’ve never met anyone who’s heard of PageRank, which suggests that it’s failing badly if it’s for marketing – all those people know is that Google gives them a better search experience, and they don’t much care how it does it.

  13. Michael Nielsen says: “I don’t really find it plausible that Google does [PageRank] as marketing. They’ve spent hundreds of millions of dollars on PageRank – more than $300 million to acquire the license from Stanford, alone.”

    Michael, may I recommend that you be very cautious in drawing that conclusion? For example, the 1959 novel The Tempter (which centers upon information theory, oddly enough) describes *precisely* this same business strategy, and discusses its moral, legal, and economic ramifications at considerable length, and in fascinating depth.

    Now, who would be so eccentric as to write an entire novel about information theory? Why … it was Norbert Wiener, of course!

    And who would be so eccentric as to read Wiener’s novel? Gosh … perhaps it was Larry Page and Sergey Brin! 🙂

  14. There are two overlooked points here:

    1. Peer review usually plays a role in tenure bids. At some colleges and universities, particularly those that emphasize teaching, one frequently just needs to produce material demonstrating that one is active in one’s field. Others actually require publications in peer-reviewed journals (often a specific number of such publications is required). Still others go so far as to inspect impact factor.

    2. Regardless of peer review, the quality and/or impact of a paper is often not immediately apparent. This is why so many ideas have been “rediscovered” while others, assumed to be correct, are later proven to be flawed.

    Personally, I think the biggest problem with peer review is that the community

    a) needs to be more open to new and challenging ideas, i.e. less conservative, which means rejecting papers that do no more than make incremental changes while accepting those that challenge existing assumptions; and

    b) is far too insular. While the Einstein example is certainly a counterexample, and I have no hard data to back up my gut feeling on this, it seems to me that well-known authors, or those from well-known research groups, likely have an easier time with peer review. I mean, it takes some serious cojones to reject Einstein, and that makes me think it’s a rare occurrence. While a double-blind process might help in this instance, that becomes more difficult in the age of the arXiv.

  15. I tend to liken peer review to a spam filter when discussing its role, capabilities and limitations. Like a spam filter, peer review is there to improve the signal-to-noise ratio. It can fail in either direction, letting poor papers slip through or rejecting good ones. Configuring the filter is up to the individual recipient (i.e. journal or conference): one would have a hard time trying to get a chemistry paper published in a journal on economics even if the paper was clearly worth a Nobel Prize.

  16. I forgot to mention the other day that there is one historical inaccuracy in your original post. In a certain sense, peer review has been around since the seventeenth century, albeit in slightly different form.

    Beginning in the seventeenth century (or perhaps a little earlier – I’m not positive), what would now be equivalent to publishing in a premier journal (Physical Review, Nature, JAMA, etc.) consisted of having one’s paper presented to one of the learned societies of the time. The historical ‘remnants’ of this process are still evident in the names of several of the Royal Society’s journals, notably “Proceedings of…” and “Notes and Records of…”

    This process always involved some sort of peer review and often suffered from the same problems peer review still suffers from. Two of the most notable examples of this were the nearly simultaneous attempts by Abel and Galois to have their work (the origins of group theory) communicated to the French Academy of Sciences. Both were unsuccessful (at least in their short lifetimes) and, in fact, Galois received what amounted to a referee’s report dated July 4, 1831 that rejected one of the foremost papers in the history of mathematics. The referees were Poisson and Lacroix. Abel was simply ignored (again, until he was dead).

    Ironically, though neither knew the other, they were working on the same problem nearly simultaneously and had trouble not just with the same society, but the same people. At one point or another both ran into the immense ego of Cauchy in their attempts to be recognized by the Academy.

    Mario Livio’s book, despite its “popular” title (The Equation That Couldn’t Be Solved: How Mathematical Genius Discovered the Language of Symmetry), represents very serious scholarly research into the history of group theory and, particularly, Abel and Galois themselves. There is also a book out on the history of learned societies that I have not yet read, but, thanks to this post, will make my next read. I don’t recall the title and it is on the bookshelf in my office.

  17. Ian – There’s no inaccuracy in the post. As noted in my post, a system very similar to the modern peer review system was already in use in 1731; I’m also familiar with some of the earlier procedures, which are sufficiently dissimilar to the modern system (and quite ad hoc) that I decided they didn’t merit inclusion.

  18. Well, maybe so, but then I think your heading “Myth number 1: Scientists have always used peer review” is misleading. Are we talking here about modern scientists and, if so, when do you claim modern science came into being? Most historians of science peg modern science as coming into being at the tail end of the sixteenth century (at least in Western terms – see Jim Cushing’s book Philosophical Concepts in Physics: The Historical Relation between Philosophy and Scientific Theories). Even by the date you yourself give (1731) a peer review system has existed for two-thirds of the time of modern science. And I would encourage you to take another look at the cases of Abel and Galois (I cannot, for the life of me, locate the exact quote, but Livio specifically mentions peer review in his discussion).

    I do stand corrected on one thing, however. Galois was aware of one of Abel’s results from Ferrusac’s Bulletin. However, Galois’ first paper was published about the time Abel died even though his work was carried out about the same time Abel was doing the same.

  19. “Always” implies it’s widely used. (You can quibble with this interpretation, but it’s very clear in the post that’s what I mean). The point is that peer review wasn’t widespread until well into the 20th century.

  20. The point is that peer review wasn’t widespread until well into the 20th century.

    Perhaps this is quibbling, but I’d like to see some data that backs that up (incidentally, the reason I am quibbling about this is that my PhD is in the History of Mathematics & Physics and I have studied this extensively). You are correct if your definition of peer review is the process employed by Physical Review and similar journals (I would argue that Nature and Science do not have the same process even to this day since the editors exert more control – to prove this I would need access to their records, however, which is not likely to happen).

    However, since different journals have different processes, I would argue that peer review in the sense that referees are sought to review a paper before it is “communicated” via official means, has been widespread since at least the late-eighteenth century. Again, more data would be helpful to this assertion (and such an undertaking would constitute a major research project I would think), but I would go so far as to claim that, in the broader sense, peer review is part of what defines modern science.

    Even with Einstein’s papers, I would be curious to know how many of his papers were “communicated” to the publishing journal by someone else. In truth, I do not know how the German journals worked at the time. I am most familiar with the British and French journals. Nonetheless, in some cases the “communication” of a paper to a journal can represent peer review (though it is difficult to tell in any individual instance).

    While I agree with some of your conclusions on this point, I really find the heading misleading since it implies, intentionally or not, that there was little or no peer-oriented oversight (if that’s the right word) to the scientific process prior to the twentieth century when, in fact, it was ubiquitous for quite some time, just perhaps in a different form.

    Take a similar situation as an example of what I’m trying to get at. A glance at many papers from the early twentieth century will reveal that the act of including references or footnotes was spotty at best (this gave me quite a bit of consternation when completing my PhD and a paper a couple of years ago since following the trail of an idea proved immensely difficult). But we clearly know via correspondence, conference proceedings, and other sources that many of these authors built on each others’ work. So it would be erroneous to conclude that scientists worked nearly independently back then.

    In short, all I’m saying is that the title of Myth 1 is a misleadingly strong statement. The history of science has as many nuances and subtleties as science itself.

  21. Well, I’m a few days late to the game here, due to my vacation…

    I have had an instinctive dislike for the anonymous peer review system for as long as I can remember in my short career as a scientist (say 10-ish years). I am really happy to have found out recently that many others share this dislike and am still catching up with what people are saying. Some really great conversation above, but I did want to just put in a couple comments:

    * I am really looking forward to your upcoming post on the future of peer review.

    * I personally feel (intuitively, having done no research) that a system of fully-published, non-anonymous peer review would be much more effective than what we have now. 10 years ago it was probably not feasible, but I think now we have what we need as far as the technology for enabling it. I do hope this is part of your plan for the future.

    * One thing that has really bothered me, and would be solved by a fully open system, is that I have been frustrated in the past at being unable to give credit to very good ideas from reviewers. It is unfair and silly that I have not been able to credit ideas that referees have contributed to make my publications stronger.

  22. Steve – As regards open peer review, I think this is very much an all-or-nothing affair. Many experiments with open peer review have been tried (I discuss some here). They have a lot of trouble getting people to participate. I think this is in part because of fear of retribution in other anonymous forums. Why criticise someone’s work publicly today, when tomorrow they may be the anonymous referee of your work? Without a universal switch, it’s hard to see this happening, because the local difficulty of switching is very high.

    I know of one specific instance where an anonymous referee disclosed their identity to an author whose work they had rejected; the results were not positive for anyone involved.

    I do think we can move towards a more open system of peer review, but through more indirect steps – I’ll describe this in my future post.

  23. I know of one specific instance where an anonymous referee disclosed their identity to an author whose work they had rejected; the results were not positive for anyone involved.

    Actually I know another case in which it turned out reasonably well (meaning there was no animosity) but I can definitely see how this might be the exception.

    The thing is that there are times when you have a pretty good guess who a referee is anyway (particularly if you sent in a list of suggestions with your paper).

    Incidentally, I find that particular practice (of sending suggestions) a little dubious. It works if authors are generally honest and don’t suggest friends and collaborators, but I’m sure there must have been papers that have squeaked through because the author suggested someone he/she knew would approve it and the editor didn’t pay careful enough attention to the paper to override the referee’s report.

  24. Widening the scope of the discussion slightly, an associated problem is that “peer reviewed” becomes established as meaning “true” in all the aspirant disciplines suffering from physics envy – psychology, the therapies, health promotion, disability studies, etc etc – where the prerequisites for effective review are much thinner.

    Among other problems, the mechanism that prevails is that
    (a) practitioners seeking tenure want to have their articles published;
    (b) the primary journal in the field fills up and overflows;
    (c) someone starts up a new journal to catch the overflow;
    (d) there are, initially, not enough good articles to fill this new journal, and not enough good reviewers and editors;
    (e) articles are published that are below publishable quality;
    (f) these inferior articles gain the status won for refereed articles by Nature and Science.

  25. Excellent reading. A study of peer review is also given in the following report:

    http://www.canonicalscience.org/publications/canonicalsciencereports/20082.html

    Tom Jefferson’s conclusion about peer review was that it is “completely useless at detecting research fraud”.

    The above report also includes a list of thirty-four Nobel Laureates whose awarded work was rejected by peer review. Here are some excerpts:

    “The 1996 Nobel Prize in Physics was awarded to DAVID MORRIS LEE, DOUGLAS DEAN OSHEROFF, and ROBERT COLEMAN RICHARDSON for the discovery of superfluid Helium. Their key paper was rejected by the reviewers of the journal Physical Review Letters. One reviewer argued that the system «cannot do what the authors are suggesting it does».”

    “WILLIAM NUNN LIPSCOMB received the 1976 Nobel Prize in Chemistry for his studies on the structure of boranes. In an interview, LIPSCOMB recalled how the Journal of the American Chemical Society rejected the first manuscript in which he used the concept of pseudorotation to explain the structure of a boron hydride. Another manuscript in which he showed that p-dithiin was V-shaped was also rejected by the Journal of Organic Chemistry.”

    ROSALYN YALOW described how her Nobel-prize-winning paper was received by the journals, in the following terms:

    “In 1955 we submitted the paper to Science…. The paper was held there for eight months before it was reviewed. It was finally rejected. We submitted it to the Journal of Clinical Investigations, which also rejected it.”

  26. I believe the main problem with peer review, at least as practiced in the physical sciences, is the anonymity of the reviewer, which opens up the possibility of abuse. This is especially what young authors often encounter, and it is not helping science.

  27. Sorry for this to come so late after the discussion started, but a few remarks dropped in this debate need to be rectified no matter how late it happens, because otherwise they risk cementing wrong reasoning for a long time. By this I mean John Sidles’ statement in his very first comment on Jan 8, 2009, namely “… by and large peer review works amazingly well … in part because there is no review so bad that it can’t be used to help make an article better. This evolutionary process (although painful) causes better articles to emerge even from bad reviews.”
    With due respect, this argument reminds me of some Western leftwingers’ belated attempts to find something positive in the Soviet communist regimes that collapsed in the late 1980s. I still remember them acknowledging reluctantly that the recently failed system toward which they had preached appeasement was utterly unbearable for the people subject to it. Still, in a last attempt to justify the alliance with a wrong faith, they added: “But at least you had free healthcare!” (never mind that it neither cared nor cured).
    I see a lot of similarity here with John Sidles’ words, trying to defend the indefensible. Please, scientific papers are not written to be literary masterpieces! And looking at the ACCEPTED papers, one sees immediately that literary masterpieces do not emerge that way, no matter how obediently the authors honor referees’ suggestions. Papers are supposed to disseminate knowledge and innovation as quickly and efficiently as possible, period. Occasionally, dumb and inane referee comments help the author find better words to eliminate one – surely not every – source of misunderstanding, but there are much better (quicker, more thorough) ways to achieve the same. Think of presentations at seminars and conferences. This is quick and reliable! By contrast, after a couple of years spent trying to please an anonymous “peer” who does not have to be a peer at all, the whole work one did can turn out useless, coming too late to make a difference.
    Let us not fool ourselves: what really decides a paper’s destiny is not the referee report but the editor’s preference.

  28. Dear All:

    A colleague has drawn my attention to this discussion, and I would like to draw your attention to advanced forms of peer review that resolve most of the concerns and issues raised above.

    Since 2001, the interactive open access journal Atmospheric Chemistry and Physics (ACP, http://www.atmospheric-chemistry-and-physics.net) and over a dozen sister journals published by the European Geosciences Union (EGU, http://www.egu.eu) have demonstrated how the efficiency of scientific communication and quality assurance can be enhanced by a two-stage process of publication, combining public peer review with interactive discussion.

    These interactive open access journals are by most if not all standards and statistical indicators more successful than comparable traditional journals, and the same or similar concepts of public peer review and discussion have recently also been adopted in other scientific disciplines (economics, life sciences, etc.). For more information see the web pages and references listed below.

    With best regards,
    Uli Pöschl

    http://www.atmospheric-chemistry-and-physics.net/index.html

    http://www.atmospheric-chemistry-and-physics.net/general_information/public_relations.html

    http://www.atmospheric-chemistry-and-physics.net/pr_acp_poschl_liber_quarterly_2010_interactive_open_access_publishing.pdf

  29. @Ulrich, thank you for your contribution; that is exactly the kind of journal I hoped to find. Here is what I consider to be one of its most important attributes:

    “foster and provide a lasting record of scientific discussion;”

    I really don’t believe it serves the interest of science to completely, flat-out reject a scientific study, or to retract a study, even if it is at the request of the original author. I like the idea of keeping anonymity for the sake of removing social pressure on the reviewer, but the criticisms of the study need to remain open and transparent so that the public can evaluate the review. Should we change our scientific assumptions further down the road, it would also be valuable to be able to go back to historical studies that were not well received and be able to say “hey, wait a minute, that study was not accepted by the public for these concerns, which have now been debunked, so the study now holds more significance”.

    Ultimately, it is the entire body of science that establishes our opinion of which studies most likely reflect reality; and this is the only truly reliable way to limit human bias. Consider this: the fundamental scientific principle of reproducibility is so cherished because we have acknowledged the danger of human bias, and because bias can only be reliably removed through such a principle. Well… non-transparent peer review undermines this tenet by refusing to allow certain studies to stand up to the fundamental, unbiased test of reproducibility. You are essentially trading a fundamental scientific principle for a biased process.

  30. Also, for the same reason that it is valuable to go back and reevaluate the criticisms of historical scientific studies that were not accepted, it could be equally valuable to go back and reevaluate the studies themselves. For instance, if a study could not be reproduced, this could be due to a particular detail of that study that was overlooked by later studies. It then becomes highly valuable to reevaluate past studies in science, as much as it is valuable to reevaluate our basic assumptions in science.

    The idea that we should “purge” scientific works from the body of our scientific knowledge, I find to be an idea which is offensive to the core values of science.
