Connecting scientists to scientists

I’ve been struggling for some time with a writing problem. This is the problem of finding a really sharp way of conveying one of the most powerful ideas of open science: all the untapped creative potential existing in latent connections between scientists, and which could be released using suitable tools to activate the most valuable of those latent connections. I’ve discussed this idea in previous essays, but something was always lacking. In this post I take another shot at it, this time confronting the problem head on.

A fact of any scientist’s life is that you carry a lot of unsolved problems around in your head. Some of those problems are big (“find a quantum theory of gravity”), some of them are small (“where’d that damned minus sign disappear in my calculation?”), but all are grist for future progress. Mostly, it’s up to you to solve those problems yourself. If you’re lucky, you might also have a few supportive colleagues who can sometimes help you out.

Very occasionally, though, you’ll solve a problem in a completely different way. You’ll be chatting with a new acquaintance, when one of your problems (or something related) comes up. You’re chatting away when all of a sudden, BANG, you realize that this is just the right person to be talking to. Maybe they can just outright solve your problem. Or maybe they give you some crucial insight that provides the momentum needed to vanquish the problem.

Every working scientist recognizes this type of fortuitous serendipitous interaction. The problem is that they occur too rarely.

A few years ago, I started participating in various open source forums. Over time, I noticed something surprising going on in the healthiest of those forums. When people had a problem that was bugging them, rather than keeping silent about it, they’d post a description of the problem to the forum. Often, I’d look at their question and think to myself “yeah, I can see why they posted, that looks like a tough problem.” Then, forty minutes later, someone would come in and say “Oh, that’s easy, you just do X, Y, and Z”. Very often, X, Y and Z were quite ingenious, or at the least relied on knowledge that neither I nor the original questioner possessed. The original problem had been trivial all along.

What’s going on is similar to the fortuitous scientific exchange. A problem that’s difficult or impossible for most people can be trivial or routine to just the right person. But what was interesting and surprising about the open source forums was this: it seemed to be happening all the time. People who I’d never heard of would pop up, ask an interesting question, then someone else I’d never heard of would pop up, and provide an insightful answer. It didn’t happen every time, but it was happening over and over again.

A big “aha!” moment for me occurred when I understood what was going on. By scaling up the creative conversation, those open source projects were providing a systematic mechanism that enabled people to find other people with just the right expertise to make their problem easy. Most of us spend much of our time stymied by problems that would be routine, if only we could find the right person to help us. As recently as 20 years ago, finding that right person was likely to be difficult. But what open source forums show is that it is possible to scale up conversation in this way, and significantly increase the likelihood of such serendipitous interaction.

Needless to say, scientists mostly don’t work this way. Many skeptics of open science say they never could, that scientists will forever be unwilling to share their problems and ideas in the way necessary to make this work. For the present post, it’s fine if you hold that position, for my purpose here isn’t to discuss the practicality of doing this. That’s a post for another day.

The question I’m concerned with is, instead, what is lost because we don’t do this? How much do we lose because so many scientists waste their time struggling with problems that some other scientist would find entirely routine?

I don’t know how to answer these questions quantitatively. What I do know is that as a practicing scientist, much of my time was spent working on problems that were hard for me, yet which I absolutely knew would be routine for someone else. The time I spent working on such problems was time lost to the whole scientific enterprise. Yet the tools and culture of science were such that I couldn’t easily outsource those problems to a person with a comparative advantage over me. When I talk about topics like restructuring expert attention, collaboration markets and open source research, this is what I’m talking about: tools and norms which allow us to trade in expert attention, and so to concentrate in areas where we have a comparative advantage.

Now, there are many caveats to this story. Most open source projects fail. Many problems – including many of the “big problems” of science – are intrinsically non-routine, and it may be extremely difficult to identify who (if anyone) has a comparative advantage in solving such problems. Furthermore, even for routine problems, there may be considerable intrinsic transaction costs associated with trade in expert attention – finding a common language, coming to a common understanding of the problem, and so on. The market for a problem may be thin (“find the screwdriver yourself!”) – for example, many of the problems facing benchtop experimentalists are problems exclusive to their own laboratories. Finally, finding ways to successfully scale up scientific conversation is not at all trivial. These are all important caveats, deserving extended discussion in their own right. Despite this, I believe the key idea – developing tools to aggregate information about comparative advantage, and to connect people who might benefit from a trade in attention – is worth taking seriously.

I started this post off with a discussion of the difficulty of describing what I believe is a latent potential for discovery within the scientific community. As I finish the post off, I must say that the post falls short of the strength and sharpness I’d like. What’s really needed is a detailed example that shows the mechanics of open source in action: how the dynamic division of labour actually works in a successful open source project. At present, so far as I’m aware, there are no really successful examples within science; the culture of science remains too closed. There are, however, some extremely encouraging nascent examples, like open notebook science and open source biology, and one day hopefully these and others will bloom.

Further reading:

Eric Raymond’s classic essay The Cathedral and the Bazaar and Yochai Benkler’s article Coase’s Penguin both contain illuminating discussions of open source, developing many of the ideas discussed in this post much further. The best way to understand open source, though, is to get actively involved in one of the many healthy projects. Cameron Neylon has a cautionary post (see also here) about the difficulty of building effective networking tools for scientists. Many people have written about ways of better connecting scientists – I especially enjoyed Shirley Wu and Cameron Neylon’s essays on the subject.

I’m writing a book about “The Future of Science”; this post is part of a series where I try out ideas from the book in an open forum. A summary of many of the themes in the book is available in this essay. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. I’ll email you to let you know in advance of publication. I will not use your email address for any other purpose! You can subscribe to my blog here.

20 comments

  1. Hi Michael,

    Great post. I’m still not sold on the “trade in expert attention” idea. But I do think this “how to pose problems openly” and “how to scan for where my attention has high benefit/effort ratio” is a crucial question to look into.

    Best
    Anders

  2. Anders,

    I’d be better off with a different verb or a different phrase, for it’s not really just expert attention that is being traded. What’s being traded is much more complex, involving the problem that initiates the collaboration, the expertise involved in solving the problem, and then some kind of reputation from whatever results. I haven’t thought of a better phrase than “trading in expert attention”, though.

  3. Hi Michael,

    In your December 29, 2008 post, we batted around the Manhattan project a bit as an example of wildly successful open science (albeit only open inside of Los Alamos). And John Sidles mentioned others, including the human genome project. John pointed me to an article “Who Was J. Robert Oppenheimer? Charisma and Complex Organization” by Charles Thorpe and Steven Shapin. I haven’t read much of it yet, but towards the beginning, there are all sorts of quotations from those famous scientists related to how incredibly fun it was to be part of that incredibly open community.

    So, I don’t understand what kind of open examples you’re looking for? I would think that studying the Manhattan project or human genome project in terms of openness would be quite illuminating. Particularly in regards to hypothesizing about what kind of collaboration technology available to us today would have improved those projects–would it have been possible to speed the progress of the Manhattan project very much with our fancy collaborative tools? How much slower would it have been if people had to communicate with the best electronic tools we have today?

    Also, I’m trying to form some ideas I’ve been having about what is “spent” (or invested) by openness. I think the amount invested for the same act of openness (e.g. open notebook science) can vary a huge amount between individuals. I’m starting to think it’s possible that for some people the cost of being open is too high and the returns too low to make it “profitable.” That is, it’s possible many people are innately not cut out for it. Or, more optimistically, there need to be a variety of different mechanisms for being open, so the scientists can find the “lowest cost” option for themselves.

    A hypothetical example of what I am trying to talk about: it seems quite plausible that there exists an excellent closed scientist who would have his creativity crippled by being forced to keep his notebook public. Simply due to his innate, pretty much unchangeable nature. He could reduce his investment costs in openness, possibly, by publishing all of his notebooks after “finishing” his project. This would also reduce the value to open science, but it may make it appropriately “profitable” for that particular person.

    I’m particularly thinking about this in terms of how to mentor my students into openness. I don’t think forcing our entire lab into complete openness is a good idea.

  4. Michael, if you’ve never seen it, check out the Association of Internet Researchers. Hop on their mailing list and you’ll find it very similar in structure to the open source lists you mention.

    It seems like a fair amount of this goes on in places like FriendFeed, blogs, and Twitter, too.

    Maybe scientists can borrow another open source convention and start hanging out in IRC? That actually sounds pretty fun. 🙂

  5. The real problem I’ve run into with this in running a journal (http://www.cshprotocols.org) is that it is often a one-sided process with a need for an expedient answer. To explain–the journal publishes biology methods, and each method has its own discussion forum where readers can leave questions about performing the assay or tips on improving it. These forums are almost never used, despite a very high readership on particular articles. When questions are asked, they are almost never answered (although as the editor, I do my best to track down answers for the reader).

    I think there are two reasons why this system fails:
    1) while getting a question answered is an obvious benefit for the person doing so, there’s no obvious benefit for the answerer, other than being nice. As such, you can’t expect answerers to put in a huge amount of time and effort scanning through our thousands of articles looking for questions that they can answer. If they stumble across one they can answer, sure, great, no problem. But how likely is it that a busy researcher is going to spend hours out of his/her week scanning through queries from others to see if any fit that researcher’s knowledge set? Even if you can convince people that they should answer queries as part of their duty to science (kind of like peer-review), you’d still need a system to completely streamline the process and connect the questioner with the right answerer with minimal effort.

    2) Expediency–if I need help with a technical question, I can’t sit around and wait 6 months for someone to stumble across my question. I can’t put my research on hold for that long, I need an answer now! So I’m much better served by a directed search, probably first asking the labs around me, then through the published literature, to find someone who can answer my question. Serendipity is nice when it happens, but if you’ve got deadlines and grants to fulfill, can you count on it to get you through crucial roadblocks? There’s also the question of trust–does the person who stumbled across your problem really have the right answer–maybe not a big deal in the realm of theoretical work, but in biology, where reagents are expensive and often exceedingly rare, that’s a big risk to take, and again you might be better off with a proven source.

    If you have answers to these sorts of questions, I’d love to hear them, as I’d love to give our readers better tools for seeking advice.

  6. Very informative and interesting post. I really like the idea of finding just the right person for a particular problem – or piece of the puzzle. I think most of us scientists have more ideas than we can chase down alone. Sure we might keep particular problems/projects to ourselves – but we have so many more that if we can make the right connection, the right collaboration – we are far more likely to solve the problem and make progress in helping humanity.

    thanks for your post – look forward to more.

  7. Darius – That’s a very nice post from Chip Morningstar. Sort of a sharp version of “Show, don’t tell”. In fact, it’s the version of “Show, don’t tell” that actually shows, as opposed to merely telling.

  8. Gavin – Thanks for the pointers. What I’m looking for is examples of significant scientific discoveries that occurred through open source style techniques. I know of a few examples that come close, but no really killer example. If you know of such an example it’d be great to hear of it!

  9. Joe – Thanks for linking to that report, it looks like a very useful resource. Have you used Illumin8? Any comments? Looks like you need to use it from an organization that has a subscription.

  10. David – I don’t have a short answer. I might attempt a long answer as a post on another day – it’s a really interesting question. Let me add to the list of problems you mention:

    – fear that disclosure of research-related questions will lead to scooping. This reduces the number of questions that are asked.

    – poor community design. It’s telling that the discussion on many scientific blogs is significantly more vibrant than in many journal forums. In a healthy open source forum the discussion can be at another level still. There are enormous differences, though, in how those three types of communities are built, and I don’t think it’s so surprising a priori that two succeed, and the other hasn’t, yet.

  11. Steve – I was just looking for online examples using an open source style approach. The Manhattan Project is certainly a great example, offline – bring a large fraction of the world’s best scientists into one place, and give them a common goal! I’m not so sure about the human genome project – I need to look into it more.

    In regard to your other comments, about some people not being well suited to working in the open, I completely agree. Some difficulties will be obviated in the near future, by better tools, but others are more intrinsic. I’m particularly fascinated by the problem of the creative individual who really needs their space. For largely routine problems, I suspect they’ll simply be outcompeted by others taking a more open approach. If they are working on a big nonroutine problem, though, that may not be so true. Grothendieck has a wonderful quote about the importance of solitude in doing the best work. But I can’t find it online right now, and I’ve got to run…

  12. Michael says: “I was just looking for online examples using an open source style approach. The Manhattan Project is certainly a great example …”

    Michael, one of the regrettable ironies of the Manhattan Project is that its science is now (almost) completely open, yet its organizational history remains classified.

    For example, a key document is the February 1945 memorandum from project engineer William “Deak” Parsons titled “‘Homestretch’ Measures”. This document has never been declassified (although the correspondence leading up to it is open).

    This is regrettable, as many of the organizational principles that were conceived during the Manhattan Project are proving—in the long run—to be subtler, and have broader and farther-reaching effects, even than the weapons themselves.

    This is *not* to hold up the Manhattan Project as a laudable exemplar. The point is the opposite—some of these mistakes we should not make twice.

  13. > Every working scientist recognizes this type of
    > fortuitous serendipitous interaction. The problem
    > is that they occur too rarely.

    I coined the phrase “manufactured serendipity” years ago to suggest that serendipity is, in fact, too important to leave to chance.

    We can’t actually cause serendipity to happen, but we can absolutely create environments that make it a lot more likely.

  14. Jon – That’s a great phrase. Markets, of course, use prices as signals to allow us to determine when we have a comparative advantage in solving some problem. So they can be a great mechanism for manufacturing serendipity. It’d be nice to have tools which combined the best of open source and the best of markets. The Matlab programming competition has something of that flavour – open source, but with a clear signalling mechanism (the score).
