Doing science online

This post is the text for an invited after-dinner talk about doing science online, given at the banquet for the Quantum Information Processing 2009 conference, held in Santa Fe, New Mexico, January 12-16, 2009.

Good evening.

Let me start with a few questions. How many people here tonight know what a blog is?

How many people read blogs, say once every week or so, or more often?

How many people actually run a blog themselves, or have contributed to one?

How many people read blogs, but won’t admit it in polite company?

Let me show you an example of a blog. It’s a blog called What’s New, run by UCLA mathematician Terence Tao. Tao, as many of you are probably aware, is a Fields Medal-winning mathematician. He’s known for solving many important mathematical problems, but is perhaps best known as the co-discoverer of the Green-Tao theorem, which proved the existence of arbitrarily long arithmetic progressions of primes.

Tao is also a prolific blogger, writing, for example, 118 blog posts in 2008. Popular stereotypes to the contrary, he’s not just sharing cat pictures with his mathematician buddies. Instead, his blog is a firehose of mathematical information and insight. To understand how valuable Tao’s blog is, let’s look at an example post, about the Navier-Stokes equations. As many of you know, these are the standard equations used by physicists to describe the behaviour of fluids, i.e., inside these equations is a way of understanding an entire state of matter.

The Navier-Stokes equations are notoriously difficult to understand. People such as Feynman, Landau, and Kolmogorov struggled for years attempting to understand their implications, mostly without much success. One of the Clay Millennium Prize problems is to prove the existence of a global smooth solution to the Navier-Stokes equations, for reasonable initial data.

Now, this isn’t a talk about the Navier-Stokes equations, and there’s far too much in Terry Tao’s blog post for me to do it justice! But I do want to describe some of what the post contains, just to give you the flavour of what’s possible in the blog medium.

Tao begins his post with a brief statement explaining what the Clay Millennium Problem asks. He shares the interesting tidbit that in two spatial dimensions the solution to the problem is known(!), and asks why it’s so much harder in three dimensions. He tells us that the standard answer is turbulence, and explains what that means, but then says that he has a different way of thinking about the problem, in terms of what he calls supercriticality. I can’t do his explanation justice here, but very roughly, he’s looking for invariants which can be used to control the behaviour of solutions to the equations at different length scales. He points out that all the known invariants give weaker and weaker control at short length scales. What this means is that the invariants give us a lot of control over solutions at long length scales, where things look quite regular, but little control at short length scales, where you see the chaotic variation characteristic of turbulence. He then surveys all the known approaches to proving global existence results for nonlinear partial differential equations – he says there are just three broad approaches – and points out that supercriticality is a pretty severe obstruction if you want to use one of these approaches.

The post has loads more in it, so let me speed this up. He describes the known invariants for the equations, and what they can be used to prove. He surveys and critiques existing attempts on the problem. He makes six suggestions for ways of attacking the problem, including one which may be interesting to some of the people in this audience: he suggests that pseudorandomness, as studied by computer scientists, may be connected to the chaotic, almost random behaviour that is seen in the solutions to the Navier-Stokes equations.

The post is filled to the brim with clever perspective, insightful observations, ideas, and so on. It’s like having a chat with a top-notch mathematician, who has thought deeply about the Navier-Stokes problem, and who is willingly sharing their best thinking with you.

Following the post, there are 89 comments. Many of the comments are from well-known professional mathematicians, people like Greg Kuperberg, Nets Katz, and Gil Kalai. They bat the ideas in Tao’s post backwards and forwards, throwing in new insights and ideas of their own. The post also spawned posts on other mathematical blogs, where the conversation continued.

That’s just one post. Terry Tao has hundreds of other posts, on topics like Perelman’s proof of the Poincaré conjecture, quantum chaos, and gauge theory. Many posts contain remarkable insights, often related to open research problems, and they frequently stimulate wide-ranging and informative conversations in the comments.

That’s just one blogger. There are, of course, many other top-notch mathematician bloggers. Cambridge’s Tim Gowers, another Fields Medallist, also runs a blog. Like Tao’s blog, it’s filled with interesting mathematical insights and conversation, on topics like how to use Zorn’s lemma, dimension arguments in combinatorics, and a thought-provoking post on what makes some mathematics particularly deep.

Alain Connes, another Fields Medallist, is also a blogger. He only posts occasionally, but when he does his posts are filled with interesting mathematical tidbits. For example, I greatly enjoyed this post, where he talks about his dream of solving one of the deepest problems in mathematics – the problem of proving the Riemann Hypothesis – using non-commutative geometry, a field Connes played a major role in inventing.

Berkeley’s Richard Borcherds, another Fields Medallist, is also a blogger, although he is perhaps better described as an ex-blogger, as he hasn’t updated in about a year.

I’ve picked on Fields Medallists, in part because at least four of the 42 living Fields Medallists have blogs. But there are also many other excellent mathematical blogs, including blogs from people closely connected to the quantum information community, like Scott Aaronson, Dave Bacon, Gil Kalai, and many others.

Let me make a few observations about blogging as a medium.

It’s informal.

It’s rapid-fire.

Many of the best blog posts contain material that could not easily be published in a conventional way: small, striking insights, or perhaps general thoughts on approach to a problem. These are the kinds of ideas that may be too small or incomplete to be published, but which often contain the seed of later progress.

You can think of blogs as a way of scaling up scientific conversation, so that conversations can become widely distributed in both time and space. Instead of just a few people listening as Terry Tao muses aloud in the hall or the seminar room about the Navier-Stokes equations, why not have a few thousand talented people listen in? Why not enable the most insightful to contribute their insights back?

You can also think of blogs as a way of making scientific conversation searchable. If you type “Navier-Stokes problem” into Google, the third hit is Terry Tao’s blog post about it. That means future mathematicians can easily benefit from his insight, and that of his commenters.

You might object that the most important papers about the Navier-Stokes problem should show up first in the search. There is some truth to this, but it’s not quite right. Rather, insofar as Google is doing its job well, the ranking should reflect the importance and significance of the respective hits, regardless of whether those hits are papers, blog posts, or some other form. If you look at it this way, it’s not so surprising that Terry Tao’s blog post is near the top. As all of us know, when you’re working on a problem, a good conversation with an insightful colleague may be worth as much as (and sometimes more than) reading the classic papers. Furthermore, as search engines become better personalized, the search results will better reflect your personal needs; in a search utopia, if Terry Tao’s blog post is what you most need to see, it’ll come up first, while if someone else’s paper on the Navier-Stokes problem is what you most need to see, then that will come up first.

I’ve started this talk by discussing blogs because they are familiar to most people. But ideas about doing science in the open, online, have been developed far more systematically by people who are explicitly doing open notebook science. People such as Garrett Lisi are using mathematical wikis to develop their thinking online; Garrett has referred to the site as “my brain online”. People such as chemists Jean-Claude Bradley and Cameron Neylon are doing experiments in the open, immediately posting their results for all to see. They’re developing ideas like lab equipment that posts data in real time, posting data in formats that are machine-readable, enabling data mining, automated inference, and other additional services.

Stepping back, what tools like blogs, open notebooks and their descendants enable is filtered access to new sources of information, and to new conversation. The net result is a restructuring of expert attention. This is important because expert attention is the ultimate scarce resource in scientific research, and the more efficiently it can be allocated, the faster science can progress.

How many times have you been obstructed in your research by the need to prove or disprove a small result that is a little outside your core expertise, and so would take you days or weeks, but which you know, of a certainty, the right person could resolve in minutes, if only you knew who that person was, and could easily get their attention? This may sound like a fantasy, but if you’ve worked on the right open source software projects, you’ll know that this is exactly what happens in those projects – discussion forums for open source projects often have a constant flow of messages posing what seem like tough problems; quite commonly, someone with a great comparative advantage quickly posts a clever way to solve the problem.

If new online tools offer us the opportunity to restructure expert attention, then how exactly might it be restructured? One of the things we’ve learnt from economics is that markets can be remarkably effective ways of efficiently allocating scarce resources. I’ll talk now about an interesting market in expert attention that has been set up by a company named InnoCentive.

To explain InnoCentive, let me start with an example involving an Indian not-for-profit called the ASSET India Foundation. ASSET helps at-risk girls escape the Indian sex industry, by training them in technology. To do this, they’ve set up training centres in several large cities across India. They’ve received many requests to set up training centres in smaller towns, but many of those towns don’t have the electricity needed to power technologies like the wireless routers that ASSET uses in its training centers.

On the other side of the world, in the town of Waltham, just outside Boston, is the company InnoCentive. InnoCentive is, as I said, an online market in expert attention. It enables companies like Eli Lilly and Procter & Gamble to pose “Challenges” over the internet, scientific research problems they’d like solved, with a prize for solution, often many thousands of dollars. Anyone in the world can download a detailed description of the Challenge, and attempt to win the prize. More than 160,000 people from 175 countries have signed up for the site, and prizes for more than 200 Challenges have been awarded.

What does InnoCentive have to do with ASSET India? Well, ASSET got in touch with the Rockefeller Foundation, and explained their desire for a low-cost solar-powered wireless router. Rockefeller put up $20,000 in prize money to post an InnoCentive Challenge to design a suitable wireless router. The Challenge was posted for two months at InnoCentive. 400 people downloaded the Challenge, and 27 people submitted solutions. The prize was awarded to a 31-year-old Texan software engineer named Zacary Brown, who delivered exactly the kind of design that ASSET was looking for; a prototype is now being built by engineering students at the University of Arizona.

Let’s come back to the big picture. These new forms of contribution – blogs, wikis, online markets and so forth – might sound wonderful, but you might reasonably ask whether they are a distraction from the real business of doing science. Should you blog, as a young postdoc trying to build up a career, rather than writing papers? Should you contribute to Wikipedia, as a young Assistant Professor, when you could be writing grants instead? Crucially, why would you share ideas in the manner of open notebook science, when other people might build on your ideas, maybe publishing papers on the subjects you’re investigating, but without properly giving you credit?

In the short term, these are all important questions. But I think a lot of insight into these questions can be obtained by thinking first of the long run.

At the beginning of the 17th century, Galileo Galilei constructed the first astronomical telescope, looked up at the sky, and turned his new instrument to Saturn. He saw, for the first time in human history, Saturn’s astonishing rings. Did he share this remarkable discovery with the rest of the world? He did not, for at the time that kind of sharing of scientific discovery was unimaginable. Instead, he announced his discovery by sending a letter to Kepler and several other early scientists, containing a Latin anagram, “smaismrmilmepoetaleumibunenugttauiras”. When unscrambled this may be translated, roughly, as “I have discovered Saturn three-formed”. The reason Galileo announced his discovery in this way was so that he could establish priority, should anyone after him see the rings, while avoiding revealing the discovery.

Galileo could not imagine a world in which it made sense for him to freely share a discovery like the rings of Saturn, rather than hoarding it for himself. Certainly, he couldn’t share the discovery in a journal article, for the journal system was not invented until more than 20 years after Galileo died. Even then, journals took decades to establish themselves as a legitimate means of sharing scientific discoveries, and many early scientists looked upon journals with some suspicion. The parallel to the suspicion many scientists have of online media today is striking.

Think of all the knowledge we have, which we do not share. Theorists hoard clever observations and questions, little insights which might one day mature into a full-fledged paper. Entirely understandably, we hoard those insights against that day, doling them out only to trusted friends and close colleagues. Experimentalists hoard data; computational scientists hoard code. Most scientists, like Galileo, can’t conceive of a world in which it makes sense to share all that information, in which sharing information on blogs, wikis, and their descendants is viewed as being (potentially, at least) an important contribution to science.

Over the short term, things will only change slowly. We are collectively very invested in the current system. But over the long run, a massive change is, in my opinion, inevitable. The advantages of change are simply too great.

There’s a story, almost certainly apocryphal, that the physicist Michael Faraday was approached after a lecture by Queen Victoria, and asked to justify his research on electricity. Faraday supposedly replied “Of what use is a newborn baby?”

Blogs, wikis, open notebooks, InnoCentive and the like aren’t the end of online innovation. They’re just the beginning. The coming years and decades will see far more powerful tools developed. We really will enormously scale up scientific conversation; we will scale up scientific collaboration; we will, in fact, change the entire architecture of expert attention, developing entirely new ways of navigating data, making connections and inferences from data, and making connections between people.

When we look back at the second half of the 17th century, it’s obvious that one of the great changes of the time was the invention of modern science. When historians look back at the early part of the twenty-first century, they will also see several major changes. I know many of you in this room believe that one of those changes will be related to the sustainability of how humans live on this planet. But I think there are at least two other major historical changes. The first is the fact that this is the time in history when the world’s information is being transformed from an inert, passive, widely separated state, and put into a single, unified, active system that can make connections, that brings that information alive. The world’s information is waking up.

The second of those changes, closely related to the first, is that we are going to change the way scientists work; we are going to change the way scientists share information; we are going to change the way expert attention itself is allocated, developing new methods for connecting people, for organizing people, for leveraging people’s skills. They will be redirected, organized, and amplified. The result will speed up the rate at which discoveries are made, not in one small corner of science, but across all of science.

Quantum information and computation is a wonderful field. I was touched and surprised by the invitation to speak tonight. I have, I think, never felt more honoured in my professional life. But, I trust you can understand when I say that I am also tremendously excited by the opportunities that lie ahead in doing science online.

Further reading

I’m writing a book about “The Future of Science”; this post is part of a series where I try out ideas from the book in an open forum. A summary of many of the themes in the book is available in this essay. If you’d like to be notified when the book is available, please send a blank email to the.future.of.science@gmail.com with the subject “subscribe book”. I’ll email you to let you know in advance of publication. I will not use your email address for any other purpose! You can subscribe to my blog here.

52 comments

  1. Hi Michael –

    Thanks for the shout-out! The ASSET India Challenge is a great example of finding solutions in unexpected places – and changing the world. We are very happy to have ASSET as one of our Seekers. If you ever have questions about InnoCentive, please feel free to drop me a note.

    Connie French
    Enterprise Marketing Manager, InnoCentive

  2. Great post, as always. I very much appreciate the high quality and eminent readability of what you write.

    I’m curious: has anyone pulled together a “current state of affairs” summary of the ways people are doing science online, including the currently available tools, their respective pros and cons, and how they are being used effectively in research? I’ve seen a few tools and example projects mentioned here and there, but usually the same ones come up. Has there been any attempt to comprehensively explore what tools are available, what needs they fill, and what needs have not yet been met?

    Thanks again for writing. I’m looking forward to your book.

  3. Dear Michael, this is a very interesting post! (Of course, I am very flattered that I am mentioned as a person closely connected to the quantum information community, as this certainly reflects my feelings and aspirations; that my blog is mentioned along the very successful blogs of Dave and Scott, and that even my comment on the extraordinary Navier-Stokes post by Terry Tao was noticed. Thanks!)

    Here are a few remarks: I agree regarding the importance of the “first change” about information, and human action based on information. Information is vastly growing, accessing information is becoming vastly easier and quicker; humans’ information-based actions are becoming quick, partially automatic and highly entangled. This is a major change in our social and economic reality. (For the better?, you may ask.)

    As for the second change in the small corner of science I am more skeptical. I am not sure we witness a major change and how good it would be. Even for the Navier-Stokes equation, a very important characteristic of the effort regarding it is, as you mentioned, “people such as Feynman, Landau, and Kolmogorov struggled for years attempting to understand their implications, mostly without much success”. Now, if blogging will make our mostly solitary business of hitting our heads against the wall somehow more pleasant, this is fine; but if it will somehow come as a substitute for our traditional head-against-wall hitting ritual this will not cut it, at least as far as traditional goals of science go.

  4. A student sent me the following link: http://www.academia.edu/. Don’t know if it could be utilized for research connections or if it’s just another way like “rateyourprofessor.com” for students to see who’s who at universities around the world. But I don’t think I’ve seen anything else quite like it in terms of an inventory of who is interested in what. Participants are invited to upload papers as well, which for many, may provide another way of sharing research, especially when a project has low “publish-ability” in traditional outlets.

    I sent this post as recommended reading to a current class (thesis writing for undergrads, mostly neurosciences) who are my guinea pigs for a scaffolded approach to the science 2.0 idea. They are skeptical of blogs, skeptical of “open science” and seemingly not encountering the idea as part of their regular science classes. Now, I’m making them blog (private to the class for the time being), set up delicious accounts, and read various perspectives on the subject. Your essay on “The Future of Science” is a required reading in a couple of weeks, as are some essays by proponents of the traditional model. I’m looking forward to seeing how they process it all!

  5. Another fascinating post and one that I hope will become widely cited. I agree with Jacob, “Great post, as always. I very much appreciate the high quality and eminent readability of what you write.”

    The peroration is touching. I found this exciting also:

    “Blogs, wikis, open notebooks, InnoCentive and the like aren’t the end of online innovation. They’re just the beginning. The coming years and decades will see far more powerful tools developed. We really will enormously scale up scientific conversation; we will scale up scientific collaboration; we will, in fact, change the entire architecture of expert attention, developing entirely new ways of navigating data, making connections and inferences from data, and making connections between people.”

    As I read, I wondered whatever happened to all the blog-specific search engines that were all the rage some years ago. Have they been subsumed within the universal search of Google and its rivals?

    I had not heard of InnoCentive before and as someone who runs a free grants/scholarship listing site in the health sciences, I found that fascinating. I try very hard with ScanGrants to get the word out about sources of funding to researchers and found InnoCentive very heartening in that respect. My dream is to somehow become incredibly wealthy and sponsor prizes for research on amyotrophic lateral sclerosis, rather like Prize4Life
    http://www.prize4life.org/

    Given my interest in ALS and thus in neurodegenerative diseases, I was sorry to read Mickey’s comments, “…(thesis writing for undergrads, mostly neurosciences) who are my guinea pigs for a scaffolded approach to the science 2.0 idea. They are skeptical of blogs, skeptical of “open science” and seemingly not encountering the idea as part of their regular science classes.”

    I am a co-founder of the new blog Next Generation Science

    http://www.nextgenerationscience.com/

    and am finding that working on a blog is a wonderful way to learn. For instance, I learned of Michael’s blog because my partner at Next Generation Science Walter Jessen (cancer biologist/bioinformatician) put a link on our home page to FriendFeed Science 2.0 room. I read there a discussion about Web 2.0 technologies in science and specifically a comment there by Michael. I clicked on his name to find out more about him and now I follow this blog and have been hugely impressed by his erudition and think pieces on Science 2.0. It is from Walter that I have learned that Twitter is an outstanding tool for keeping up on Science Web 2.0 and would be most interested in Michael’s thoughts on Twitter.

    This truly is an exciting time to be alive and interested in science and Michael and his colleagues are providing a powerful counter to those who argue that blogs cannot act as fora for meaty discussion on weighty topics. I have visited this blog for only a few weeks, but have been most impressed by the trouble Michael takes to respond with precision and courtesy to comments posted here.

  6. Gil – I started to reply to your comment, but it has turned into a post all its own. It’ll be the next open science post – should appear in the next day or so!

  7. Mickey – I hope it goes well. I’m very interested to hear of your students’ skepticism. I presume they take the recent big changes (email, skype, preprints, online journals) for granted, assuming it couldn’t be any other way, but don’t see how other things could change? There’s an interesting paper by Ziman in Nature (1969, entitled “Information, communication, knowledge”) that lays out a bunch of arguments essentially pouring cold water on things like preprints, email, etc, although of course he didn’t call them that. It might provide some interesting perspective for your students.

  8. Hope – a lot of the blog-specific search engines survive. They don’t seem to have succeeded very well as businesses, though, and there’s not a lot of innovation going on, that I can see.

    Twitter! It’s absolutely, utterly fascinating, is the short answer. Gives all sorts of insights into how humans relate to technology, and to each other. The long answer is way too long – maybe grist for a future blog post! I’m not actually directly active on twitter, although I do have an account. But I do participate in a lot of twitter-derived threads on FriendFeed, which is an interesting phenomenon in its own right, standing with one foot in the Twitter universe, and one foot in the FriendFeed Universe. Not quite enough time to really be active with both feet in!

  9. Hi, Michael. I agree with you about the blog search engines. I think they have been rendered redundant by the universal search of Google. They are so 2006 or so.

    I hope you do, indeed, write a long post on Twitter. It certainly merits one and I would be most interested by your comments. I very much enjoy participating in the FriendFeed Science 2.0 room. I only wish there were more medical people in it. But it is interesting to chat with the mathematicians and physicists there, though, in order to see how Science 2.0 tools are used in various disciplines. It is useful to see what they are doing. I have enjoyed reading of Steve Koch’s doings, for instance, via FriendFeed: http://stevekochscience.blogspot.com/2009/01/update-on-our-new-open-science.html

    and I enjoy your comments there.

    I know what you mean about the press of time. I have found that there are wonderful things to read that I learn about on Twitter. This blog and this posting, for instance, are very worth reading:

    http://philbaumann.com/2009/01/18/free-ebook-140-health-care-uses-for-twitter/

    and this report I twitted about having read about it in the email newsletter of the American Library Association. That is an example of multichannel info consumption and dissemination and a case such as you have written about on the economics of sharing – I took the time to twit about the item but have not taken the time yet to print out the report to read it myself – kind of fraudulent on my part to recommend something that I have not read, but time is of the essence:

    http://connect.educause.edu/Library/ELI/2009HorizonReport/48003?time=1232508996

  10. Michael,

    Thank you for the reading recommendation – I appreciate the help! I’m not sure if an explicit “class” in science/web 2.0 is practical or not, but I think a bibliography and some practice with the basic tools should be helpful. It’s difficult to tell at this point how much student skepticism is actual skepticism of science 2.0 as a valid form of practice and how much is just overload b/c they are part of a small group of undergrads doing “hard” research and already feel overwhelmed by the demands of a full course load, 15-20 hours/week in a lab, applications & interviews to med school… and now I’m expecting them to tackle the intersection of technology, ideology, and science. Happily, their first blog entries had them responding to each other in just the sort of way I’d hoped – great comments, good questions, some exchange of sources. They have not been equally impressed by other requirements, though seemed to take on 2collab just fine (most preferred its “academic” feel to delicious; I prefer the latter, but the point was to get them to begin seeing the value of sharing work). The first time teaching something new is always an experiment – actually, Hope’s response was really useful, and I’ve included nextgenerationscience.com on their reading list b/c of the explicit, practical motivation of funding opportunities and also to show that everyone in the community starts by jumping in and figuring it out. My real hope is that I can provide a foundation that makes them more comfortable not just with the idea of doing science online, but with the skills required – there are some clear genre-specific sociolinguistic trends and genre-specific terms that do make the entrance into the online community somewhat intimidating, despite how very friendly the people are. This is another reason I appreciated Hope’s reply. Her description of what the journey’s been like is very much like mine, albeit her tech skills are way more sophisticated!
