Five problems with doing research in the open

I’m an advocate of extreme openness in doing research. I believe online tools are greatly underused by scientists, and that taking a far more open approach to the sharing of scientific ideas and data is one important step to taking full advantage of those tools.

Although there are great advantages to such openness, there are also many problems. The biggest problems, which I’ve talked about here before, are to do with lack of incentives to share information, and the relatively undeveloped state of online tools. While tools like wikis and blogs are great for some purposes, they’re just not all that well adapted to research.

This post enumerates some of the other problems with extreme openness. None are insurmountable, and none change my belief that extreme openness will be extremely valuable to the progress of science. But they seem worth writing out systematically. In the interests of brevity I’ve concentrated on listing problems, not proposing solutions, although solutions to many come readily to mind.

No-one wants someone looking over their shoulder while they work: As I draft my book, there are times when I’m best off working away in private, without interruption and distraction, and times when I could really do with some scrutiny. It’s plain irritating and disheartening to be flamed by someone for a flaw in your work that you’re already aware of, and planning to fix. A partial solution is to do research in a low-visibility but essentially open fashion. Many open source projects do this – work on the project is carried out via open but obscure channels (e.g., mailing lists), and then major releases are announced in a high-visibility channel.

Conversation doesn’t scale naively: In an ideal world, online tools will connect people with well-matched and complementary expertise. What you see instead in a lot of online conversations is experts “connected” to the rude and uninformed. I don’t know that this brings much benefit to anyone. I think this problem can be solved with improved design of the tools, and improved filtering, but it’s a real problem.

Groupthink: Original thinking that goes against conventional wisdom is not always well rewarded in the open. Commentary is often knee-jerk, rather than constructive. As an example, one of the most original thinkers I know is Robin Hanson; Robin often blogs [1] interesting ideas that contradict conventional wisdom. There are some great comments on his blog posts, but there’s also a lot of noise, and poorly thought-out responses as people see their sacred cows challenged.

Giving offence: This is really an instance of the previous problem: it’s difficult to develop ideas that may give offence in the open. I don’t think I’d want to write a research article about gun control out in the open.

Ethics and IP concerns: Sometimes, confidentiality may prevent disclosure of data, either for ethical reasons or to protect IP, or both.

[1] Robin is only one of many contributors to that blog, and you may need to scroll down to find some of Robin’s contributions.

Published

Biweekly links for 11/03/2008

Click here for all of my del.icio.us bookmarks.


Biweekly links for 10/31/2008



Oh, Canada!

I became a Canadian Permanent Resident, just a few hours ago. I should crack open the maple syrup, and down a few Tim Hortons donuts!

Update: Uh-oh. I’ve been told that I spelt donut in an un-Canadian fashion. Could be a problem here…

Categorized as Canada

Biweekly links for 10/27/2008

  • Jon Udell on the evolution of a Wikipedia article.
    • He traces, of all things, the “Heavy metal umlaut” article. It’s fascinating to see the evolution in real time.
  • Australian government trying to gag web censor critics
    • Continuing an unfortunate tradition in which both major parties in Australia make major mistakes in regard to internet policy.
  • I think I’m musing my mind – Roger Ebert’s Journal
    • Ebert on losing the ability to speak, and what writing means to him.
  • Will Algorithms Make Human Editors Obsolete? Not If Journalists Collaborate – Publishing 2.0
    • “most news site still see original content creation as their sole purpose — they don’t see the tremendous need, and the tremendous value in filtering the content that already exists. They don’t see that every link on their site is an important editorial judgment, not an afterthought, not an algorithmic process to set and forget (which often leads to algorithms making bad recommendations, as many news sites who use them will tell you).”
  • Terry Tao: Princeton Companion to Mathematics
    • The PCM is now out. Based on the bits I’ve seen, this should be an incredible resource for mathematicians. Note that there’s an “Advice to Younger Mathematicians” section which is free on the web, containing advice from people like Atiyah, Connes and many other great mathematicians. See the links in the post.
  • Kevin Kelly: Cloud Culture
    • “The war over copyright will seem tame compared to the legal battles that the life in the cloud will hatch. Who’s laws will prevail? The laws of your domicile, the laws of your server’s domicile, or the laws of international exchange? Who gets your taxes if all the work is being done in the cloud? The transparent discontinuity between legal regimes will be a threat to the expansion of the cloud. This friction will also force the growth of multiple clouds. Clouds with varying legal frameworks will compete at the global level, although within many geographical regions, there may be little choice. But the legal issues are not merely international. Who owns the data, you or the cloud? If all your email and voice calls go through the cloud, who is responsible for what it says? In the new intimacy of the cloud, when you have half-baked thoughts, weird daydreams, should they not be treated differently than what you really believe?”
  • Tech Tips for the Basic Computer User – David Pogue
    • I was surprised by how many of these I didn’t know – maybe a quarter, some of them very useful.
  • The character of a country
    • “In a connected world, countries, governments and companies also have character, and their character — how they do what they do, how they keep promises, how they make decisions, how things really happen inside, how they connect and collaborate, how they engender trust, how they relate to their customers, to the environment and to the communities in which they operate — is now their fate.”
  • David Heinemeier Hansson: Acquire taste
    • “The problem with the concept of taste is that it’s so ephemeral. One view of the world is that some people just have it and others don’t. Either you’re lucky enough to be born with it and you’ll be forever awesome or you’re a tasteless sod doomed to create crappy work. I don’t subscribe.

      I think taste is mostly about developing an eye for the details that matter and that it’s absolutely something that can be learned. The best way to learn what details that matter is to examine the details of great and not-so-great work and contrast and compare.”

  • Intro to Failure – Eva Amsen
    • “But still. I’m not doing the “normal thing”. I don’t have a postdoc lined up. I feel like I told people I am dropping out of high school and moving to Hollywood to become an actor. No Academy Award would ever make up for the feeling of failing high school. And no matter what I end up doing in a few years, no matter how much I love it or how good I am at it, it’s not going to make up for the feeling of failing the standard research career. I think I also sense some of this in Anna’s recent blog post. There is a feeling of being lost when leaving the mainstream track. One of the things I talked about in my failure session at BioBarCamp was my aversion to the term “alternative career”. The fact that it’s called “alternative” already makes it sound like it doesn’t quite live up to the career it is an alternative to – the research career.”
  • Linux Kernel Development
    • Great overview, with many fascinating tidbits, including a list of which companies are contributing most to the kernel. One particularly interesting fact: “over 70% of all kernel development is demonstrably done by developers who are being paid for their work.” Linux isn’t so much a volunteer effort by individuals, as it is now a volunteer effort by companies; the economics of that are pretty darned interesting. I guess a lot see themselves as downstream of Microsoft’s business model, and want a viable competitor, similar to Google’s support of Firefox.
  • Larry Sanger: The Early History of Nupedia and Wikipedia
    • Excellent account, first of two parts.
  • Wikipedia’s history, according to Wikipedia
    • Very informative, with lots of great links.
  • Editing – and re-editing – Sarah Palin’s Wikipedia entry – International Herald Tribune
    • Many interesting details on the cleanup of Sarah Palin’s Wikipedia page that occurred in the run-up to the announcement that she was McCain’s VP pick.
  • Wikipedia Edits Forecast Vice Presidential Picks – washingtonpost.com
    • A comparison of Wikipedia edits in the runup to McCain’s pick of Palin. The bottom line is that Palin’s entry showed a lot of activity compared with other contenders, but there’s loads of other interesting information in the article as well.



Why Augmenting Collective Intelligence is Easier than Augmenting Individual Intelligence

When I first heard about intelligence augmentation, I thought the idea was amazing – you could outsource cognitive tasks to your computer, effectively making you smarter.

At first, it’d be mundane stuff, multiplying numbers on a calculator, things like that. But as computers got more powerful, it’d be possible to outsource progressively more complex and interesting tasks. You’d be getting smarter, along with the progress of technology.

I heard about this in the early 1990s, before the web had taken off. At the time, the way I (and, I suspect, many other people who’d heard of it) looked at intelligence augmentation was primarily as a way of augmenting individual intelligence.

The way things have turned out, though, it seems to be a lot easier to augment collective intelligence than it is to augment individual intelligence. At the least, progress on augmenting collective intelligence has been spectacular over the past 15 years, while progress on augmenting individual intelligence has been slow. If I have to choose between giving up my calculator (or any other individual tool), and giving up Google, the calculator will be in the trash.

Perhaps part of the reason for my mistake was familiarity. For most of us, especially circa 1990, the intelligence of individuals was an everyday concept, but collective intelligence was, and to some extent remains, exotic.

Of course, with hindsight it’s not so strange that augmenting collective intelligence is easier than augmenting individual intelligence.

Collective intelligence requires us to externalize our thoughts, expressing them in symbols, so they can be communicated to others. This has the coincidental effect of making those thoughts (or, at least, their expression) accessible to computers in a way that our internal brain state is not. The more communication is taking place, the more opportunity there is for software to contribute.

Google demonstrates this vividly, extracting valuable information from the links between webpages, information that can then be fed back to make us smarter. I’ve long thought it’d be fun to do a controlled experiment in which two groups of people are given an IQ test, with the only difference between the groups being that one has access to the web, and the other does not.

You may object that I’m using the term “augmentation of collective intelligence” in a funny way. After all, Google is used by just a single person at a time. Of course, I’m using the term broadly, to mean tools for intelligence augmentation that build in an essential way upon collective intelligence. Maybe a more literal description would be “collective augmentation of intelligence”, or something similar. But the argument I’ve made holds equally well in the narrow sense of literally augmenting collective intelligence, as shown by examples such as Kasparov versus the World, the Matlab programming competition, open source biology, Linux, and Wikipedia.

A busy day at Wikipedia

Sarah Palin was announced as John McCain’s running mate on August 29, 2008. Shortly before the announcement, her Wikipedia page looked like this.

About nine hours and 1100 edits after the announcement, the article was vastly different, and much, much better.

The example comes from an enjoyable talk by David Weinberger that I attended last night. Weinberger also runs a good blog, but note that, like many blogs, you need to look at the back catalogue to see what it’s like outside American election season.

Incidentally, I wonder what a graph of the frequency of edits to Palin’s page would show before the announcement, as compared with other contenders for the VP slot? Her page does seem to get a remarkable amount of editing attention in the runup to the announcement.
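
For anyone curious enough to try, here is a minimal sketch of how such a graph might be started: it buckets a list of revision timestamps by calendar day and prints a crude text histogram. The function name `edits_per_day` and the sample timestamps are made up for illustration and are not real revision data; the sketch also assumes the timestamps have already been exported from the page history as ISO 8601 strings ending in `Z`.

```python
from collections import Counter
from datetime import datetime


def edits_per_day(timestamps):
    """Count edits per calendar day from ISO 8601 strings like 2008-08-29T01:05:00Z."""
    days = (
        datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").date()
        for ts in timestamps
    )
    # Sort by date so the result reads chronologically.
    return dict(sorted(Counter(days).items()))


# Hypothetical timestamps, for illustration only (not real data).
sample = [
    "2008-08-27T10:15:00Z",
    "2008-08-28T09:00:00Z",
    "2008-08-28T21:30:00Z",
    "2008-08-29T01:05:00Z",
    "2008-08-29T02:40:00Z",
    "2008-08-29T03:10:00Z",
]

# One row per day, one '#' per edit.
for day, count in edits_per_day(sample).items():
    print(day.isoformat(), "#" * count)
```

Running the same function over the histories of the other contenders' pages, and plotting the per-day counts side by side, would give the comparison I have in mind.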

Categorized as Wikipedia

Biweekly links for 10/24/2008

  • Charlie’s Diary: Living through Interesting Times
    • Sobering, as an author who is writing a book about “The Future of Science”: “There’s a graph I’d love to plot, but I don’t have the tools for. The X-axis would plot years since, say, 1950. The Y-axis would be a scatter plot with error bars showing the deviation from observed outcomes of a series of rolling ten-year projections modeling the near future. Think of it as a meta-analysis of the accuracy of projections spanning a fixed period, to determine whether the future is becoming easier or harder to get right. I’m pretty sure that the error bars grow over time, so that the closer to our present you get, the wider the deviation from the projected future would be. Right now the error bars are gigantic… We’re living through interesting times”
  • BBC NEWS | Japanese plant writes blog
    • “A potted plant at a cafe near Tokyo, Japan is entertaining customers by writing a regular blog about its feelings… The plant’s latest entry reads: ‘It was cloudy today. It was a cold day.’”
  • Charlie’s Diary: On finishing
    • Charlie Stross: “Creativity is a weird thing. You can plod along a steep uphill road for months, or it can hit you like an express train. I’ve learned to go with the flow when this happens: if a story wants to escape it’s best to let it out, to go with the flow and worry about cleaning up the schedule afterwards. Once you pass forty, it doesn’t seem to happen so often. This is the first such novel-length outburst I’ve had since I wrote the first draft of “Glasshouse”, back in 2003, and I hope to live long enough to experience it again some time …”
  • Usain Bolt: It’s Just Not Normal – Freakonomics
  • Robert Cailliau: A Short History of the Web
    • A useful history. I found this tidbit amusing: “The Hypertext’91 conference (San Antonio) allows us a “poster” presentation (but does not see any use of discussing large, networked hypertext systems…).”
  • Doug Engelbart: The Demo
    • Engelbart’s legendary 1968 demo, demonstrating networked hypertext, and loads of other stuff.
  • molly irwin: the present
    • Wonderful project in recording one’s own life.
  • The Coming of Age of E-Prints in the Literature of Physics
    • Nothing here will come as news to physicists, but it’s interesting to see it all measured, albeit a few years ago: “Examination of the role of e-prints in physics literature was conducted by citation analysis… The data from SPIRES-HEP indicates that e-prints are used to a greater extent by physicists than previously measured and that e-prints have become an integral and valid component of the literature of physics.”
  • Digg’s Recent Bans and the Limits of Crowdsourcing – Mashable
    • Most decision-making systems are susceptible to gaming, and online systems like Digg are no exception. This article has a lot of interesting material about Digg’s fight against people who game the system. I don’t buy the article’s line that this is being done purely to serve Digg’s VC masters, but there’s tons of great stuff in here anyway.
  • Crowdsourcing: The Tools of Globalization
    • A great InnoCentive success story: “ASSET … is a non-profit organization that helps train the children of sex workers and girls rescued from trafficking, in technology, so they can escape the sex slave industry in India… For ASSET to spread to rural India, they not only needed equipment and funding, but a way to allow IT companies to open branches in such areas. And to do that they needed power. And for power they needed electricity. Or did they?… What if the hardware needed to open shop could be run off solar energy? At this point another non-prof, Global Giving, connected ASSET to the Rockefeller Foundation, which agreed to pay to post the problem to InnoCentive’s site. Several months later some 27 solutions had been submitted. The “challenge” called for the design of a solar-powered wireless router composed of low-cost, readily available hardware and software components.”
  • Crowdsourcing: The Pitfalls of Citizen Journalism
    • “CNN wanted to give its viewers a voice. Instead it provided stock manipulators with one. Nice.”
  • Kevin Kelly: Deep Fun
    • “Directions for about 25 well-proven games for groups are succinctly supplied by this free PDF book. These games originated in church youth groups, but I’ve seen them used at camps, large family gatherings, company retreats, and even a few tech meetings. They are aimed at building community, and are primarily ones that can be run indoors. I’ve played a number of these games as an adult over the years and they really are deep fun. It is amazing how fast you can unleash your inner kindergartner. Some of this group fun, like Silent Football, have been around since ancient youth camp times. I wish more folks would enliven their stuffy meetings and offsites with a few of these games.”
  • Kevin Kelly: Thinkism
    • The best argument against the singularity that I’ve seen. Basically, Kelly’s point is that there are certain actions an AI may wish to undertake that have inescapably long natural timescales. E.g., if the AI needs neutron decay for some reason (goodness knows what) it’s just going to need to wait about 15 minutes, like the rest of us. I don’t entirely buy this – there’s a great deal that “thinkism” can accomplish, and there’s a great deal of lengthy experimentation that can be avoided. I haven’t done the argument complete justice, but that’s the gist.
  • Kevin Kelly: The Expansion of Ignorance
  • Old Rules for the New Economy
    • Kevin Kelly on Paul Krugman on Kelly’s “New Rules for the New Economy”. I found the book difficult to read until I realized Kelly was writing as a prophet – someone who glimpses something majestic and grand but perhaps does not fully understand it. After that, I thought the book was terrific.
  • Overcoming Bias: Academics in Clown Suits
    • “Imagine the reception an academic would get if he gave a talk in a clown suit…. No matter how well his work otherwise corresponded to academic norms, it would be hard to get other academics to take him seriously… Academics are well aware that these norms are relatively arbitrary, but usually assume that similar norms do not influence the content of their talks or papers. But I strongly suspect that not only are some presentation formats considered too silly to be taken seriously, the same also applies to many topics. That is, I suspect academics refuse to consider certain topics and theses because such things just seem silly. Academics assume that silly-seeming topics must be unworthy of study, but this conclusion may not really be based on much analysis; it could be the same immediate unthinking reaction they would have to a prof in a clown suit.”
  • Marginal Revolution: China makes major shift to universal health care
  • Personal Genome Project
    • “Participants may volunteer to publicly share their DNA sequence and other personal information for research and education.”
  • The LHC Grid: Data storage and analysis for the largest scientific instrument on the planet
    • Great overview of what’s going on in the LHC computing-wise. I hadn’t realized it, but outside Google and (probably) the NSA, the LHC has the world’s biggest cluster, a staggering project.



Suppress innovation, but claim the credit

It is a staple of wisdom amongst many physicists that “physicists invented the web”. This is a story trotted out particularly when physicists justify their work to the outside world. A string theorist once told me that virtually all his grant applications include a paragraph that says “support fundamental research in physics – that’s what brought us the web”.

In fact, the claim that physicists invented the web is largely mythical.

It’s true that the principal inventor of the web, Tim Berners-Lee, was a programmer working at CERN, the huge European particle physics laboratory. In 1988 he sketched out a way of hooking up hypertext ideas, developed by people like Ted Nelson and Bill Atkinson, to the internet, developed by people like Vint Cerf and Bob Kahn. He talked the idea up at CERN for a year, with no response. In 1989 he wrote up and circulated a formal proposal around CERN. Again, no response for a year. Finally, he coded up a prototype in his spare time. In this, he actually was helped by his manager, who said it was okay if he used one of CERN’s workstations to build the prototype. It was launched to the world about one year later.

Berners-Lee didn’t succeed because CERN was doing fundamental research. He succeeded in spite of it.

There’s a point to this, aside from correcting an often-repeated and erroneous claim. Large institutions, even those that believe they are dedicated to innovation, often systematically suppress the best ideas of the people working in them. The unpredictable nature of innovation, especially many of the most important innovations, means that the institutional mission is often mis-aligned with the innovation.

The example of Berners-Lee is one of many. Think of Bell Labs’ lack of support for the early work on Unix. Or the way grant agencies fund focused grants in fashionable areas decided by centralized committees. After the fact, these organizations trumpet their “success”. But very often they succeed only in spite of themselves, and they frequently show no signs of learning from this fact.


Biweekly links for 10/20/2008

