Biweekly links for 04/27/2009

  • Keeping Abreast of Pornographic Research in Computer Science – Steve Hanov’s Technology Blog
    • Useful (and funny) overview of the approaches taken by companies like Google in figuring out which pages contain porn and which don’t.
  • Building the SENSEable City – O’Reilly Radar
    • “Well, for the urban planners, there is a big, big revolution going on. What happens today is that policies and plans are thought by assumptions. And their effects and imports can be evaluated only after a long time that they are implemented because, again as it was seen before, gathering this information is expensive. It’s costly. It’s cumbersome. So it’s really impossible to get this information in real-time. What is going to happen is that instead of planning the city, the urban planners would actually have to program the city, to configure [it] in real-time because information will flow in real-time. So if you change the direction of the one-way road, you will see almost immediately what the effect on traffic is. If you close an area to cars, you can see immediately what will happen into the mobility in general. And if you create public spaces in a place rather than another, you will see immediately how people will react to that.”

Click here for all of my del.icio.us bookmarks.

Published

Biweekly links for 04/24/2009

  • Kavan Modi’s quantum information blog
  • Bruce Perens – A Cyber-Attack on an American City
    • “Just after midnight on Thursday, April 9, unidentified attackers climbed down four manholes serving the Northern California city of Morgan Hill and cut eight fiber cables in what appears to have been an organized attack on the electronic infrastructure of an American city. Its implications, though startling, have gone almost un-reported.

      That attack demonstrated a severe fault in American infrastructure: its centralization. The city of Morgan Hill and parts of three counties lost 911 service, cellular mobile telephone communications, land-line telephone, DSL internet and private networks, central station fire and burglar alarms, ATMs, credit card terminals, and monitoring of critical utilities. In addition, resources that should not have failed, like the local hospital’s internal computer network, proved to be dependent on external resources, leaving the hospital with a “paper system” for the day.

      In technical terms, the area was partitioned from the surrounding internet.”

  • World Digital Library
  • Henry Ford & Event Driven Architecture
    • Nice metaphor for programming in a distributed environment.
  • …My heart’s in Accra » China’s complicated internet culture
  • Overcoming Bias: Future Incompetence
    • Good question: “[…] will those who feel that their superior minds justify their ruling the lives of others accept having their lives ruled by future folk with greatly enhanced minds?”
  • The Long Now Blog » Daniel Everett, “Endangered Languages, Lost Knowledge and the Future”
    • “The Pirahã language has no numbers or concept of counting (only terms for “relatively small” and “relatively large”); no kinship terms beyond immediate children and parents; no “left” and “right” (only “upriver” and “downriver”); no named distinction of past and future (only near time and far time); no creation stories or myths; and—most important for linguists—no recursion.

      A recursive sentence like “The boy who was fishing owned the dog” does not occur in the Pirahã language. They would say, “The boy was fishing” and “The boy owned the dog.” The eminent linguist Noam Chomsky has declared that recursion is an essential part of human language and is innate. Chomsky’s former student Everett says that the Pirahã language proves otherwise. The resultant controversy is profound.”

  • Stewart Brand – On the Waterfront – Interview – NYTimes.com
  • SAT is Not Too Easy « Gödel’s Lost Letter and P=NP
    • Good overview of some of the basic lower bounds for SAT.
  • Andrew Rosenthal – Talk to The Times – The New York Times
    • Fascinating: “Frankly, I think it is the task of bloggers to catch up to us, not the other way around… If the Times editorial board were a single person, he or she would have six Pulitzer prizes; one Emmy; [etc …]”. If the blogosphere were a single person, it would have a half dozen or more Nobel Prizes, at least four Fields medallists, goodness knows how many Emmys, etc. He’s fighting a strange battle, arguing essentially that the median New York Times editorial is better than the median blog. But fights about quality of content are held in the tail of the best material, not in the median. And in nearly every subject, the Times has now lost the fight in the tail.

Click here for all of my del.icio.us bookmarks.

SciBarCamp 2

Last year’s SciBarCamp was one of my favourite events ever – here’s a great explanation of why, from Jim Thomas. It’s on again this year, May 8-9, in Toronto. Take a look at the participant list, and sign up (space is limited)! Here’s more, from the organizers:

SciBarCamp is a gathering of scientists, artists, and technologists for a weekend of talks and discussions. The goal is to create connections between science, entrepreneurs and local businesses, and arts and culture.

In the tradition of BarCamps, otherwise known as “unconferences”, the program is decided by the participants at the beginning of the meeting, in the opening reception. SciBarCamp will require active participation; while not everybody will present or lead a discussion, everybody will be expected to contribute substantially – this will help make it a really creative event.

Our venue, Hart House, is a congenial space with plenty of informal areas to work or talk. The space, which made such a wonderful venue for last year’s SciBarCamp, is being made available through a collaboration with Science Rendezvous.

Biweekly links for 04/20/2009

  • singletasking: Caterina Fake
  • Massively Multiplayer Online Game service granted banking license
    • “MMO operator MindArk has been granted a banking license for its virtual world Entropia Universe, by the Swedish Financial Supervisory Authority.

      MindArk says the move will allow it to act as a central bank for all variations of Entropia Universe and integrate the in-game economies with the real world.

      “This is an exciting and important development for the future of all virtual worlds being built using the Entropia Platform,” commented MindArk CEO, Jan Welter Timkrans.

      “Together with our partner planet owner companies we will be in a position to offer real bank services to the inhabitants of our virtual universe.”

      Entropia Universe acts as a platform from which partners can launch virtual worlds within, with the focus being on microtransactions and virtual currency monetisation.”

  • Luis von Blog

Click here for all of my del.icio.us bookmarks.

Biweekly links for 04/17/2009

  • Pooling of Unshared Information in Group Decision Making: Biased Information Sampling During Discussion
    • “Decision-making groups can potentially benefit from pooling members’ information, particularly when members individually have partial and biased information but collectively can compose an unbiased characterization of the decision alternatives. The proposed biased sampling model of group discussion, however, suggests that group members often fail to effectively pool their information because discussion tends to be dominated by (a) information that members hold in common before discussion and (b) information that supports members’ existent preferences. In a political caucus simulation, group members individually read candidate descriptions that contained partial information biased against the most favorable candidate and then discussed the candidates as a group. Even though groups could have produced unbiased composites of the candidates through discussion, they decided in favor of the candidate initially preferred by a plurality rather than the most favorable candidate…”
  • SciBarCamp Toronto 2
    • SciBarCamp Toronto 2 is happening May 8-9, Hart House, Toronto. See the Participant page to register!
  • Killer Bean Forever
    • Feature-length movie animated entirely by one person, Jeff Lew (of The Matrix Reloaded). Will be released on DVD in July (US and Canada).
  • arXiview: A New iPhone App for the arXiv
    • Browse the preprint arXiv from your iPhone.
  • A Comparison of Approaches to Large-Scale Data Analysis
    • “There is currently considerable enthusiasm around the MapReduce (MR) paradigm.. Although the basic control flow of this framework has existed in parallel SQL database management systems (DBMS) for over 20 years, some have called MR a dramatically new computing model [8, 17]. In this paper… we evaluate both kinds of systems in terms of performance and development complexity… we define a benchmark consisting of a collection of tasks that we have run on an open source version of MR as well as on two parallel DBMSs. For each task, we measure each system’s performance for various degrees of parallelism on a cluster of 100 nodes… Although the process to load data into and tune the execution of parallel DBMSs took much longer than the MR system, the observed performance of these DBMSs was strikingly better. We speculate about the causes of the dramatic performance difference and consider implementation concepts that future systems should take from both kinds of architectures”

Click here for all of my del.icio.us bookmarks.

Biweekly links for 04/06/2009

Click here for all of my del.icio.us bookmarks.

Biweekly links for 04/03/2009

  • Amazon Elastic MapReduce
    • “Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).”
  • Data produced, analyzed and consumed. The impact of big science : business|bytes|genes|molecules
    • “The fact remains that today we are moving towards a clear separation between data producers, data consumers and methods developers. There was a time that a small group of people could cover all that ground, but with the industrialization of data production (microarrays are already there, mass specs and sequencers not quite yet), traditional roles, even in an academic setting are not efficient.”
  • Adding Noughts in Vain
    • Andrew Doherty’s wonderful blog about politics, climate, New Zealand, and whatever else strikes his fancy.
  • Mathemata: the blog of Francois Dorais
  • Noam Chomsky on Post-Modernism
    • “There are lots of things I don’t understand — say, the latest debates over whether neutrinos have mass or the way that Fermat’s last theorem was … proven … But from 50 years in this game, I have learned two things: (1) I can ask friends who work in these areas to explain it to me at a level that I can understand, and they can do so…; (2) if I’m interested, I can proceed to learn more so that I will come to understand it. Now Derrida, Lacan, Lyotard, Kristeva, etc. — even Foucault, whom I knew and liked, and who was somewhat different from the rest — write things that I also don’t understand, but (1) and (2) don’t hold: no one who says they do understand can explain it to me and I haven’t a clue as to how to proceed to overcome my failures. That leaves one of two possibilities: (a) some new advance in intellectual life has been made… which has created a form of “theory” that is beyond quantum theory, topology, etc., in depth and profundity; or (b) … I won’t spell it out.”
  • Caveat Lector » Blog preservation
    • “I suggest mildly that this [blog preservation] would be a fantastic problem to tackle for an academic library looking to make a name for itself. If you can’t make the argument for a general blog-preservation program (and that’s hard, because libraries are so inward-looking at times of crisis), dig up the ten or fifteen best blogs published by people at your institution and make an argument about those. Then release the code you write to the rest of us who want to do this!”
  • Preservation for scholarly blogs – Gavin Baker
    • How will we preserve scholarly blogs for the future?
  • A Blog Around The Clock : Defining the Journalism vs. Blogging Debate, with a Science Reporting angle
    • Thoughtful and thought-provoking.
  • Anarchism Triumphant: Free Software and the Death of Copyright (Eben Moglen)
  • Western internet censorship: The beginning of the end or the end of the beginning? – Wikileaks

Click here for all of my del.icio.us bookmarks.

Conscious modularity and scaling open collaboration

I’ve recently been reviewing the history of open source software, and one thing I’ve been struck by is the enormous effort many open source projects put into making their development modular. They do this so work can be divided up, making it easier to scale the collaboration, and so get the benefits of diverse expertise and more aggregate effort.

I’m struck by this because I’ve sometimes heard sceptics of open science assert that software has a natural modularity which makes it easy to scale open source software projects, but that difficult science problems often have less natural modularity, and this makes it unlikely that open science will scale.

It looks to me like what’s really going on is that the open sourcers have adopted a posture of conscious modularity. They’re certainly not relying on any sort of natural modularity, but are instead working hard to achieve and preserve a modular structure. Here are three striking examples:

  • The open source Apache webserver software was originally a fork of a public domain webserver developed by the US National Center for Supercomputing Applications (NCSA). The NCSA project was largely abandoned in 1994, and the group that became Apache took over. It quickly became apparent that the old code base was far too monolithic for a distributed effort, and the code base was completely redesigned and overhauled to make it modular.
  • In September 1998 and June 2002 crises arose in Linux because of community unhappiness at the slow rate at which new code contributions were being accepted into the kernel. In some cases contributions from major contributors were being ignored completely. The problem in both 1998 and 2002 was that an overloaded Linus Torvalds was becoming a single point of failure. The situation was famously summed up in 1998 by Linux developer Larry McVoy, who said simply “Linus doesn’t scale”. This was a phrase repeated in a 2002 call-to-arms by Linux developer Rob Landley. The resolution in both cases was a major reorganization of the project that allowed tasks formerly managed by Torvalds to be split up among the Linux community. In 2002, for instance, Linux switched to an entirely new way of managing code, using a package called BitKeeper, designed in part to make modular development easier.
  • One of the Mozilla projects is an issue tracking system (Bugzilla), designed to make modular development easy, and which Mozilla uses to organize development of the Firefox web browser. Developing Bugzilla is a considerable overhead for Mozilla, but it’s worth it to keep development modular.

The right lesson to learn from open source software, I think, is that it may be darned hard to achieve modularity in software development, but it can be worth it to reap the benefits of large-scale collaboration. Some parts of science may not be “naturally” modular, but that doesn’t mean they can’t be made modular with conscious effort on the part of scientists. It’s a problem to be solved, not one to give up on.

First Principles

How would you use 100 million dollars if someone asked you to set up and run an Institute for Theoretical Physics? My friend Howard Burton has written a memoir of his 8 years as the founding Executive Director of the Perimeter Institute, taking it from conception to being one of the world’s best known institutes for theoretical physics. I’ve heard many people theorize about how a scientific institution ideally should be organized (“consider a spherical physicist…”), and I’ve contributed more than a few thoughts of my own to such discussions. What I really liked about this book, and what gives it a unique perspective, is that it’s from someone who was actually in the hot seat, from the get-go.
