Data-driven intelligence

I’ve started a new blog, on data-driven intelligence. In future, this is where technical posts like my Google Technology Stack posts will go. There are two posts up:

  • A post describing Pregel, Google’s system for implementing graph-based algorithms on large clusters of machines. In addition to describing how Pregel works, I give a toy single-machine Python implementation which can be used to play with Pregel. The code is up on GitHub.
  • Sex, Einstein, and Lady Gaga: what’s discussed on the most popular blogs. I crawled 50,000 pages from Technorati’s list of the top 1,000 blogs, and determined the percentage of pages containing words such as “sex”, “Einstein”, “Gaga”, and many others. The results were entertaining.
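The calculation in the second post, the percentage of pages containing a given word, can be sketched in a few lines of Python. This is a minimal illustration and not the code used for the crawl: the function name `percent_of_pages_containing`, the `sample_pages` list, and the choice of case-insensitive whole-word matching are all assumptions made for the example.

```python
import re


def percent_of_pages_containing(pages, words):
    """Return {word: percentage of pages that contain it as a whole word}.

    `pages` is a list of page texts (hypothetical stand-ins for crawled pages).
    Matching is case-insensitive and on whole words, which is an assumption
    of this sketch rather than a detail taken from the original post.
    """
    if not pages:
        return {word: 0.0 for word in words}
    counts = {word: 0 for word in words}
    for text in pages:
        # Tokenize once per page; a set means each page counts at most once per word.
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        for word in words:
            if word.lower() in tokens:
                counts[word] += 1
    return {word: 100.0 * n / len(pages) for word, n in counts.items()}


# Made-up page texts, purely for illustration.
sample_pages = [
    "Einstein and the theory of relativity",
    "Lady Gaga released a new single",
    "Nothing relevant here",
    "Gaga and Einstein walk into a bar",
]

result = percent_of_pages_containing(sample_pages, ["einstein", "gaga", "sex"])
print(result)
```

Counting each page at most once per word gives the share of pages mentioning the word, rather than how often the word occurs overall.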

The blog, of course, has an RSS feed.