- Building an Inverted Index with Hadoop and Pig « SquareCog’s SquareBlog
- “In this post, I present a (very) brief description of the Pig project and demonstrate how one can construct an inverted index from a collection of text files using just a few lines of PigLatin.
Pig offers SQL-like data processing instructions (select, project, filter, group), while being both more flexible, by allowing simple integration of user-defined functions, and more straightforward, by allowing users to issue commands procedurally rather than declaratively, as in SQL.”
- Yahoo! Hadoop Tutorial
- Comparison of biological wikis
- Andrew Su’s survey of biological wikis (if you click again it links through to a spreadsheet). Lots of very interesting data about number of edits, number of editors, etc.
- Datawocky: The Real Long Tail: Why both Chris Anderson and Anita Elberse are Wrong