<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Write your first MapReduce program in 20 minutes</title>
	<atom:link href="http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/feed/" rel="self" type="application/rss+xml" />
	<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/</link>
	<description></description>
	<lastBuildDate>Tue, 09 Mar 2010 22:14:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Jonathan Stray &#187; Why We Need Open Search, and How to Make Money Doing It</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-26173</link>
		<dc:creator>Jonathan Stray &#187; Why We Need Open Search, and How to Make Money Doing It</dc:creator>
		<pubDate>Sun, 27 Sep 2009 09:51:16 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-26173</guid>
		<description>[...] data processing is now well understood. To be precise, I want an open search company that sells map-reduce access to their index. Map-reduce is a standard framework for breaking down large computational [...]</description>
		<content:encoded><![CDATA[<p>[...] data processing is now well understood. To be precise, I want an open search company that sells map-reduce access to their index. Map-reduce is a standard framework for breaking down large computational [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Nielsen &#187; Consistent hashing</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-23140</link>
		<dc:creator>Michael Nielsen &#187; Consistent hashing</dc:creator>
		<pubDate>Thu, 04 Jun 2009 03:06:38 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-23140</guid>
		<description>[...] interested in distributed dictionaries is because they&#8217;re used as input and output to the MapReduce framework for distributed computing. Of course, that&#8217;s not the only reason distributed [...]</description>
		<content:encoded><![CDATA[<p>[...] interested in distributed dictionaries is because they&#8217;re used as input and output to the MapReduce framework for distributed computing. Of course, that&#8217;s not the only reason distributed [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Nielsen</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-20406</link>
		<dc:creator>Michael Nielsen</dc:creator>
		<pubDate>Wed, 15 Apr 2009 12:02:31 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-20406</guid>
		<description>Asokan - Thanks, fixed.</description>
		<content:encoded><![CDATA[<p>Asokan &#8211; Thanks, fixed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Asokan Pichai</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-20381</link>
		<dc:creator>Asokan Pichai</dc:creator>
		<pubDate>Wed, 15 Apr 2009 04:40:08 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-20381</guid>
		<description>The Dave Spencer link is wrong. AFAICS, it says tropo instead of chencer</description>
		<content:encoded><![CDATA[<p>The Dave Spencer link is wrong. AFAICS, it says tropo instead of chencer</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Neal Richter</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-18123</link>
		<dc:creator>Neal Richter</dc:creator>
		<pubDate>Sun, 22 Feb 2009 03:17:13 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-18123</guid>
		<description>Obviously Dean and Ghemawat are due some credit here, yet map and reduce have been around in languages like Lisp for approx 25 years prior to their paper.  There were even companies/groups that parallelized their Lisp implementations in the 1980s: 

http://ieeexplore.ieee.org/iel3/1244/519/00010373.pdf?arnumber=10373 
http://search.barnesandnoble.com/Parallel-LISP/Takatoshi-Ito/e/9780387527826
http://www.springerlink.com/content/k560u307713j57r4/

Did Dean and Ghemawat &#039;finish the job&#039; and do a great implementation at exactly the right time and at the right company to start a distributed computing revolution?
Absolutely and congrats to them for that.

Yet when I was first introduced to it, it was completely obvious to me and to others who learned the map and reduce Lisp/Scheme primitives as undergrads.  And the &#039;difference&#039; in their methodology is trivial... of course this does describe the best revolutions... they are all trivial in retrospect.

It&#039;s also worth reading David DeWitt&#039;s savaging of MapReduce to cure you of any further illusion of it being &#039;invented&#039; by Dean and Ghemawat.  (I think he goes way too far... modern MapReduce is here to stay - we&#039;re not going back to RDBMSs for distributed computing anytime soon)
http://www.databasecolumn.com/2008/01/mapreduce-a-major-step-back.html</description>
		<content:encoded><![CDATA[<p>Obviously Dean and Ghemawat are due some credit here, yet map and reduce have been around in languages like Lisp for approx 25 years prior to their paper.  There were even companies/groups that parallelized their Lisp implementations in the 1980s: </p>
<p><a href="http://ieeexplore.ieee.org/iel3/1244/519/00010373.pdf?arnumber=10373" rel="nofollow">http://ieeexplore.ieee.org/iel3/1244/519/00010373.pdf?arnumber=10373</a><br />
<a href="http://search.barnesandnoble.com/Parallel-LISP/Takatoshi-Ito/e/9780387527826" rel="nofollow">http://search.barnesandnoble.com/Parallel-LISP/Takatoshi-Ito/e/9780387527826</a><br />
<a href="http://www.springerlink.com/content/k560u307713j57r4/" rel="nofollow">http://www.springerlink.com/content/k560u307713j57r4/</a></p>
<p>Did Dean and Ghemawat &#8216;finish the job&#8217; and do a great implementation at exactly the right time and at the right company to start a distributed computing revolution?<br />
Absolutely and congrats to them for that.</p>
<p>Yet when I was first introduced to it, it was completely obvious to me and to others who learned the map and reduce Lisp/Scheme primitives as undergrads.  And the &#8216;difference&#8217; in their methodology is trivial... of course this does describe the best revolutions... they are all trivial in retrospect.</p>
<p>It&#8217;s also worth reading David DeWitt&#8217;s savaging of MapReduce to cure you of any further illusion of it being &#8216;invented&#8217; by Dean and Ghemawat.  (I think he goes way too far... modern MapReduce is here to stay &#8211; we&#8217;re not going back to RDBMSs for distributed computing anytime soon)<br />
<a href="http://www.databasecolumn.com/2008/01/mapreduce-a-major-step-back.html" rel="nofollow">http://www.databasecolumn.com/2008/01/mapreduce-a-major-step-back.html</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JK</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-18054</link>
		<dc:creator>JK</dc:creator>
		<pubDate>Thu, 19 Feb 2009 01:22:40 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-18054</guid>
		<description>Please, don&#039;t take that too seriously :) I just thought of it because I&#039;d instantly rewrite that line to

&gt; filenames = map(partial(os.path.join, &#039;text&#039;), flist)

when in the mood for doing things functionally. That, however, doesn&#039;t save many characters and additionally requires an import to work.

In any case, I really appreciate your article and I&#039;m looking forward to more of that quality.

Oh, and let me point to the Disco project at http://discoproject.org/ which uses an Erlang server and a client library to write applications in Python. Their API for the map and reduce functions is a little different, though (which is interesting to note as there are similar implementation approaches that work).</description>
		<content:encoded><![CDATA[<p>Please, don&#8217;t take that too seriously <img src='http://michaelnielsen.org/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  I just thought of it because I&#8217;d instantly rewrite that line to</p>
<p>&gt; filenames = map(partial(os.path.join, 'text'), flist)</p>
<p>when in the mood for doing things functionally. That, however, doesn&#8217;t save many characters and additionally requires an import to work.</p>
<p>In any case, I really appreciate your article and I&#8217;m looking forward to more of that quality.</p>
<p>Oh, and let me point to the Disco project at <a href="http://discoproject.org/" rel="nofollow">http://discoproject.org/</a> which uses an Erlang server and a client library to write applications in Python. Their API for the map and reduce functions is a little different, though (which is interesting to note as there are similar implementation approaches that work).</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Michael Nielsen</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-18050</link>
		<dc:creator>Michael Nielsen</dc:creator>
		<pubDate>Wed, 18 Feb 2009 22:17:08 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-18050</guid>
		<description>Thanks JK, I&#039;ll look into it.</description>
		<content:encoded><![CDATA[<p>Thanks JK, I&#8217;ll look into it.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: JK</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-18032</link>
		<dc:creator>JK</dc:creator>
		<pubDate>Wed, 18 Feb 2009 15:12:34 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-18032</guid>
		<description>&gt;  filenames = [ os.path.join(&quot;text&quot;, f) for f in flist ]

Heh, you could even use `functools.partial` and then MapReduce to spread that over a whole bunch of clustered machines, if I got MR right from this well-explanatory article :D</description>
		<content:encoded><![CDATA[<p>&gt;  filenames = [ os.path.join("text", f) for f in flist ]</p>
<p>Heh, you could even use `functools.partial` and then MapReduce to spread that over a whole bunch of clustered machines, if I got MR right from this well-explanatory article <img src='http://michaelnielsen.org/blog/wp-includes/images/smilies/icon_biggrin.gif' alt=':D' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Matt Williams</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-17142</link>
		<dc:creator>Matt Williams</dc:creator>
		<pubDate>Sun, 18 Jan 2009 16:55:29 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-17142</guid>
		<description>I&#039;ve provided a high-level overview of MapReduce in my brand spanking new blog here: http://wordflows.com/matt/2009/01/18/understanding-mapreduce/

Feedback is appreciated, as I&#039;ve only just started blogging today!

I am in agreement that it is far from a new concept. Unless I am missing something obvious in their whitepaper, it&#039;s a technique we have been using in distributed systems for at least a decade.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve provided a high-level overview of MapReduce in my brand spanking new blog here: <a href="http://wordflows.com/matt/2009/01/18/understanding-mapreduce/" rel="nofollow">http://wordflows.com/matt/2009/01/18/understanding-mapreduce/</a></p>
<p>Feedback is appreciated, as I&#8217;ve only just started blogging today!</p>
<p>I am in agreement that it is far from a new concept. Unless I am missing something obvious in their whitepaper, it&#8217;s a technique we have been using in distributed systems for at least a decade.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Understanding Map/Reduce &#124; Matt Williams</title>
		<link>http://michaelnielsen.org/blog/write-your-first-mapreduce-program-in-20-minutes/comment-page-1/#comment-17141</link>
		<dc:creator>Understanding Map/Reduce &#124; Matt Williams</dc:creator>
		<pubDate>Sun, 18 Jan 2009 16:48:59 +0000</pubDate>
		<guid isPermaLink="false">http://michaelnielsen.org/blog/?p=529#comment-17141</guid>
		<description>[...] few references you may find interesting:   Google Research Publication Write your first Map/Reduce function in 20 mins Misconceptions about [...]</description>
		<content:encoded><![CDATA[<p>[...] few references you may find interesting:   Google Research Publication Write your first Map/Reduce function in 20 mins Misconceptions about [...]</p>
]]></content:encoded>
	</item>
</channel>
</rss>
