Michael Nielsen

Biweekly links for 09/07/2009

â€¦My heartâ€™s in Accra Â» Xiao Qiang and Evgeny Morozov with dueling views of digital activism
- “Evgeny Morozov offers… a healthy dose of skepticism about the possibility of digital activists changing the world via Facebook and Twitter. He begins with the story of Anders Colding-JÃ¸rgensen, a Danish psychologist who created a Facebook activism group to protest the dismantling of Stork Fountain in Copenhagen. Of course, the government wasnâ€™t actually planning on dismantling the fountain, a national symbol. But his Facebook group implied that the fountain was under threat, and from his initial 100 invitations to the group, there were 27,500 members of a Facebook group demanding the fountain be saved within three days. At the peak, two people were joining per minute â€“ Jorgensen decided to end the experiment shortly afterwards. (Amusingly enough, there are still more than 26,000 members, even though the fiction as been well exposed.)”
Cool Tools: The United States Constitution
- “The US Constitution is one of our most remarkable inventions of all time. A lot of people in other countries think so too. It is a robust self-correcting legal OS. But it was written in an arcane code long ago. To make any sense from it you need some help.
  This lively graphic novel adaptation of the Constitution is by far the best aid I’ve found to deciphering its code. It is the comic book version, but rather than dumbing it down, it smartens it up. The graphic novel goes through the Constitution article by article, and explains what each bit means, why it is there, and how it came to be. Like the Bible, the Constitution doesn’t say what you thought it did. I was surprised what was not there as well as what was. I learned tons from this annotation, despite studying it in high school. It renewed my respect for it, and in a way, also makes clear its limitation. I feel I can be a slightly better citizen. Best of all, this book does all that with pictures, which makes it a page-turner.”

Click here for all of my del.icio.us bookmarks.

Biweekly links for 09/04/2009

Wine, Physics, and Song
- Howard Barnum’s blog. True to the promise of the name, he does indeed cover wine, physics and song, with wine currently having a slight edge over physics, and song well behind. Economics occasionally sneaks in.
ongoing: interview with Ravelry’s Casey Jones
- Interview with Casey Jones, the main developer on Ravelry, the amazingly successful site for knitters and crocheters.
EtherPad Blog: Saving is Obsolete
- Very cool: move to any point in the history of a document: “Have you ever forgot to hit “save” and lost work? Ever wished you could go back to an earlier version of a document to see how the document evolved?
  Now you can. EtherPad keeps track of all your typing in realtime. With our new Time-Slider, you can browse the complete history of a document using a familiar user interface.”
Story Time : Common Knowledge
- Excellent thoughtful piece from John Wilbanks. Takeaway for me: scientific results need a narrative explanation, so humans can understand them, and a structured machine readable explanation, so computers can understand them. Who will provide the latter? John points to publishers. I’m doubtful. I wonder whether it can’t be baked into the paper preparation process, the same way blogging platforms like wordpress bake machine readable metadata automatically into RSS feeds. Tough problem, still.
Finding and Fixing Errors in Google’s Book Catalog | Freedom to Tinker
- “There was a fascinating exchange about errors in Google’s book catalog over at the Language Log recently. We rarely see such an open and constructive discussion of errors in large data sets, so this is an unusual opportunity to learn about how errors arise and what can be done about them… What’s most interesting to me is a seeming difference in mindset between critics like Nunberg on the one hand, and Google on the other. Nunberg thinks of Google’s metadata catalog as a fixed product that has some (unfortunately large) number of errors, whereas Google sees the catalog as a work in progress, subject to continual improvement. Even calling Google’s metadata a “catalog” seems to connote a level of completion and immutability that Google might not assert. An electronic “card catalog” can change every day — a good thing if the changes are strict improvements such as error fixes — in a way that a traditional card catalog wouldn’t.”
â€œOpen Accessâ€ Journals are Advertising Â« Algorithmic Game Theory
- Noam Nisan with some thoughtful concerns about author pays open access. Caveats are necessary (see my comment at the post), but the concerns are worth thinking about.
Howard Rheingold : Mindful Infotention: Dashboards, Radars, Filters
- “Knowing what to pay attention to is a cognitive skill that steers and focuses the technical knowledge of how to find information worth your attention. More and more, knowing where to direct your attention involves a third element, together with your own attentional discipline and use of online power tools – other people. Increasingly, most of the recommendations that make it possible to find fresh and useful signals amid the overwhelming noise of the Internet are social media – online networks that make possible social exchange and relationship. Tuning and feeding our personal learning networks is where the internal and the technological meet the social. “
Copenhagen’s Living Library
- Borrow a human from Copenhagen’s Living Library.
Second Skin | Savage Minds
- “So I just watched Second Skin, a documentaryâ€”as far as I know, the only documentaryâ€”which focuses squarely on the lives of on-line game players.”
Stephen Fry: In search of the planet’s most endangered species | Environment | The Guardian
- “Are the animals worth saving because they hold an important place in the great interconnected web of existence? Are they worth saving because they might one day yield important clues and compounds to help us with medicine or some other useful technology? Or are they worth saving because they are the beautiful achievement of millions of years of natural selection? Extinction is a natural part of creation, this is unquestionably true: yet no matter what one’s views on climate change or global warming, it is impossible, impossible, to deny that man-made alterations to habitat are threatening thousands of plant and animal species across the planet at an unprecedented rate and scale. So the question is perhaps not “Why should we save them?” but “What right do we have to destroy them?””
Exploring dangerous neighborhoods: Latent Semantic Analysis and computing beyond the bounds of the familiar
- Using data mining to evaluate whether a psychiatric patient poses a danger to themselves or others.
Do Bugreporters Become Better Over Time? Â« NetworkLabs
- Fascinating study of bug reporting patterns for Mozilla. Takeaways: (1) people rapidly improve the quality of their bug reports; (2) there appears to be a sizeable difference in quality between the bug reports of newbies and experienced developers; and (3) there are a small minority of people who have submitted lots of bugs, but who don’t seem to be any better than the newbies.
Intervening in the life cycles of scientific knowledge – Don Swanson
- “the fragmentation that inevitably accompanies the growth of science has created an altogether different set of problems–as well as opportunities. Interrelationships among the fragments, unnoticed because of the insularity of specialties, have been shown to harbor previously unknown solutions to authentic scientific problems, and so to hold a potential for rejuvenating knowledge that might otherwise be considered obsolete. The invisible growth of relatedness probably follows a combinatorial law and so may far exceed even the explosive growth rates that have characterized both the scientific community and the mountains of print it produces. “
A Protocol for Packet Network Intercommunication (Cerf and Kahn, pdf)
- You’re reading these words over the protocol described in this paper.
Galaxy Zoo Blog Â» A Galaxy Zoo – WorldWide Telescope Mashup
- “Have you ever found yourself staring proudly at the collection of beautiful and exotic galaxies that fill your favourites list? Have you ever wanted to share these objects with a friend or loved one and realized there was just no easy way to do it? Sure, you can click on the image, delve into SkyServer, and copy and paste one image at a time into an email, butâ€¦ That gets kind of tedious pretty quickly, and if your favourites list is like mine, itâ€™s not a 5-minute copy and paste kind of task.
  Well, now there is an easier way to inflict your favourites on others.”
What Kate Saw in Silicon Valley
- Summary: startups fail, even ones from famous founders (customers don’t care how famous the founder is); startups completely change what they’re doing, on short timescales; it costs almost nothing to start a startup; the founders are scrappy (an asset not appreciated by most of society); founders are not obviously trying to stand out (see above: customers don’t care how impressive the founder seems); founders need mentors; starting a startup is a very solitary activity. “By inverting this list, we can get a portrait of the “normal” world. It’s populated by people who talk a lot with one another as they work slowly but harmoniously on conservative, expensive projects whose destinations are decided in advance, and who carefully adjust their manner to reflect their position in the hierarchy.”
A Computational View of Market Efficiency
- “We propose to study market efficiency from a computational viewpoint. Borrowing from theoretical computer science, we define a market to be efficient with respect to resources S (e.g., time, memory) if no strategy using resources S can make a profit. As a first step, we consider memory-m strategies whose action at time t depends only on the m previous observations at times t-m,…,t-1. We introduce and study a simple model of market evolution, where strategies impact the market by their decision to buy or sell. We show that the effect of optimal strategies using memory m can lead to “market conditions” that were not present initially, such as (1) market bubbles and (2) the possibility for a strategy using memory m’ > m to make a bigger profit than was initially possible. We suggest ours as a framework to rationalize the technological arms race of quantitative trading firms. “
Explicit semantic analysis (pdf)
- Very interesting paper on extracting concepts from a very large corpus of data (e.g., Wikipedia), using ideas based on latent semantic analysis. The rough idea seems to be to do a singular value decomposition (SVD) of the word-frequency matrix, and then to truncate the SVD to a much smaller “concept space”. The isometries appearing in the SVD can then be used to define a rotation into concept space. These rotations can then be used to compare general phrases, e.g., “the author wrote a blog post” versus “the essayist penned an essay”, seeing how closely they overlap in concept space. For more details, see the paper. I wonder how much the results would be improved by starting with a larger corpus (e.g., Google’s cache of the web).
The Semantic Web – my personal (unofficial) FAQ: James Hendler
Facebook’s Religion Question Prompts Soul-Searching
- Facebook gives people a free-form text box to describe their religion. Asking such a personal question gives some surprising answers. My favourite was probably the woman who summed up both her Catholicism and her difficulties with Catholicism by describing her religion as “Matthew 25”. “Jedi” comes in at number 10.

Click here for all of my del.icio.us bookmarks.

Finding Primes: A Fun Subproblem

The ongoing open mathematics project finding primes aims to find a deterministic algorithm to efficiently generate [tex]k[/tex]-digit primes. The fastest known algorithm seems to be a method of Odlyzko which generates a [tex]k[/tex]-digit prime in time [tex]O(10^{k/2})[/tex]. The people working on the project have made some observations which come tantalizingly close to breaking that barrier. The obstruction is a beautiful little problem that I thought many people might enjoy, and which may well be tractable. If you’re interested in participating in the finding primes project, it might be a good entry point. So if you’ve got a good idea to solve the problem, pitch in and help over at the Polymath blog. But please be polite: read some background first, and take a look at some of the research threads to get a feel for how things work, and what’s already known.

One small caveat: the argument that follows is not in any way mine, it’s all the work of other people! A recent comment thread on this argument starts here.

Let [tex]\pi(x)[/tex] denote the number of primes less than or equal to [tex]x[/tex]. [tex]\pi(x)[/tex] has a lot of structure, and there’s a surprising amount that can be said about it. In particular, the people working on finding primes have figured out a clever way of computing the parity of [tex]\pi(x)[/tex] in time about [tex]x^{5/11+o(1)}[/tex].

Suppose you can find two [tex]k[/tex]-digit numbers [tex]x[/tex] and [tex]y[/tex] such that the parity of [tex]\pi(x)[/tex] and [tex]\pi(y)[/tex] are different. Set [tex]z = \lfloor (x+y)/2 \rfloor[/tex], i.e., take [tex]z[/tex] to be the midpoint between [tex]x[/tex] and [tex]y[/tex]. Compute the parity of [tex]z[/tex]. It must have either a different parity to [tex]x[/tex] or a different parity to [tex]y[/tex]. Repeating this procedure [tex]O(k)[/tex] times, we can use a binary search to find adjacent [tex]k[/tex]-digit numbers [tex]p-1[/tex] and [tex]p[/tex] such that [tex]\pi(p-1)[/tex] and [tex]\pi(p)[/tex] have different parity. We conclude that [tex]p[/tex] must be prime. That takes time [tex]O(10^{5/11 k + o(1)})[/tex], and so breaks the barrier set by Odlyzko’s method.

What’s the problem with this algorithm? The problem is finding the initial [tex]k[/tex]-digit numbers [tex]x[/tex] and [tex]y[/tex] such that [tex]\pi(x)[/tex] and [tex]\pi(y)[/tex] have different parity. It would surprise me a great deal if this weren’t possible, but it’s not obvious (at least to me) how to do it quickly. Is there a fast way of doing this?

Biweekly links for 08/31/2009

Facebook’s Religion Question Prompts Soul-Searching
- Facebook gives people a free-form text box to describe their religion. Asking such a personal question gives some surprising answers. My favourite was probably the woman who summed up both her Catholicism and her difficulties with Catholicism by describing her religion as “Matthew 25”. “Jedi” comes in at number 10.
Markets marketed better Â« Meteuphoric
- “What do you call a system where costs and benefits return to those who cause them? Working markets or karma, depending on whether the accounting uses money or magic.
  In popular culture karma generally has good connotations, and markets generally have bad. Reasons for unease about markets should mostly apply just as well to karma, but nobody complains…that inherent tendencies to be nice are an unfair basis for wellbeing distribution. Nor that people who have had a lot of good fortune recently might have cheated the system somehow. Nor that the divine internalizing of externalities encourages selfishness. Nor that people who are good out of desperation for fair fortune are being exploited. So why the difference?
  
  Perhaps mysterious forces are just more trustworthy than social institutions? Or perhaps karma seems nice because its promotion is read as â€˜everyone will get what they deserveâ€™, while markets seem nasty because their promotion is read as â€˜everyone deserves what theyâ€™ve gotâ€™”
Astonishing video of a chimp at a magic show
News organisations and start-ups
- “What would a content site look like if you started from how to make money – as print media once did – instead of taking a particular form of journalism as a given and treating how to make money from it as an afterthought?”
Mike Brown’s Planets: Fog! Titan! Titan Fog! (and a peer review experiment)
- Cool for two reasons. First, Titan has fog! Second, Mike Brown seriously invites reviews of the paper, and promises to treat them as he would referee comments.
25 Years Later, First Registered Domain Name Changes Hands
- The first .com was apparently registered in 1985; it just changed hands for the first time ever.
Mathematics and the internet (pdf)
- Terry Tao’s talk about how online tools are changing mathematics.

Click here for all of my del.icio.us bookmarks.

Biweekly links for 08/28/2009

25 Years Later, First Registered Domain Name Changes Hands
- The first .com was apparently registered in 1985; it just changed hands for the first time ever.
Mathematics and the internet (pdf)
- Terry Tao’s talk about how online tools are changing mathematics.
What We Can Learn From Craigslist
How XML Threatens Big Data : Dataspora Blog
- Excellent thoughtful article on data bureaucracy and the limitations of XML.
The impact factor’s Matthew effect: a natural experiment in bibliometrics
- “Using an original method for controlling the intrinsic value of papers–identical duplicate papers published in different journals with different impact factors–this paper shows that the journal in which papers are published have a strong influence on their citation rates, as duplicate papers published in high impact journals obtain, on average, twice as much citations as their identical counterparts published in journals with lower impact factors. The intrinsic value of a paper is thus not the only reason a given paper gets cited or not; there is a specific Matthew effect attached to journals and this gives to paper published there an added value over and above their intrinsic quality. “
The importance of failure
- “This is a point that I don’t often hear made when people talk about failure; the moral behind a failure-related story is usually about preventing it, or dealing with the aftermath, but not about the fact that sometimes things go bad despite your best efforts, and all the careful risk management and contingency planning won’t keep you from going down in flames. This is important, because it forces every person to establish a risk threshold that they are willing to accept in every one of their life efforts. “
US Top All-Time Donors 1989-2008
- Surprising list of top donors in US politics.
High-Speed Robot Hand
- Incredible video of a robot which can throw a ball, pick up a grain of rice, spin a pen, and many other things, all with incredible speed.

Click here for all of my del.icio.us bookmarks.

Biweekly links for 08/24/2009

Pointless babble Â« Stephen Fry on Twitter
- “The clueâ€™s in the name of the service: Twitter. Itâ€™s not called Roar, Assert, Debate or Reason, itâ€™s called Twitter. As in the chirruping of birds. Apparently, according to Pears (the soapmakers…â€“ certainly their â€œstudyâ€ is froth and bubble) 40% of Twitter is â€œpointless babbleâ€, (http://is.gd/2mKSg) which means of course that a full 60% of Twitter discourse is NOT pointless babble, which is disappointing. Very disappointing. I would have hoped 100% of Twitter was fully free of earnestness, usefulness and commercial intent. Why do these asinine reports jump onto a bandwagon they donâ€™t understand and why do those reporting on them relate with such glee that a service that was never supposed in the first place to be more than gossipy tittle-tattle and proudly banal verbal doodling is â€œfailing to deliver meaningful commercial or political contentâ€. Bollocky bollocks to the lot of them. They can found their own â€œenterprise orientedâ€ earnest microblogging service. Remind me to avoid it.”
Amplifying on the PCR Amplifier Â« GÃ¶delâ€™s Lost Letter and P=NP
- Excellent explanation of how the polymerase chain reaction lets us make many copies of a DNA strand.
Study Hacks Â» How to Schedule Your Writing Like a Professional Writer

Click here for all of my del.icio.us bookmarks.

Biweekly links for 08/21/2009

Avatar
- First trailer for the new James Cameron film.
Public Library of Science announces new initiative for sharing influenza research
- “PLoS is launching PLoS Currents (Beta) â€” a new and experimental website for the rapid communication of research results and ideas. In response to the recent worldwide H1N1 influenza outbreak, the first PLoS Currents research theme is influenza.
  PLoS Currents: Influenza, which we are launching today, is built on three key components: a small expert research community that PLoS is working with to run the website; Google Knol with new features that allow content to be gathered together in collections after being vetted by expert moderators; and a new, independent database at the National Center for Biotechnology Information (NCBI) called Rapid Research Notes, where research targeted for rapid communication, such as the content in PLoS Currents: Influenza will be freely and permanently accessible. To ensure that researchers are properly credited for their work, PLoS Currents content will also be given a unique identifier by the NCBI so that it is citable.”
Nicholas Carr’s Blog: Close down the schools!
Cosma Shalizi’s course on Data Mining
- Lecture notes included. I wish I’d looked at these earlier – the bits I’ve read are very informative.
LogiLogi: Philosophy beyond the paper
- Thoughtful and stimulating discussion of how philosophy might benefit from the introduction of new online tools.
Science magazine and JoVE announce scientific-video partnership
- “Science, the journal of scientific research, news, and commentary published by The American Association for the Advancement of Science (AAAS), and JoVE, the scientific video journal, announced that they have entered into a partnership for joint production and publication of scientific videos online. The purpose of the partnership is to enhance scientific articles published in Science through video demonstrations of experimental techniques.
  Under the partnership, which is currently in its pilot phase, Science will select papers suitable for the video enhancement, and will identify author groups willing to help shape the video demonstrations. JoVE will then work with the authors to create the actual demonstrations, using the company’s platform for geographically distributed video-production. According to Stewart Wills, Online Editor at Science, direct, in-article video demonstrations should increase the value of Science research to its main audience, working scientists and students. “
The definitive, two-part answer to “is data journalism?” | Holovaty.com
- “It’s a hot topic among journalists right now: Is data journalism? Is it journalism to publish a raw database? Here, at last, is the definitive, two-part answer:
  1. Who cares?
  
  2. I hope my competitors waste their time arguing about this as long as possible.”
Sharing with Google Groups
- Potentially rather handy: you can share stuff on Google with entire groups: “As more and more businesses and organizations “go Google,” we find that many of the features we develop based on feedback from large enterprises end up benefiting all of our users. We recently rolled out improvements to the way Google Groups interacts with several of our applications. Now, sharing calendars, sites and documents with multiple people is easy â€” instead of adding people one at a time, you can simply share with an entire Google Group.”
Official Google Research Blog: On the predictability of Search Trends
- “As we see that many of the search trends are predictable, we are introducing today a new forecasting feature in Insights for Search, along with a new version of the product. The forecasting feature is applied to queries which are identified as predictable (see, for instance, basketball or the trends in the Automotive category) and then shown as an extrapolation of the historical trends and search patterns.”

Click here for all of my del.icio.us bookmarks.

Bertrand’s postulate

This post is motivated by the ongoing Polymath Project “finding primes”, an open source mathematics project whose aim (roughly speaking) is to find an efficient algorithm to deterministically generate a prime number of at least [tex]k[/tex] digits. The post isn’t a contribution to the ongoing research discussion, but rather describes background material I wanted to understand better: a proof of a mathematical theorem known as Bertrand’s Postulate, a simple result from the 19th century about how frequently prime numbers occur. It’s being cross-posted to the Polymath wiki. The post requires some familiarity with elementary number theory.

Despite its name, Bertrand’s postulate is a theorem, not an example of what we’d ordinarily call a postulate in mathematics. Here’s what it says:

Theorem (Bertrand’s Postulate): For every integer [tex]n \geq 2[/tex], there is a prime [tex]p[/tex] satisfying [tex]n < p < 2n[/tex].

Bertrand’s postulate is relevant to the finding primes problem because it guarantees the existence of a [tex]k[/tex]-digit prime for any [tex]k[/tex]. Brute force search thus yields a [tex]k[/tex]-digit prime after about [tex]O(10^k)[/tex] steps; this can be considered the “trivial bound” for the problem.

Bertrand’s postulate was apparently first proved by Chebyshev. For large [tex]n[/tex], the claim follows as a consequence of the prime number theorem. We will give an elementary proof due to Erdos.

Our strategy is to analyse the prime factorization of the binomial coefficient [tex]{2n \choose n}[/tex]. What we’ll discover is that for primes [tex]p \leq n[/tex], the corresponding prime power in the prime factorization of [tex]{2n \choose n}[/tex] is never very big. In fact, we can guarantee that those powers are quite small – much smaller than what one might a priori believe could be the case. What we’ll find as a consequence is that when we take the product of all the prime powers for [tex]p \leq n[/tex], the product can’t possibly be as big as [tex]{2n \choose n}[/tex]. And that means that there must be primes in the range [tex]n < p < 2n[/tex] which appear in the prime factorization of [tex]{2n \choose n}[/tex]. Rather than prove Bertrand's postulate directly, we'll first prove two results of independent interest. The first is a bound on the product of all primes less than [tex]n[/tex], due to Chebyshev. Lemma (Chebyshev bound): For integers [tex]n \geq 2[/tex], [tex]\prod_{p \leq n} p \leq 4^n[/tex], where the product is over all primes [tex]p[/tex] less than or equal to [tex]n[/tex].

The Chebyshev bound tells us that primes can’t occur too frequently (for example, with constant density) in the positive integers, for if they did, the product of primes on the left would rise much faster than [tex]4^n[/tex]. Although it’s not needed for a proof of Bertrand’s postulate, insight can be gained by expressing this idea in a simple quantitative way. Observe that if [tex]\pi(n)[/tex] is the number of primes less than or equal to [tex]n[/tex], then [tex]\pi(n)! \leq \prod_{p \leq n} p[/tex], simply because the [tex]k[/tex]th prime must be at least [tex]k[/tex]. Combining this observation with Chebyshev’s bound gives [tex]\pi(n)! \leq 4^n[/tex]. Taking logarithms of both sides, and applying Stirling’s approximation, we see that up to constant factors and small corrections, we have [tex]\pi(n) \ln \pi(n) \leq n[/tex]. This, in turn, implies that, up to constant factors and small corrections, [tex]\pi(n) \leq n \ln \ln n / \ln n[/tex]. Roughly speaking, the prime numbers are at most logarithmically dense in the positive integers. In fact, the prime number theorem tells us that [tex]\pi(n) \approx n / \ln n[/tex], as [tex]n[/tex] becomes large.

Proof of Chebyshev’s bound: We will induct on [tex]n[/tex]. The case when [tex]n=2[/tex] is obviously true. Furthermore, if we assume the result when [tex]n[/tex] is odd, it follows immediately for [tex]n+1[/tex], since the left-hand side is not affected by incrementing [tex]n[/tex]. So we assume the result is true up to some even number, [tex]n = 2m[/tex], and try to prove the result for [tex]2m+1[/tex]. What we’ll aim to show is that:

[tex] (*) \prod_{m+1 < p \leq 2m+1} p \leq 4^m [/tex] If we can prove this, then multiplying by the inequality [tex]\prod_{p \leq m+1} p \leq 4^{m+1}[/tex] (which is true by the inductive hypothesis) will give us the desired result. To prove (*), note that every prime [tex]p[/tex] satisfying [tex]m+1 < p \leq 2m+1[/tex] must divide [tex]{2m+1 \choose m}[/tex], since [tex]p[/tex] appears in the numerator, but not the denominator. Furthermore, the uniqueness of prime factorization means that the product of all such primes must also divide [tex]{2m+1 \choose m}[/tex], and so we have [tex] \prod_{m+1 < p \leq 2m+1} p \leq {2m+1 \choose m}. [/tex] The proof is completed by noting that [tex]{2m+1 \choose m}[/tex] appears twice in the binomial expansion of [tex](1+1)^{2m+1}[/tex], and thus [tex]{2m+1 \choose m} \leq 4^m[/tex]. QED

There are two main ideas in our proof of Chebyshev’s bound, and it’s worth pausing to appreciate them. The first idea is that there should be a relationship between products of primes and factorials, e.g.,

[tex] \prod_{m+1 < p \leq 2m+1} p \leq {2m+1 \choose m}. [/tex] The second is that there should be a relationship between combinatorial coefficients like [tex]{2m+1 \choose m}[/tex] and exponentials with fixed bases, e.g., [tex]{2m+1 \choose m} \leq 4^m[/tex]. Once these two ideas are firmly in mind, the proof of the Chebyshev bound is obvious. The second lemma used in our proof of Bertrand's postulate is a useful fact about the prime factorization of [tex]{2n \choose n}[/tex], telling us that [tex]{2n \choose n}[/tex] has a rather special prime factorization which only involves low powers. After the main statement of the lemma we will single out two special cases where the proof can be used to make statements that are a little stronger than the main statement. These cases are included explicitly only because they are useful later. Note that although they complicate the statement of the lemma somewhat, they are trivial consequences of the proof of the main statement in the lemma. Lemma: Let [tex]p[/tex] be a prime, and define [tex]r(p,n)[/tex] to be the largest number such that [tex]p^{r(p,n)}[/tex] divides [tex]{2n \choose n}[/tex]. Then [tex]p^{r(p,n)} \leq 2n[/tex]. There are two special cases where we can make stronger statements: (1) if [tex]p > \sqrt{2n}[/tex], then [tex]r(p,n)[/tex] is [tex]1[/tex] or [tex]0[/tex]; (2) if [tex]2n/3 < p \leq n[/tex], then [tex]r(p,n) = 0[/tex].

This lemma contains the essential insight underlying Bertrand’s postulate, and that we explained earlier: because we can guarantee that the prime powers are low, [tex]{2n \choose n}[/tex] must have many prime factors. A careful accounting using the Chebyshev bound will show that we simply don’t have enough if one of those prime factors isn’t in the range [tex]n < p < 2n[/tex]. Proof: To prove the lemma, first note that the prime power of [tex]p[/tex] in the factorization of [tex]n![/tex] is given by [tex]\sum_{j=1}^{\infty} \lfloor n / p^j \rfloor[/tex]. As a result, [tex]r(p,n) = \sum_{j=1}^\infty ( \lfloor 2n / p^j \rfloor – 2 \lfloor n / p^j \rfloor)[/tex]. All terms in this sum are either [tex]0[/tex] or [tex]1[/tex], depending on whether [tex]\lfloor 2n/p^j \rfloor[/tex] is even or odd, and they are always [tex]0[/tex] when [tex]j > \log_p (2n)[/tex], whence [tex]r(p,n) \leq \log_p (2n)[/tex], which gives the main result.

In the special case of (1), [tex]p > \sqrt{2n}[/tex], simply observe that all but the first term in the sum vanishes, and so [tex]r(p,n) = \lfloor 2n/ p \rfloor – 2 \lfloor n/p \rfloor[/tex], which is either [tex]0[/tex] or [tex]1[/tex], according to whether [tex]\lfloor 2n/p \rfloor[/tex] is even or odd.

In the special case of (2), [tex]2n/3 < p \leq n[/tex], observe again that all but the first term in the sum vanishes, and so [tex]r(p,n) = \lfloor 2n/p \rfloor - 2 \lfloor n/p \rfloor[/tex]. But for any such value of [tex]p[/tex] we see that [tex]\lfloor 2n/p \rfloor[/tex] is equal to [tex]2[/tex], which is even, and so [tex]r(p,n) = 0[/tex]. QED

The proof of Bertrand’s postulate is now straightforward.

Proof of Bertrand’s postulate: For a contradiction, suppose there are no primes satisfying [tex]n < p < 2n[/tex]. In light of case (2) of the last lemma, we can assume that all the prime factors in [tex]{2n \choose n}[/tex] satisfy [tex]p \leq 2n/3[/tex]. We split the prime factorizing of [tex]{2n \choose n}[/tex] up into the cases where [tex]p \leq \sqrt{2n}[/tex] and where [tex]\sqrt{2n} < p \leq 2n/3[/tex]. Applying the last lemma we obtain: [tex] {2n \choose n} \leq \prod_{p \leq \sqrt{2n}} (2n) \prod_{\sqrt{2n} < p \leq 2n/3} p [/tex] We can bound the first product by noting that there are no more than [tex]\sqrt{2n}[/tex] primes of size at most [tex]\sqrt{2n}[/tex], and bound the second product using Chebyshev's bound: [tex] {2n \choose n} \leq (2n)^{\sqrt 2n} 4^{2n/3}. [/tex] Observing that [tex]{2n \choose n}[/tex] is the largest of the [tex]2n+1[/tex] terms in the binomial expansion of [tex](1+1)^{2n}=4^n[/tex], we see that [tex] \frac{4^n}{2n+1} \leq {2n \choose n} [/tex] Combinining the last two inequalities and rearranging a little gives [tex] 4^{n/3} \leq (2n+1)(2n)^{\sqrt 2n}. [/tex] The left-hand side rises much faster than the right, and so this inequality can only be true over some finite set of [tex]n[/tex]. It should be straightforward to convince yourself that it fails for [tex]n > 2048[/tex], for example, and thus we must have [tex]n \leq 2048[/tex]. In fact, we can do quite a bit better than [tex]n \leq 2048[/tex] with a bit more work, but the bound suffices for our purposes. What we have now shown is that our initial assumption – no prime in the range [tex]n < p < 2n[/tex] - can only possibly hold when [tex]n \leq 2048[/tex]. But it's easily seen that the assumption is also false in that range, just by considering the sequence of primes [tex]2,3,5,7,13,23,43,83,163,317,631,1259[/tex], and [tex]2503[/tex]. Each prime in the sequence is less than twice its predecessor, and so there must be a prime between [tex]n[/tex] and [tex]2n[/tex] for any [tex]n \leq 2048[/tex]. QED

Biweekly links for 08/17/2009

Project Chanology – Wikipedia, the free encyclopedia
- Excellent Wikipedia article on the confrontations between Anonymous and the Church of Scientology.
rehash.nl
- Archive of materials (including video) from the series of hacker conferences in the Netherlands.
Measuring Distances in Google Maps
- How to measure distances along customized routes in Google Maps.
DebugAdvisor: A Recommender System for Debugging – Microsoft Research
- “In large software development projects, when a programmer is assigned a bug to fix, she typically spends a lot of time searching (in an ad-hoc manner) for instances from the past where similar bugs have been debugged, analyzed and resolved. Systematic search tools that allow the programmer to express the context of the current bug, and search through diverse data repositories associated with large projects can greatly improve the productivity of debugging. This paper presents the design, implementation and experience from such a search tool called DebugAdvisor.
  The context of a bug includes all the information a programmer has about the bug, including natural language text, textual rendering of core dumps, debugger output etc.
  
  Our key insight is to allow the programmer to collate this entire context as a query to search for related information. Thus, DebugAdvisor allows the programmer to search using a fat query, which could be kilobytes of structured and unstructured data…”
Happiness and unhappiness in east and west: Themes…[Emotion. 2009] – PubMed Result
- Why studies which ask people to self-assess “happiness” on some scale are of dubious utility (what do they measure?): “Cultural folk models of happiness and unhappiness are likely to have important bearings on social cognition and social behavior… the authors systematically analyzed American and Japanese participants’ spontaneously produced descriptions of the two emotions and observed… that whereas Americans associated positive hedonic experience of happiness with personal achievement, Japanese associated it with social harmony. Furthermore, Japanese were more likely than Americans to mention both social disruption and transcendental reappraisal as features of happiness… descriptions of unhappiness included various culture-specific coping actions: Whereas Americans focused on externalizing behavior (e.g., anger and aggression), Japanese highlighted transcendental reappraisal and self-improvement. Implications for research on culture and emotion are discussed.”
Genesis 1 – LOLCat Bible Translation Project
- “Boreded Ceiling Cat makinkgz Urf n stuffs
  1 Oh hai. In teh beginnin Ceiling Cat maded teh skiez An da Urfs, but he did not eated dem.
  
  2 Da Urfs no had shapez An haded dark face, An Ceiling Cat rode invisible bike over teh waterz.
  
  3 At start, no has lyte. An Ceiling Cat sayz, i can haz lite? An lite wuz.4 An Ceiling Cat sawed teh lite, to seez stuffs, An splitted teh lite from dark but taht wuz ok cuz kittehs can see in teh dark An not tripz over nethin.5 An Ceiling Cat sayed light Day An dark no Day. It were FURST!!!1”
Scylla and Charybdis
- “Institutions need both fixed representations and novel representations to remain organized and retain peopleâ€™s attention. Interpretive traditions, where the interpretand is fixed, face a special challenge in this regard. That they are able to resolve it successfully, most of the time, is a testament to the immense skill of our species as information managers.”
Thinking about Mario, Pompeii and the internet â€“ confused of calcutta
co5TARS
- Fun and informative way of visualizing the careers of actors, directors, etc.

Click here for all of my del.icio.us bookmarks.

Biweekly links for 08/14/2009

ongoing Â· Blog & Tweet
- Tim Bray: “Because whenever you see a vendor owning a communications medium, thatâ€™s part of the problem, not part of the solution. Even if the vendor is as lovable as Twitter; and I do love â€™em. So Iâ€™m going to route around the breakage, and you might want to think about doing the same.”
Motivic stuff
- Another interesting new mathematical blog, this one focusing on cohomology, homotopy theory, and arithmetic geometry.
Augmented Social Cognition: More details of changing editor resistance in Wikipedia
- Data showing that new editors are much more likely to have their edits reverted. Claims that this shows Wikipedia is become more resistant to new ideas. The obvious objection is that maybe all it shows is that the contributions of new editors aren’t very good compared to established editors. Stil, lots of interesting data.
Augmented Social Cognition: The slowing growth of Wikipedia: some data, models, and explanations
- Wikipedia’s growth rate has essentially plateaued.
David Byrne: So, How Does It Work on the Bus?
- Excellent article from David Byrne about being a rock star on tour. A few little tidbits: they place at least 4 shows a week to make ends meet, so there’s not a lot of time to stick around; the tour used to be viewed as a loss leader to sell albums (no more!); they usually depart just 90 minutes after show ends; and much more.
A Comparison of Open Source Search Engines Â« zooie’s blog
Benchmarking Amazon EC2 for High-Performance Scientific Computing
- Interesting, though many others factors will often need to be compared in practice. Abstract: “How effective are commercial cloud computers for high-performance scientific computing compared to currently available alternatives? I aim to answer a specific instance of this question by examining the performance of Amazon EC2 for high-performance scientific applications. I used macro and micro benchmarks to study the performance of a cluster composed of EC2 high-CPU compute nodes and compared this against the performance of a cluster composed of equivalent processors available to the open scientific research community. My results show a significant performance gap in the examined clusters that system builders, computational scientists, and commercial cloud computing vendors need to be aware of.”

Click here for all of my del.icio.us bookmarks.