This post is motivated by the ongoing Polymath Project “finding primes”, an open source mathematics project whose aim (roughly speaking) is to find an efficient algorithm to deterministically generate a prime number of at least digits. The post isn’t a contribution to the ongoing research discussion, but rather describes background material I wanted to understand better: a proof of a mathematical theorem known as Bertrand’s Postulate, a simple result from the 19th century about how frequently prime numbers occur. It’s being cross-posted to the Polymath wiki. The post requires some familiarity with elementary number theory.
Despite its name, Bertrand’s postulate is a theorem, not an example of what we’d ordinarily call a postulate in mathematics. Here’s what it says:
Theorem (Bertrand’s Postulate): For every integer , there is a prime satisfying .
Bertrand’s postulate is relevant to the finding primes problem because it guarantees the existence of a -digit prime for any . Brute force search thus yields a -digit prime after about steps; this can be considered the “trivial bound” for the problem.
Bertrand’s postulate was apparently first proved by Chebyshev. For large , the claim follows as a consequence of the prime number theorem. We will give an elementary proof due to Erdos.
Our strategy is to analyse the prime factorization of the binomial coefficient . What we’ll discover is that for primes , the corresponding prime power in the prime factorization of is never very big. In fact, we can guarantee that those powers are quite small – much smaller than what one might a priori believe could be the case. What we’ll find as a consequence is that when we take the product of all the prime powers for , the product can’t possibly be as big as . And that means that there must be primes in the range which appear in the prime factorization of .
Rather than prove Bertrand's postulate directly, we'll first prove two results of independent interest. The first is a bound on the product of all primes less than , due to Chebyshev.
Lemma (Chebyshev bound): For integers , , where the product is over all primes less than or equal to .
The Chebyshev bound tells us that primes can’t occur too frequently (for example, with constant density) in the positive integers, for if they did, the product of primes on the left would rise much faster than . Although it’s not needed for a proof of Bertrand’s postulate, insight can be gained by expressing this idea in a simple quantitative way. Observe that if is the number of primes less than or equal to , then , simply because the th prime must be at least . Combining this observation with Chebyshev’s bound gives . Taking logarithms of both sides, and applying Stirling’s approximation, we see that up to constant factors and small corrections, we have . This, in turn, implies that, up to constant factors and small corrections, . Roughly speaking, the prime numbers are at most logarithmically dense in the positive integers. In fact, the prime number theorem tells us that , as becomes large.
Proof of Chebyshev’s bound: We will induct on . The case when is obviously true. Furthermore, if we assume the result when is odd, it follows immediately for , since the left-hand side is not affected by incrementing . So we assume the result is true up to some even number, , and try to prove the result for . What we’ll aim to show is that:
If we can prove this, then multiplying by the inequality (which is true by the inductive hypothesis) will give us the desired result. To prove (*), note that every prime satisfying must divide , since appears in the numerator, but not the denominator. Furthermore, the uniqueness of prime factorization means that the product of all such primes must also divide , and so we have
The proof is completed by noting that appears twice in the binomial expansion of , and thus . QED
There are two main ideas in our proof of Chebyshev’s bound, and it’s worth pausing to appreciate them. The first idea is that there should be a relationship between products of primes and factorials, e.g.,
The second is that there should be a relationship between combinatorial coefficients like and exponentials with fixed bases, e.g., . Once these two ideas are firmly in mind, the proof of the Chebyshev bound is obvious.
The second lemma used in our proof of Bertrand's postulate is a useful fact about the prime factorization of , telling us that has a rather special prime factorization which only involves low powers. After the main statement of the lemma we will single out two special cases where the proof can be used to make statements that are a little stronger than the main statement. These cases are included explicitly only because they are useful later. Note that although they complicate the statement of the lemma somewhat, they are trivial consequences of the proof of the main statement in the lemma.
Lemma: Let be a prime, and define to be the largest number such that divides . Then . There are two special cases where we can make stronger statements: (1) if , then is or ; (2) if , then .
This lemma contains the essential insight underlying Bertrand’s postulate, and that we explained earlier: because we can guarantee that the prime powers are low, must have many prime factors. A careful accounting using the Chebyshev bound will show that we simply don’t have enough if one of those prime factors isn’t in the range .
Proof: To prove the lemma, first note that the prime power of in the factorization of is given by . As a result, . All terms in this sum are either or , depending on whether is even or odd, and they are always when , whence , which gives the main result.
In the special case of (1), , simply observe that all but the first term in the sum vanishes, and so , which is either or , according to whether is even or odd.
In the special case of (2), , observe again that all but the first term in the sum vanishes, and so . But for any such value of we see that is equal to , which is even, and so . QED
The proof of Bertrand’s postulate is now straightforward.
Proof of Bertrand’s postulate: For a contradiction, suppose there are no primes satisfying . In light of case (2) of the last lemma, we can assume that all the prime factors in satisfy . We split the prime factorizing of up into the cases where and where . Applying the last lemma we obtain:
We can bound the first product by noting that there are no more than primes of size at most , and bound the second product using Chebyshev's bound:
Observing that is the largest of the terms in the binomial expansion of , we see that
Combinining the last two inequalities and rearranging a little gives
The left-hand side rises much faster than the right, and so this inequality can only be true over some finite set of . It should be straightforward to convince yourself that it fails for , for example, and thus we must have . In fact, we can do quite a bit better than with a bit more work, but the bound suffices for our purposes. What we have now shown is that our initial assumption – no prime in the range - can only possibly hold when . But it's easily seen that the assumption is also false in that range, just by considering the sequence of primes , and . Each prime in the sequence is less than twice its predecessor, and so there must be a prime between and for any . QED