Finding primes
This is the main blog page for the Polymath4 project "Finding primes".
The main aim of the project is to resolve the following conjecture:
- (Strong) conjecture. There exists deterministic algorithm which, when given an integer k, is guaranteed to find a prime of at least k digits in length of time polynomial in k. You may assume as many standard conjectures in number theory (e.g. the generalised Riemann hypothesis) as necessary, but avoid powerful conjectures in complexity theory (e.g. P=BPP) if possible.
Since primality is known to be in P by the AKS algorithm, we may assume a primality oracle that decides whether any given number is prime in unit time. Weaker versions of the strong conjecture are proposed below.
Contributions and corrections to this wiki page (and its subpages) are very welcome.
Threads
- proposal for the project (Inactive)
- The current research thread (Active)
- The current discussion thread (Active)
Easier versions of the problem
- Semi-weak conjecture: There exists a probabilistic algorithm which, when given an integer k, has probability at least 1/2 (say) of finding a prime of at least k digits in length in time polynomial in k, using o(k) random bits. (Note that it is known that one can do this with O(k) random bits; and if one can use just [math]\displaystyle{ O(\log k) }[/math] random bits, one is done by exhausting over the random bits.)
One way to solve the semi-weak conjecture would be to find an "explorable" set of size [math]\displaystyle{ \exp(o(k)) }[/math] of integers, a significant fraction (at least [math]\displaystyle{ k^{-O(1)} }[/math]) of which are primes at least k digits in length. By explorable, we mean that one has a quick way to produce a reasonably uniformly distributed element of that set using o(k) random bits (e.g. the set might be an interval, arithmetic progression, or other explicitly parameterisable set, or one might have an explicit expander graph on this set).
- Weak conjecture: There exists a deterministic algorithm, which, when given an integer k, is guaranteed to find a prime of at least [math]\displaystyle{ \omega(\log k) }[/math] digits in length in time polynomial in k, where [math]\displaystyle{ \omega(\log k) }[/math] grows faster than [math]\displaystyle{ \log k }[/math] as [math]\displaystyle{ k \to \infty }[/math]. (Equivalently, find a deterministic algorithm which, when given k, is guaranteed to find a prime of at least k digits in time [math]\displaystyle{ \exp(o(k)) }[/math]. Note that the semi-weak conjecture implies the weak conjecture).
- Strong conjecture with factoring: Same as the original conjecture, but assume that one has a factoring oracle will return the factors of any number given to it (as a stream of bits) in unit time. (This oracle can be given "for free" if factoring is in P, but we can also assume the existence of this oracle even if factoring is not in P.)
- Semi-weak conjecture with factoring: Same as the semi-weak conjecture, but with a factoring oracle.
- Weak conjecture with factoring: Same as the weak conjecture, but with a factoring oracle. (Note that there are deterministic sieve algorithms, such as the quadratic sieve, which are believed to run in subexponential time [math]\displaystyle{ \exp(o(k)) }[/math], so this oracle is in some sense available "for free" for this problem assuming this belief.)
The weak conjecture with factoring is the easiest of the six, and so perhaps one should focus on that one first.
It is also of interest to see whether the conjecture becomes easier to solve assuming that P=BPP or P=BQP.
Partial results
- P=NP implies a deterministic algorithm to find primes
- An efficient algorithm exists if Cramer's conjecture holds
- k-digit primes can be found with high probability using O(k) random bits
- If [Pseudo-random generators (PRG)|pseudorandom number generators] exist, the conjecture holds, as one can derandomise the above algorithm by substituting the pseudorandom number generator for the random string of bits.
- The function field analogue (i.e. finding degree k irreducible polynomials over a finite field) is known [AL???]
- The analogue over the rationals is even easier: Eisenstein's criterion provides plenty of irreducible polynomials of degree k, e.g. [math]\displaystyle{ x^k-2 }[/math].
- Assuming a suitably quantitative version of Schinzel's hypothesis H (namely, the Bateman-Horn conjecture), the weak conjecture is true, as one can simply search over all k-digit numbers of the form [math]\displaystyle{ n^d-2 }[/math] (say) for any fixed d to give an algorithm that works in time [math]\displaystyle{ O(10^{k/d}) }[/math]. By randomly sampling such integers, one also establishes the semi-weak conjecture (assuming that one has the expected density for primes in hypothesis H).
- It's easy to deterministically find large square-free numbers; just multiply lots of small primes together. However, it is not known how to test a number for square-freeness deterministically and quickly (unless one has a factoring oracle, in which case it is trivial).
- However, it's not so obvious how to find large pairs [math]\displaystyle{ n,n+1 }[/math] of consecutive square-free numbers, though such pairs must exist by counting arguments (the density of square-free numbers is [math]\displaystyle{ 6/\pi^2 \approx 60\% }[/math]). This could be a good variant problem.
- It is conjectured that every element of Sylvester's sequence is square-free. If so, this solves the above problem.
- However, it's not so obvious how to find large pairs [math]\displaystyle{ n,n+1 }[/math] of consecutive square-free numbers, though such pairs must exist by counting arguments (the density of square-free numbers is [math]\displaystyle{ 6/\pi^2 \approx 60\% }[/math]). This could be a good variant problem.
- If a fast algorithm to find primes exists, then an explicit fast algorithm to find primes exists: simply run all algorithms of size at most A simultaneously for time [math]\displaystyle{ k^A }[/math], then increment A from 1 onward, until one gets a hit.
Here are the current "world records" for the fastest way to deterministically (and provably) obtain a k-digit prime (ignoring errors of [math]\displaystyle{ \exp(o(k)) }[/math] in the run time):
- [math]\displaystyle{ O(10^k) }[/math]: The trivial algorithm of testing each of the k-digit numbers in turn until one gets a hit.
- [math]\displaystyle{ O(10^k)^{3/4} }[/math]: Test the k-digit numbers of the form [math]\displaystyle{ a^2+b^4 }[/math] for primality (the Friedlander-Iwaniec theorem guarantees a hit for k large enough).
- [math]\displaystyle{ O(10^k)^{2/3} }[/math]: Test the k-digit numbers of the form [math]\displaystyle{ a^3+2b^3 }[/math] for primality (a hit is guaranteed for large k by a result of Heath-Brown).
- [math]\displaystyle{ O(10^k)^{0.525} }[/math]: Test the k-digit numbers from, say, [math]\displaystyle{ 10^k }[/math] onwards until one gets a hit (here we use the [BHP2001] bound on prime gaps).
- [math]\displaystyle{ O(10^k)^{1/2} }[/math] assuming RH: Test the k-digit numbers from, say, [math]\displaystyle{ 10^k }[/math] onwards until one gets a hit (here we use the RH bound on prime gaps).
Note that if one can get [math]\displaystyle{ O(10^k)^{\varepsilon} }[/math] for each fixed [math]\displaystyle{ \varepsilon \gt 0 }[/math], then one has established the weak conjecture.
A slight variant of the problem: assuming a factoring oracle, given an algorithm that runs in k steps, how large a prime is the algorithm guaranteed to produce in the worst-case scenario? (Note that this is very different from what the algorithm is heuristically predicted to produce in the average-case scenario.)
Here is a simple algorithm that produces a prime larger than about [math]\displaystyle{ k \log k }[/math] in k steps, based on Euclid's proof of the infinitude of primes:
- Initialise [math]\displaystyle{ p_1=2 }[/math].
- Once [math]\displaystyle{ p_1,\ldots,p_{k-1} }[/math] are computed, define [math]\displaystyle{ p_k }[/math] to be the largest prime factor of [math]\displaystyle{ p_1 \ldots p_{k-1}+1 }[/math].
This generates k distinct primes; the prime number theorem tells us that the k^th prime has size about [math]\displaystyle{ k \log k }[/math], so at least one of the primes produced here has size at least [math]\displaystyle{ k \log k }[/math].
If one instead works with Sylvester's sequence [math]\displaystyle{ a_k = a_1 \ldots a_{k-1}+1 }[/math], then it is known [O1985] that the number of primes less than n that can divide any one of the [math]\displaystyle{ a_k }[/math] is [math]\displaystyle{ O( n / \log n \log \log \log n ) }[/math] rather than [math]\displaystyle{ O(n/\log n) }[/math] (the prime number theorem bound). If we then factor the first k elements of this sequence, we must get a prime of size at least [math]\displaystyle{ k \log k \log \log \log k }[/math] or so.
One can do better still by working with the Fermat numbers, [math]\displaystyle{ F_n = 2^{2^n}+1 }[/math]. It is known [KLS2002] that the number of primes dividing any of the [math]\displaystyle{ F_n }[/math] is [math]\displaystyle{ O(\sqrt{n}/\log n) }[/math], so in particular if we factor the first k Fermat numbers, we get a prime almost of size [math]\displaystyle{ k^2 }[/math]. In fact, it's a classical result of Euler that the prime divisors of [math]\displaystyle{ F_n }[/math] are all at least [math]\displaystyle{ 2^{n+2}+1 }[/math], so we can obtain a large prime simply by factoring a Fermat number.
Assuming RH, one can find a prime larger than [math]\displaystyle{ k^2 }[/math] in about k steps by the method indicated earlier (start at [math]\displaystyle{ k^2 }[/math] and increment until one hits a prime). Unconditionally, one gets a prime larger than [math]\displaystyle{ k^{1/0.525} }[/math] by this method. Currently this is the world record.
Variants of the problem
- Can one derandomise factoring of polynomials over finite fields? Note that if one could do this even in the quadratic case, this would allow for efficient location of quadratic non-residues, which is not known.
Finding consecutive square-free integers
- Toy problem What is the fastest way to find consecutive square-free integers n, n+1 of at least k digits in length assuming a square-free oracle (a special case of the factoring oracle)?
Note that finding one large square-free number is easy: just multiply lots of distinct small primes together. Also, as the density of square-free numbers is 6/\pi^2 \approx 60\%, a counting argument shows that pairs of consecutive square-free numbers exist in abundance, and are thus easy to find probabilistically. The density can be increased to arbitrarily close to 1 using the W-trick, so one could also ask for, say, long arithmetic progressions of square-free numbers also.
If Sylvester's sequence is square-free, then this solves the problem.
Testing a number for square-freeness is trivial with a factoring oracle. No unconditional polynomial-time deterministic algorithm for square-freeness seems to be known. (It is listed as an open problem in this paper from 1994.
Unconditionally, there are asymptotics for the number of square-free numbers in [1,N] with an error term of [math]\displaystyle{ O(N^{1/2}) }[/math], which leads to a [math]\displaystyle{ O(10^k)^{1/2} }[/math] algorithm (just search an interval of this length for square-freeness). This is the current unconditional world record. Assuming RH, this can be improved to [math]\displaystyle{ O(10^k)^{17/54+\varepsilon} }[/math].
There is an asymptotic [HB1984]for the number of consecutive square free numbers with an error term of [math]\displaystyle{ O(N^{7/11}) }[/math], leading to a [math]\displaystyle{ O(10^k)^{17/54+\varepsilon} }[/math], though this is inferior to the above bounds.
It is known [FT????] that the interval [math]\displaystyle{ [N, N+N^{3/14}] }[/math] contains at least one square-free number; Granville [G????] improved this to [math]\displaystyle{ [N,N+N^{\varepsilon}] }[/math] assuming the ABC conjecture. If the density in such intervals can be made to exceed 50% (perhaps after first applying the W-trick) then we would have solved the weak version of this toy problem assuming ABC.
Finding pseudoprimes
- Problem: How quickly can one find a solution to the congruence [math]\displaystyle{ 2^{n-1}=2 \hbox{ mod } n }[/math] with n at least k digits in length?
It is conjectured that the largest gap between k-digit pseudoprimes is poly(k) (this is weaker than Cramer's conjecture, and would imply a polynomial time solution to the above problem).
It is conjectured that the number [math]\displaystyle{ \mathcal{P}_a(x) }[/math] of pseudoprimes to any base [math]\displaystyle{ a \neq \pm1 }[/math] satisfies
- [math]\displaystyle{ \displaystyle \log\mathcal{P}_a(x) = \log(x) - \log(x)\frac{\log\log\log(x)}{\log\log(x)}(1+o(1)). }[/math]
However, the best known bounds seem to be
- [math]\displaystyle{ \displaystyle \log\mathcal{P}_a(x) \leq \log(x) - \frac{1}{2}\log(x)\frac{\log\log\log(x)}{\log\log(x)}(1+o(1)), }[/math]
and
- [math]\displaystyle{ \displaystyle \log\mathcal{P}_a(x) \geq (\frac{\log(x)}{\log a})^{E/(E+1)-o(1)}, }[/math]
where E is a constant which is conjectured to equal 1, but is only known to have value at least 1-1/(2\sqrt{e}). These results can be found in [P1981], [P1982], [P1989]. Better bounds are known to hold on average [EP1986].
Observations and strategies
- For the conjecture with factoring, it would suffice to show that every polylog(N)-length interval in [N,2N] contains a number which is not [math]\displaystyle{ N^\varepsilon }[/math]-smooth (or [math]\displaystyle{ \log^{O(1)} N }[/math] smooth, if one only wants to solve the weak conjecture with factoring). One way to start approaching this sort of problem is to take iterated product sets of sets of medium sized primes and show that they spread out to fill all short intervals in [N,2N].
- One can combine this with the "W-trick". For instance, if one lets W be the product of all the primes less than k, and if one can show that not every element of the progression [math]\displaystyle{ \{ Wn+1: 1 \leq n \leq k^{100} \} }[/math] (say) is [math]\displaystyle{ k^A }[/math]-smooth, then one has found a prime with at least [math]\displaystyle{ A \log k }[/math] digits in polynomial time. If one can make A arbitrarily large, one has solved the baby problem with factoring.
- If one can find explicitly enumerable sets of k-digit numbers of size [math]\displaystyle{ O(10^{\varepsilon k}) }[/math] for any fixed [math]\displaystyle{ \varepsilon \gt 0 }[/math] that are guaranteed to contain a prime, then one has solved the weak conjecture. If instead one can find randomly samplable sets of k-digit numbers of size [math]\displaystyle{ O(10^{\varepsilon k}) }[/math] that contain a large fraction (e.g. [math]\displaystyle{ \exp(-o(k)) }[/math] of primes), one has solved the semi-weak conjecture. If one can find explicit sets of poly(k) size that are guaranteed to contain a prime, one has solved the strong conjecture.
- If a primality test can be expressed as a constant-degree polynomial over a finite field from the digits of the number-under-test to {0,1}, then known pseudorandom generators for polynomials can be used to derandomize finding primes the same way as under the full P=BPP assumption. So, instead of requiring a general PRG, the proposal is trying to put the primality test into a fully derandomizable subclass of P.
- If one can find a sequence of easily constructible, mutually coprime integers whose prime factors are contained in a sparse subset of the primes, this generates (with the aid of a factoring oracle) a sparse set of primes. So far the sparsest set of this type has come from factoring the Fermat numbers.
Negative results and other obstructions
- Using sieve theory to find a non-smooth number in a short interval runs into the difficulty that most sieve theory methods are insensitive to the starting point of an interval. Given that every integer in [1,n] is n-smooth, this implies that sieve theory is unlikely to establish the existence of a non-n-smooth number in any interval of length n, and in particular to find a non-[math]\displaystyle{ O(k^O(1)) }[/math] smooth integer in any interval that can be searched in polynomial time. Indeed, there are heuristic arguments that suggest that even augmented with basic Fourier analysis and assuming RH, this method is unlikely to work, although they haven't yet been fully formalized.
- Average-case results for the primes cannot be used directly, because one could delete all constructible integers from the set of primes (a negligible set) without significantly impacting the average case results, while causing the conjecture to fail for this modified set of primes.
- It is not possible to prove the conjecture purely by "soft" (i.e. relativisable) complexity theory methods which do not use any additional information on the primes rather than their density. Details are at Oracle counterexample to finding pseudoprimes.
How could the conjecture be false?
A world in which the conjecture is negative would be very strange, especially with factoring or if the weak conjecture fails also. In particular, the following would have to be true:
- All numbers with at least k digits that are "constructible" in the sense that they are the output of a Turing machine program of length [math]\displaystyle{ O(\log k) }[/math] that runs in time [math]\displaystyle{ k^{O(1)} }[/math], are composite. [If the conjecture fails with factoring, then these numbers are furthermore [math]\displaystyle{ \exp(\varepsilon k) }[/math] smooth for any fixed [math]\displaystyle{ \varepsilon \gt 0 }[/math]; if even the weak conjecture with factoring fails, they are [math]\displaystyle{ k^{O(1)} }[/math]-smooth.]
- In fact, all constructible numbers are not only composite/smooth, but they sit inside intervals of length [math]\displaystyle{ k^{100} }[/math], all of whose elements are composite/smooth. (One can replace 100 by any other fixed number.)
- In particular, Cramer's conjecture fails; the gaps between adjacent k-digit primes exceeds [math]\displaystyle{ k^{100} }[/math] around every constructible number.
- No pseudorandom number generators exist. In particular, this implies that one-way functions do not exist.
- [math]\displaystyle{ P \neq NP }[/math]. More specifically, the decision problem "Does there exist a prime in a given interval?" is in NP, but cannot lie in P if the conjecture fails.
Relevant concepts
Complexity theory concepts:
- Complexity classes
- Pseudo-random generators (PRG)
- Full derandomization
- one-way function
- Impagliazzo's five worlds: Algorithmica, Heuristica, Pessiland, Minicrypt, Cryptomania
- Kolmogorov complexity
- Oracles
Number theory concepts:
- Cramer's random model for the primes
- Prime gaps
- Generic prime
- Smooth number
- The W-trick
- Sylvester's sequence
- Siegel zero
- Asymptotic notation
- Fermat number
- Pseudoprimes
Other:
Note: articles with external links should eventually be replaced by in-house pages that can provide information that is more dedicated to this project.
Relevant conjectures
- P=NP
- P=BPP
- P=promise-BPP
- P=BQP
- existence of PRG
- existence of one-way functions
- whether DTIME(2^n) has subexponential circuits
- The Riemann Hypothesis (RH)
- the Hardy-Littlewood prime tuples conjecture
- the ABC conjecture
- Cramer's conjecture
- Schinzel's hypothesis H
- discrete log in P
- factoring in P
Relevant results
- Brun-Titchmarsh inequality
- Erdos-Kac theorem
- Bertrand's postulate
- Friedlander-Iwaniec theorem
- Conditional on the Riemann hypothesis
Relevant papers
- [AMcC1994] L. Adelman, K. McCurley, Open problems in number theoretic complexity, II, Lecture Notes In Computer Science; Vol. 877, Proceedings of the First International Symposium on Algorithmic Number Theory 291 - 322 (1994)
- [AL???] L. Adelman, H. Lenstra, "Finding irreducible polynomials over finite fields", ???
- [BH1996] R. C. Baker and G. Harman, “The difference between consecutive primes,” Proc. Lond. Math. Soc., series 3, 72 (1996) 261–280.
- [BHP2001] R. C. Baker, G. Harman and J. Pintz, The difference between consecutive primes, II, Proceedings of the London Mathematical Society 83, (2001), 532–562.
- [BH1998] A. Balog, T. Wooley, “On strings of consecutive integers with no large prime factors”, J. Austral. Math. Soc. Ser. A 64 (1998), no. 2, 266–276
- [D1998] P. Dusart, Autour de la fonction qui compte le nombre de nombres premiers, 1998
- [EP1986] P. Erdős and C. Pomerance, On the number of false witnesses for a composite number, Math. Comp. 46 (1986), 259-279.
- [FT????] M. Filaseta, O. Trifonov, On gaps between squarefree numbers, Analytic number theory: Proceedings of a conference in honor of Paul Bateman
- [GW2002] O. Goldreich, A. Wigderson, Derandomization that is rarely wrong from short advice that is typically good
- [G???] A. Granville, ABC Allows Us to Count Squarefrees, ???
- [G1995] A. Granville, "Harald Cramér and the distribution of prime numbers", Scandinavian Actuarial J. 1 (1995), 12—28.
- [HB1984] D.R. Heath-Brown, The square sieve and consecutive square-free numbers, Math. Ann. 266 (1984), 251-259
- [IW1997] R. Impagliazzo, A. Wigderson, P = BPP unless E has sub-exponential circuits. In Proceedings of the 29th ACM Symposium on Theory of Computing, pages 220–229, 1997.
- [J2006] R. Jones, The density of prime divisors in the arithmetic dynamics of quadratic polynomials
- [KLS2002] Křížek, Luca and Somer, On the convergence of series of reciprocals of primes related to the Fermat numbers, J. Number Theory 97(2002), 95–112
- [M1994] U. Maurer, Fast Generation of Prime Numbers and Secure Public-Key Cryptographic Parameters, Journal of Cryptology 8 (1994), 123-155
- [O1985] Odoni, On the prime divisors of the sequence [math]\displaystyle{ w_n+1 = 1+ w_1 \ldots w_n }[/math]. J. London Math. Soc. (2), 32(1):1–11, 1985.
- [P1981] C. Pomerance, On the distribution of pseudoprimes, Math. Comp. 37 (1981), 587-593.
- [P1982] C. Pomerance, A new lower bound for the pseudoprime counting function, Illinois J. Math. 26 (1982), 4-9.
- [P1989] C. Pomerance, Two methods in elementary analytic number theory, Number theory and applications, R. A. Mollin, ed., Kluwer Academic Publishers, Dordrecht, 1989, pp. 135-161.
- [S2006] K. Soundararajan, The distribution of the primes (survey)
- [S2006b] K. Soundararajan, Small gaps between prime numbers: The work of Goldston-Pintz-Yildirim
- [T2006] L. Trevisan, Pseudorandomness and Combinatorial Constructions
Please help out by adding links or completing references, or adding further references to the list above!
External links
- Complexity Zoo
- "Finding Primes", Lance Fortnow, August 4 2009