# Finding primes

This is the main blog page for the Polymath4 project "Finding primes".

The main aim of the project is to resolve the following conjecture:

(Strong) conjecture. There exists a deterministic algorithm which, when given an integer k, is guaranteed to find a prime of at least k digits in length, in time polynomial in k. One may assume as many standard conjectures in number theory (e.g. the generalised Riemann hypothesis) as necessary, but should avoid powerful conjectures in complexity theory (e.g. P=BPP) if possible.

Since primality is known to be in P by the AKS algorithm, we may assume a primality oracle that decides whether any given number is prime in unit time. Weaker versions of the strong conjecture are proposed below.
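For small experiments one needs a concrete stand-in for this oracle. The sketch below (the function name is ours) uses the deterministic variant of Miller-Rabin, whose fixed witness set is known to be correct for all n below about $3.3 \times 10^{24}$; AKS itself, while polynomial-time, is rarely used in practice.

```python
def is_prime(n: int) -> bool:
    """Deterministic Miller-Rabin: the witness set below is known to be
    correct for all n < 3.3 * 10**24, far beyond toy experiments."""
    if n < 2:
        return False
    witnesses = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
    for p in witnesses:
        if n % p == 0:
            return n == p
    # write n - 1 = d * 2**s with d odd
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in witnesses:
        x = pow(a, d, n)  # modular exponentiation
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a witnesses compositeness
    return True
```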

Contributions and corrections to this wiki page (and its subpages) are very welcome.

1. The proposal for the project (Inactive)
2. The current research thread (Active)
3. The current discussion thread (Active)

## Easier versions of the problem

Semi-weak conjecture: There exists a probabilistic algorithm which, when given an integer k, has probability at least 1/2 (say) of finding a prime of at least k digits in length in time polynomial in k, using o(k) random bits. (Note that it is known that one can do this with O(k) random bits; and if one can use just $O(\log k)$ random bits, one is done by exhausting over the random bits.)

One way to solve the semi-weak conjecture would be to find an "explorable" set of size $\exp(o(k))$ of integers, a significant fraction (at least $k^{-O(1)}$) of which are primes at least k digits in length. By explorable, we mean that one has a quick way to produce a reasonably uniformly distributed element of that set using o(k) random bits (e.g. the set might be an interval, arithmetic progression, or other explicitly parameterisable set, or one might have an explicit expander graph on this set).
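To make "explorable" concrete, here is a toy sampler (function name and parameters are ours): an arithmetic progression of length $2^m$ can be sampled uniformly with exactly m random bits.

```python
import secrets

def sample_progression(a: int, d: int, m: int) -> int:
    """Uniform element of the explorable set {a + d*n : 0 <= n < 2**m},
    consuming exactly m random bits."""
    n = secrets.randbits(m)  # m fair coin flips
    return a + d * n
```

An interval is the case d = 1. The open question is whether some explicitly parameterisable set of size $\exp(o(k))$ of this kind contains k-digit primes at density $k^{-O(1)}$.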

Weak conjecture: There exists a deterministic algorithm which, when given an integer k, is guaranteed to find a prime of at least $\omega(\log k)$ digits in length in time polynomial in k, where $\omega(\log k)$ is some function growing faster than $\log k$ as $k \to \infty$. (Equivalently, find a deterministic algorithm which, when given k, is guaranteed to find a prime of at least k digits in time $\exp(o(k))$. Note that the semi-weak conjecture implies the weak conjecture.)

Strong conjecture with factoring: Same as the original conjecture, but assume that one has a factoring oracle which will return the factors of any number given to it (as a stream of bits) in unit time. (This oracle can be given "for free" if factoring is in P, but we can also assume the existence of this oracle even if factoring is not in P.)

Semi-weak conjecture with factoring: Same as the semi-weak conjecture, but with a factoring oracle.

Weak conjecture with factoring: Same as the weak conjecture, but with a factoring oracle. (Note that there are deterministic factoring algorithms, such as the quadratic sieve, which are believed to run in subexponential time $\exp(o(k))$, so this oracle is in some sense available "for free" for this problem, assuming this belief.)

The weak conjecture with factoring is the easiest of the six, and so perhaps one should focus on that one first.

It is also of interest to see whether the conjecture becomes easier to solve assuming that P=BPP or P=BQP.

## Partial results

1. P=NP implies a deterministic algorithm to find primes
2. An efficient algorithm exists if Cramér's conjecture holds
3. k-digit primes can be found with high probability using O(k) random bits
1. If pseudorandom number generators (PRGs) exist, the conjecture holds, as one can derandomise the above algorithm by substituting the pseudorandom generator for the random string of bits.
4. The function field analogue (i.e. finding degree k irreducible polynomials over a finite field) is known (see Adleman-Lenstra)
1. The analogue over the rationals is even easier: Eisenstein's criterion provides plenty of irreducible polynomials of degree k, e.g. $x^k-2$.
5. Assuming a suitably quantitative version of Schinzel's hypothesis H (namely, the Bateman-Horn conjecture), the weak conjecture is true, as one can simply search over all k-digit numbers of the form $n^d-2$ (say) for any fixed d to give an algorithm that works in time $O(10^{k/d})$. By randomly sampling such integers, one also establishes the semi-weak conjecture (assuming that one has the expected density for primes in hypothesis H).
6. It's easy to deterministically find large square-free numbers; just multiply lots of small primes together. However, it is not known how to test a number for square-freeness deterministically and quickly (unless one has a factoring oracle, in which case it is trivial).
1. However, it's not so obvious how to find large pairs $n,n+1$ of consecutive square-free numbers, though such pairs must exist by counting arguments (the density of square-free numbers is $6/\pi^2 \approx 61\%$). This could be a good variant problem.
1. It is conjectured that every element of Sylvester's sequence is square-free. If so, this solves the above problem.
7. If a fast algorithm to find primes exists, then an explicit fast algorithm to find primes exists: simply run all algorithms of size at most A simultaneously for time $k^A$, then increment A from 1 onward, until one gets a hit.
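The easy half of item 6 can be sketched as follows (a minimal illustration; the helper name is ours): multiply the primes below a bound, and the result is square-free by construction, with roughly $\mathrm{bound}/\ln 10$ digits by the prime number theorem.

```python
def squarefree_primorial(bound: int) -> int:
    """Multiply all primes below `bound`; the product is square-free
    by construction, since each prime appears exactly once."""
    # sieve of Eratosthenes up to `bound`
    sieve = bytearray([1]) * bound
    sieve[:2] = b"\x00\x00"
    for p in range(2, int(bound ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    out = 1
    for p in range(bound):
        if sieve[p]:
            out *= p
    return out
```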

Here are the current "world records" for the fastest way to deterministically (and provably) obtain a k-digit prime (ignoring errors of $\exp(o(k))$ in the run time):

1. $O(10^k)$: The trivial algorithm of testing each of the k-digit numbers in turn until one gets a hit.
2. $O(10^k)^{3/4}$: Test the k-digit numbers of the form $a^2+b^4$ for primality (the Friedlander-Iwaniec theorem guarantees a hit for k large enough).
3. $O(10^k)^{2/3}$: Test the k-digit numbers of the form $a^3+2b^3$ for primality (a hit is guaranteed for large k by a result of Heath-Brown).
4. $O(10^k)^{0.525}$: Test the k-digit numbers from, say, $10^k$ onwards until one gets a hit (here we use the Baker-Harman-Pintz bound on prime gaps).
5. $O(10^k)^{1/2}$ assuming RH: Test the k-digit numbers from, say, $10^k$ onwards until one gets a hit (here we use the RH bound on prime gaps).

Note that if one can get $O(10^k)^{\varepsilon}$ for each fixed $\varepsilon \gt 0$, then one has established the weak conjecture.
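Records 4 and 5 are both instances of the following one-line strategy, sketched here with naive trial division standing in for the primality oracle (in practice one would use Miller-Rabin or AKS; function names are ours). The run-time analysis rests entirely on prime-gap bounds, not on the code.

```python
def is_prime(n: int) -> bool:
    """Trial division: a slow stand-in for the primality oracle."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def first_prime_with_k_digits(k: int) -> int:
    """Scan upward from 10**(k-1) until a prime appears; the number of
    steps is controlled by the prime-gap bound one is willing to assume
    (Baker-Harman-Pintz unconditionally, or RH conditionally)."""
    n = 10 ** (k - 1)
    while not is_prime(n):
        n += 1
    return n
```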

A slight variant of the problem: assuming a factoring oracle, given an algorithm that runs in k steps, how large a prime is the algorithm guaranteed to produce in the worst-case scenario? (Note that this is very different from what the algorithm is heuristically predicted to produce in the average-case scenario.)

Here is a simple algorithm that produces a prime larger than about $k \log k$ in k steps, based on Euclid's proof of the infinitude of primes:

• Initialise $p_1=2$.
• Once $p_1,\ldots,p_{k-1}$ are computed, define $p_k$ to be the largest prime factor of $p_1 \ldots p_{k-1}+1$.

This generates k distinct primes; the prime number theorem tells us that the $k$-th prime has size about $k \log k$, so at least one of the primes produced here has size at least about $k \log k$.
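A minimal sketch of this iteration (trial division stands in for the factoring oracle; function names are ours):

```python
def largest_prime_factor(n: int) -> int:
    """Trial division: a slow stand-in for the factoring oracle."""
    d, largest = 2, 1
    while d * d <= n:
        while n % d == 0:
            largest, n = d, n // d
        d += 1
    return n if n > 1 else largest

def euclid_primes(k: int) -> list[int]:
    """p_1 = 2; thereafter p_j is the largest prime factor of
    p_1*...*p_{j-1} + 1. Each new prime is distinct from all earlier
    ones, since it divides a number coprime to their product."""
    primes = [2]
    while len(primes) < k:
        product = 1
        for p in primes:
            product *= p
        primes.append(largest_prime_factor(product + 1))
    return primes
```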

If one instead works with Sylvester's sequence $a_k = a_1 \ldots a_{k-1}+1$, then it is a result of Odoni that the number of primes less than n that can divide any one of the $a_k$ is $O( n / (\log n \log \log \log n) )$ rather than $O(n/\log n)$ (the prime number theorem bound). If we then factor the first k elements of this sequence, we must get a prime of size at least $k \log k \log \log \log k$ or so.
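For small experiments the sequence is easy to generate, since the defining product telescopes to the recursion $a_n = a_{n-1}^2 - a_{n-1} + 1$ (a sketch; the function name is ours):

```python
def sylvester(k: int) -> list[int]:
    """First k terms of Sylvester's sequence: a_1 = 2 and
    a_n = a_1*...*a_{n-1} + 1, which telescopes to
    a_n = a_{n-1}**2 - a_{n-1} + 1."""
    seq = [2]
    for _ in range(k - 1):
        a = seq[-1]
        seq.append(a * a - a + 1)
    return seq
```

Consecutive terms are pairwise coprime by construction, which is what makes the sequence useful for generating new primes.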

One can do better still by working with the Fermat numbers, $F_n = 2^{2^n}+1$. It is a result of Křížek, Luca and Somer that the number of primes dividing any of the $F_n$ is $O(\sqrt{n}/\log n)$, so in particular if we factor the first k Fermat numbers, we get a prime almost of size $k^2$. In fact, it is a classical result of Euler (sharpened by Lucas) that for $n \geq 2$ the prime divisors of $F_n$ are all congruent to 1 modulo $2^{n+2}$, and in particular are at least $2^{n+2}+1$, so we can obtain a large prime simply by factoring a Fermat number.
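Euler's famous factorisation of $F_5$ illustrates the divisor constraint (a small sanity check, not part of any algorithm):

```python
def fermat(n: int) -> int:
    """The n-th Fermat number 2**(2**n) + 1."""
    return 2 ** (2 ** n) + 1

# F_5 = 641 * 6700417 (Euler's factorisation); both prime factors are
# congruent to 1 modulo 2**(5+2) = 128, as the Euler/Lucas result on
# Fermat-number divisors predicts.
f5 = fermat(5)
factors = (641, 6700417)
assert 641 * 6700417 == f5
assert all(p % 2 ** (5 + 2) == 1 for p in factors)
```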

Assuming RH, one can find a prime larger than $k^2$ in about k steps by the method indicated earlier (start at $k^2$ and increment until one hits a prime). Unconditionally, one gets a prime larger than $k^{1/0.525}$ by this method. Currently this is the world record.

## Variants of the problem

1. Can one derandomise factoring of polynomials over finite fields? Note that if one could do this even in the quadratic case, this would allow for efficient location of quadratic non-residues, which is not known.

## Observations and strategies

1. For the conjecture with factoring, it would suffice to show that every polylog(N)-length interval in [N,2N] contains a number which is not $N^\varepsilon$-smooth (or not $\log^{O(1)} N$-smooth, if one only wants to solve the weak conjecture with factoring). One way to start approaching this sort of problem is to take iterated product sets of sets of medium-sized primes and show that they spread out to fill all short intervals in [N,2N].
1. One can combine this with the "W-trick". For instance, if one lets W be the product of all the primes less than k, and if one can show that not every element of the progression $\{ Wn+1: 1 \leq n \leq k^{100} \}$ (say) is $k^A$-smooth, then one has found a prime with at least $A \log k$ digits in polynomial time. If one can make A arbitrarily large, one has solved the weak conjecture with factoring.
2. If one can find explicitly enumerable sets of k-digit numbers of size $O(10^{\varepsilon k})$ for any fixed $\varepsilon \gt 0$ that are guaranteed to contain a prime, then one has solved the weak conjecture. If instead one can find randomly samplable sets of k-digit numbers of size $O(10^{\varepsilon k})$ that contain a large fraction (e.g. $\exp(-o(k))$) of primes, one has solved the semi-weak conjecture. If one can find explicit sets of poly(k) size that are guaranteed to contain a prime, one has solved the strong conjecture.
3. If a primality test can be expressed as a constant-degree polynomial over a finite field from the digits of the number-under-test to {0,1}, then known pseudorandom generators for polynomials can be used to derandomize finding primes the same way as under the full P=BPP assumption. So, instead of requiring a general PRG, the proposal is trying to put the primality test into a fully derandomizable subclass of P.
4. If one can find a sequence of easily constructible, mutually coprime integers whose prime factors are contained in a sparse subset of the primes, this generates (with the aid of a factoring oracle) a sparse set of primes. So far the sparsest set of this type has come from factoring the Fermat numbers.
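A toy version of the W-trick above, with trial division standing in for the factoring oracle (names and parameters are ours). The point of the construction is that every prime factor of $Wn+1$ is automatically at least k, so any factor the oracle returns is already a somewhat large prime.

```python
def largest_prime_factor(n: int) -> int:
    """Trial division: a slow stand-in for the factoring oracle."""
    d, largest = 2, 1
    while d * d <= n:
        while n % d == 0:
            largest, n = d, n // d
        d += 1
    return n if n > 1 else largest

def w_trick_search(k: int, bound: int) -> int:
    """W is the product of the primes below k, so W*n + 1 is coprime to
    every prime below k. Returns the largest prime factor seen among
    n = 1..bound; `bound` plays the role of k**100 in the text."""
    W = 1
    for m in range(2, k):
        if largest_prime_factor(m) == m:  # m is prime
            W *= m
    return max(largest_prime_factor(W * n + 1) for n in range(1, bound + 1))
```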

## Negative results and other obstructions

1. Using sieve theory to find a non-smooth number in a short interval runs into the difficulty that most sieve-theoretic methods are insensitive to the starting point of the interval. Since every integer in [1,n] is n-smooth, this suggests that sieve theory is unlikely to establish the existence of a non-n-smooth number in any interval of length n, and in particular unlikely to find a non-$k^{O(1)}$-smooth integer in any interval that can be searched in polynomial time. Indeed, there are heuristic arguments suggesting that this method is unlikely to work even when augmented with basic Fourier analysis and assuming RH, although these arguments have not yet been fully formalised.
2. Average-case results for the primes cannot be used directly, because one could delete all constructible integers from the set of primes (a negligible set) without significantly impacting the average case results, while causing the conjecture to fail for this modified set of primes.
3. It is not possible to prove the conjecture purely by "soft" (i.e. relativisable) complexity-theoretic methods which do not use any information about the primes other than their density. Details are at Oracle counterexample to finding pseudoprimes.

## How could the conjecture be false?

A world in which the conjecture is false would be very strange, especially if it fails even with a factoring oracle, or if the weak conjecture also fails. In particular, the following would have to be true:

1. All numbers with at least k digits that are "constructible" in the sense that they are the output of a Turing machine program of length $O(\log k)$ that runs in time $k^{O(1)}$, are composite. [If the conjecture fails with factoring, then these numbers are furthermore $\exp(\varepsilon k)$ smooth for any fixed $\varepsilon \gt 0$; if even the weak conjecture with factoring fails, they are $k^{O(1)}$-smooth.]
2. In fact, all constructible numbers are not only composite/smooth, but they sit inside intervals of length $k^{100}$, all of whose elements are composite/smooth. (One can replace 100 by any other fixed number.)
1. In particular, Cramér's conjecture fails; the gaps between adjacent k-digit primes exceed $k^{100}$ around every constructible number.
3. No pseudorandom number generators exist. In particular, this implies that one-way functions do not exist.
4. $P \neq NP$. More specifically, the decision problem "Does there exist a prime in a given interval?" is in NP, but cannot lie in P if the conjecture fails.

## Relevant concepts

Complexity theory concepts:

1. Complexity classes
2. Pseudo-random generators (PRG)
3. Full derandomization
4. One-way functions
5. Impagliazzo's five worlds: Algorithmica, Heuristica, Pessiland, Minicrypt, Cryptomania
6. Kolmogorov complexity
7. Oracles

Number theory concepts:

Other:

Note: articles with external links should eventually be replaced by in-house pages that can provide information that is more dedicated to this project.

## Relevant papers

1. L. Adleman, H. Lenstra, "Finding irreducible polynomials over finite fields"
2. R. C. Baker and G. Harman, “The difference between consecutive primes,” Proc. Lond. Math. Soc., series 3, 72 (1996) 261–280.
3. R. C. Baker, G. Harman and J. Pintz, The difference between consecutive primes, II, Proceedings of the London Mathematical Society 83, (2001), 532–562.
4. A. Balog, T. Wooley, “On strings of consecutive integers with no large prime factors”, J. Austral. Math. Soc. Ser. A 64 (1998), no. 2, 266–276
5. P. Dusart, Autour de la fonction qui compte le nombre de nombres premiers
6. O. Goldreich, A. Wigderson, Derandomization that is rarely wrong from short advice that is typically good
7. A. Granville, "Harald Cramér and the distribution of prime numbers", Scandinavian Actuarial J. 1 (1995), 12—28.
8. D.R. Heath-Brown, The square sieve and consecutive square-free numbers, Math. Ann. 266 (1984), 251-259
9. R. Impagliazzo, A. Wigderson, P = BPP unless E has sub-exponential circuits. In Proceedings of the 29th ACM Symposium on Theory of Computing, pages 220–229, 1997.
10. R. Jones, The density of prime divisors in the arithmetic dynamics of quadratic polynomials
11. Křížek, Luca and Somer, On the convergence of series of reciprocals of primes related to the Fermat numbers, J. Number Theory 97(2002), 95–112
12. U. Maurer, Fast Generation of Prime Numbers and Secure Public-Key Cryptographic Parameters, Journal of Cryptology 8 (1994), 123-155
13. R. W. K. Odoni, On the prime divisors of the sequence $w_{n+1} = 1 + w_1 \cdots w_n$, J. London Math. Soc. (2) 32 (1985), no. 1, 1–11.
14. K. Soundararajan, The distribution of the primes (survey)
15. K. Soundararajan, Small gaps between prime numbers: The work of Goldston-Pintz-Yildirim
16. L. Trevisan, Pseudorandomness and Combinatorial Constructions