Finding narrow admissible tuples

Revision as of 10:34, 11 June 2013

For any natural number [math]\displaystyle{ k_0 }[/math], an admissible [math]\displaystyle{ k_0 }[/math]-tuple is a finite set [math]\displaystyle{ {\mathcal H} }[/math] of integers of cardinality [math]\displaystyle{ k_0 }[/math] which avoids at least one residue class modulo [math]\displaystyle{ p }[/math] for each prime [math]\displaystyle{ p }[/math]. (Note that one only needs to check those primes [math]\displaystyle{ p }[/math] of size at most [math]\displaystyle{ k_0 }[/math], so this is a finitely checkable condition.) Let [math]\displaystyle{ H(k_0) }[/math] denote the minimal diameter [math]\displaystyle{ \max {\mathcal H} - \min {\mathcal H} }[/math] of an admissible [math]\displaystyle{ k_0 }[/math]-tuple. As part of the Polymath8 project, we would like to find as good an upper bound on [math]\displaystyle{ H(k_0) }[/math] as possible for given values of [math]\displaystyle{ k_0 }[/math]. To a lesser extent, we would also be interested in lower bounds on this quantity. There is some scattered numerical evidence that the optimal value of H is roughly of size [math]\displaystyle{ k_0 \log k_0 + k_0 }[/math] for [math]\displaystyle{ k_0 }[/math] in the range of interest.
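The definition above translates directly into a finite check. The following is a minimal Python sketch; the helper names `primes_up_to`, `is_admissible` and `diameter` are illustrative choices and do not come from the Polymath8 code base.

```python
def primes_up_to(n):
    """Return the list of primes <= n (simple sieve of Eratosthenes)."""
    if n < 2:
        return []
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, n + 1, i):
                flags[j] = False
    return [i for i, f in enumerate(flags) if f]


def is_admissible(h):
    """True if h avoids at least one residue class mod p for every prime p."""
    h = set(h)
    # A set of k integers cannot occupy all p classes when p > k,
    # so only primes p <= k need to be tested.
    for p in primes_up_to(len(h)):
        if len({x % p for x in h}) == p:
            return False
    return True


def diameter(h):
    """Diameter max(h) - min(h) of a tuple."""
    return max(h) - min(h)
```

For instance, `is_admissible([0, 2, 6])` holds, while `[0, 2, 4]` occupies every class modulo 3 and is rejected.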

Upper bounds

Upper bounds are primarily constructed through various "sieves" that delete one residue class modulo [math]\displaystyle{ p }[/math] from an interval for many primes [math]\displaystyle{ p }[/math]. Examples of sieves, in roughly increasing order of efficiency, are listed below.

Zhang sieve

The Zhang sieve uses the tuple

[math]\displaystyle{ {\mathcal H} = \{p_{m+1}, \ldots, p_{m+k_0}\} }[/math]

where [math]\displaystyle{ m }[/math] is taken to optimize the diameter [math]\displaystyle{ p_{m+k_0}-p_{m+1} }[/math] while staying admissible (in practice, this basically means making [math]\displaystyle{ m }[/math] as small as possible). Certainly any [math]\displaystyle{ m }[/math] with [math]\displaystyle{ p_{m+1} \gt k_0 }[/math] works, but this is not optimal. Applying the prime number theorem then gives the upper bound [math]\displaystyle{ H \leq (1+o(1)) k_0\log k_0 }[/math].
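As a rough illustration, the search over [math]\displaystyle{ m }[/math] can be sketched in Python as follows. The function names are hypothetical, and the search range (all [math]\displaystyle{ m }[/math] up to the number of primes not exceeding [math]\displaystyle{ k_0 }[/math], where admissibility is guaranteed) is an assumption of this sketch rather than a prescription from the text.

```python
def primes_up_to(n):
    """Return the list of primes <= n (simple sieve of Eratosthenes)."""
    if n < 2:
        return []
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, n + 1, i):
                flags[j] = False
    return [i for i, f in enumerate(flags) if f]


def first_primes(count):
    """Return the first `count` primes, doubling the sieve bound as needed."""
    limit = 16
    while True:
        ps = primes_up_to(limit)
        if len(ps) >= count:
            return ps[:count]
        limit *= 2


def is_admissible(h):
    """True if h misses a residue class mod p for every prime p <= len(h)."""
    return all(len({x % p for x in h}) < p for p in primes_up_to(len(h)))


def zhang_tuple(k0):
    """Return (diameter, m, tuple) for the narrowest admissible window
    p_{m+1}, ..., p_{m+k0} with 0 <= m <= pi(k0)."""
    m_max = len(primes_up_to(k0))      # p_{m_max+1} > k0, so this m always works
    ps = first_primes(m_max + k0)      # ps[j] is the (j+1)-th prime
    best = None
    for m in range(m_max + 1):
        window = ps[m:m + k0]
        if is_admissible(window):
            d = window[-1] - window[0]
            if best is None or d < best[0]:
                best = (d, m, window)
    return best
```

For [math]\displaystyle{ k_0 = 3 }[/math] the windows starting at 2 and at 3 are inadmissible, and the sketch returns the tuple 5, 7, 11 of diameter 6.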

Hensley-Richards sieve

The Hensley-Richards sieve [HR1973], [HR1973b], [R1974] uses the tuple

[math]\displaystyle{ {\mathcal H} = \{-p_{m+\lfloor k_0/2\rfloor - 1}, \ldots, -p_{m+1}, -1, +1, p_{m+1},\ldots, p_{m+\lfloor k_0/2+1/2\rfloor-1}\} }[/math]

where [math]\displaystyle{ m }[/math] is again optimised to minimize the diameter while staying admissible.

Asymmetric Hensley-Richards sieve

The asymmetric Hensley-Richards sieve uses the tuple

[math]\displaystyle{ {\mathcal H} = \{-p_{m+\lfloor k_0/2\rfloor - 1-i}, \ldots, -p_{m+1}, -1, +1, p_{m+1},\ldots, p_{m+\lfloor k_0/2+1/2\rfloor-1+i}\} }[/math]

where [math]\displaystyle{ i }[/math] is an integer and [math]\displaystyle{ i,m }[/math] are optimised to minimize the diameter while staying admissible.
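Both the symmetric tuple (the case [math]\displaystyle{ i=0 }[/math]) and its asymmetric variant can be built by the same routine. The Python sketch below is illustrative only: the names `hr_tuple` and `best_hr` are hypothetical, and the search range for [math]\displaystyle{ m }[/math] is an assumption of the sketch.

```python
def primes_up_to(n):
    """Return the list of primes <= n (simple sieve of Eratosthenes)."""
    if n < 2:
        return []
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, n + 1, i):
                flags[j] = False
    return [i for i, f in enumerate(flags) if f]


def first_primes(count):
    """Return the first `count` primes, doubling the sieve bound as needed."""
    limit = 16
    while True:
        ps = primes_up_to(limit)
        if len(ps) >= count:
            return ps[:count]
        limit *= 2


def is_admissible(h):
    """True if h misses a residue class mod p for every prime p <= len(h)."""
    return all(len({x % p for x in h}) < p for p in primes_up_to(len(h)))


def hr_tuple(primes, k0, m, i=0):
    """Build the (asymmetric) Hensley-Richards tuple; primes[j] is p_{j+1}.
    Returns None when the offset i leaves a negative number of primes on a side."""
    a = k0 // 2 - 1 - i            # primes taken with a minus sign
    b = (k0 + 1) // 2 - 1 + i      # primes taken with a plus sign
    if a < 0 or b < 0:
        return None
    return sorted([-p for p in primes[m:m + a]] + [-1, 1] + primes[m:m + b])


def best_hr(k0, i_values=(0,)):
    """Return (diameter, m, i, tuple) minimising the diameter over
    0 <= m <= pi(k0) and the given offsets i."""
    m_max = len(primes_up_to(k0))  # all primes used are then > k0: admissible
    primes = first_primes(m_max + k0)
    best = None
    for i in i_values:
        for m in range(m_max + 1):
            h = hr_tuple(primes, k0, m, i)
            if h is not None and is_admissible(h):
                d = h[-1] - h[0]
                if best is None or d < best[0]:
                    best = (d, m, i, h)
    return best
```

For [math]\displaystyle{ k_0 = 4 }[/math], the symmetric search returns the tuple -5, -1, 1, 5 of diameter 10, while allowing [math]\displaystyle{ i \in \{-1,0,1\} }[/math] reduces the diameter to 8.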

Schinzel sieve

Given [math]\displaystyle{ 0\lt y\lt z }[/math], the Schinzel sieve (discussed in [HR1973], [CJ2001]) first sieves by [math]\displaystyle{ 1\bmod p }[/math] for primes [math]\displaystyle{ p \le y }[/math] and by [math]\displaystyle{ 0\bmod p }[/math] for primes [math]\displaystyle{ y \lt p \le z }[/math]. For a given choice of [math]\displaystyle{ y }[/math], the parameter [math]\displaystyle{ z }[/math] is minimized subject to ensuring that the first [math]\displaystyle{ k_0 }[/math] survivors (after the first) form an admissible sequence [math]\displaystyle{ \mathcal{H} }[/math], so the only free parameter is [math]\displaystyle{ y }[/math], which is chosen to minimize the diameter of [math]\displaystyle{ \mathcal{H} }[/math]. The case [math]\displaystyle{ y=1 }[/math] corresponds to a sieve of Eratosthenes, which will typically yield the same sequence as Zhang with the minimal (but not necessarily optimal) value of [math]\displaystyle{ m }[/math] that yields an admissible [math]\displaystyle{ k_0 }[/math]-tuple. As originally proposed, the Schinzel sieve works over the positive integers, but one can instead sieve intervals centered about the origin, or asymmetric intervals, as with the Hensley-Richards sieve.
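A minimal Python sketch of the positive-integer version, for a fixed [math]\displaystyle{ y }[/math], might look as follows. It reflects one reading of the description above; the function name `schinzel_tuple`, the initial interval length, and the choice to start the search for [math]\displaystyle{ z }[/math] at [math]\displaystyle{ y }[/math] are all assumptions of the sketch.

```python
def primes_up_to(n):
    """Return the list of primes <= n (simple sieve of Eratosthenes)."""
    if n < 2:
        return []
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if flags[i]:
            for j in range(i * i, n + 1, i):
                flags[j] = False
    return [i for i, f in enumerate(flags) if f]


def next_prime(n):
    """Smallest prime > n (trial division; adequate for small inputs)."""
    c = max(n + 1, 2)
    while any(c % d == 0 for d in range(2, int(c ** 0.5) + 1)):
        c += 1
    return c


def is_admissible(h):
    """True if h misses a residue class mod p for every prime p <= len(h)."""
    return all(len({x % p for x in h}) < p for p in primes_up_to(len(h)))


def schinzel_tuple(k0, y):
    """Return (z, tuple): the least z found such that, after sieving [1, x]
    by 1 mod p for p <= y and 0 mod p for y < p <= z, the k0 survivors
    following the first survivor form an admissible tuple."""
    x = 4 * k0 + 16                    # initial interval length (arbitrary guess)
    z = y
    while True:
        alive = [True] * (x + 1)
        for p in primes_up_to(z):
            start = 1 if p <= y else p  # class 1 mod p, or class 0 mod p
            for n in range(start, x + 1, p):
                alive[n] = False
        survivors = [n for n in range(1, x + 1) if alive[n]]
        if len(survivors) < k0 + 1:
            x *= 2                      # interval too short: widen and retry
            continue
        h = survivors[1:k0 + 1]
        if is_admissible(h):
            return z, h
        z = next_prime(z)
```

One would then call `schinzel_tuple(k0, y)` for several values of `y` and keep the narrowest result. With `y = 1` and `k0 = 3` the sketch reproduces the Zhang tuple 5, 7, 11.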

Greedy sieve

For a given interval (e.g., [math]\displaystyle{ [1,x] }[/math], [math]\displaystyle{ [-x,x] }[/math], or asymmetric [math]\displaystyle{ [x_0,x_1] }[/math]) one sieves a single residue class [math]\displaystyle{ a \bmod p }[/math] for increasing primes [math]\displaystyle{ p=2,3,5,\ldots }[/math], with [math]\displaystyle{ a }[/math] chosen to maximize the number of survivors. Ties can be broken in a number of ways: minimize [math]\displaystyle{ a\in[0,p-1] }[/math], maximize [math]\displaystyle{ a\in [0,p-1] }[/math], minimize [math]\displaystyle{ |a-\lfloor p/2\rfloor| }[/math], or randomly. If not all residue classes modulo [math]\displaystyle{ p }[/math] are occupied by survivors, then [math]\displaystyle{ a }[/math] will be chosen so that no survivors are sieved. This necessarily occurs once [math]\displaystyle{ p }[/math] exceeds the number of survivors but typically happens much sooner. One then chooses the narrowest [math]\displaystyle{ k_0 }[/math]-tuple [math]\displaystyle{ {\mathcal H} }[/math] among the survivors (if there are fewer than [math]\displaystyle{ k_0 }[/math] survivors, retry with a wider interval).
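The procedure above can be sketched in Python as follows, using the "minimize [math]\displaystyle{ a }[/math]" tie-breaking rule. The function names are hypothetical, and the retry with a wider interval is left to the caller.

```python
from collections import Counter


def next_prime(n):
    """Smallest prime > n (trial division; adequate for small inputs)."""
    c = max(n + 1, 2)
    while any(c % d == 0 for d in range(2, int(c ** 0.5) + 1)):
        c += 1
    return c


def greedy_sieve(x0, x1, k0):
    """Greedily sieve the integers in [x0, x1] and return the narrowest
    k0-tuple among the survivors, or None if fewer than k0 survive."""
    survivors = list(range(x0, x1 + 1))
    p = 2
    # Once p exceeds the number of survivors some class mod p is empty,
    # so no further sieving can be required.
    while p <= len(survivors):
        counts = Counter(h % p for h in survivors)
        # Class with the fewest survivors; ties go to the smallest a.
        a = min(range(p), key=lambda r: (counts[r], r))
        if counts[a]:
            survivors = [h for h in survivors if h % p != a]
        p = next_prime(p)
    if len(survivors) < k0:
        return None
    # Any subset of an admissible set is admissible, so it suffices to
    # take the narrowest run of k0 consecutive survivors.
    j = min(range(len(survivors) - k0 + 1),
            key=lambda s: survivors[s + k0 - 1] - survivors[s])
    return survivors[j:j + k0]
```

On the interval [math]\displaystyle{ [1,10] }[/math] the sieve removes the even numbers, then the class [math]\displaystyle{ 2 \bmod 3 }[/math], leaving the survivors 1, 3, 7, 9, from which the 3-tuple 1, 3, 7 is selected.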

Greedy-Schinzel sieve

Heuristically, the performance of the greedy sieve is significantly improved by starting with a Schinzel sieve with [math]\displaystyle{ y=2 }[/math] and [math]\displaystyle{ z=\sqrt{x_1-x_0} }[/math] and then continuing in a greedy fashion. This method was proposed by Sutherland and originally referred to as a "greedy-greedy" approach. This nomenclature arose from the fact that one optimization that can be applied to the standard Schinzel sieve on a given interval is to "greedily" avoid sieving modulo primes where the set of survivors is already admissible (this may occur for primes less than the minimal value of [math]\displaystyle{ z }[/math] that yields [math]\displaystyle{ k_0 }[/math] survivors), while a second optimization is to use a value of [math]\displaystyle{ z }[/math] that is intentionally smaller than necessary and switch to greedy sieving for primes greater than [math]\displaystyle{ z }[/math]. With the choice [math]\displaystyle{ z=\sqrt{x_1-x_0} }[/math], unless the initial interval is much larger than necessary, all primes up to [math]\displaystyle{ z }[/math] will require a residue class to be sieved and the first "greedy" seldom applies.

Seeded greedy sieve

Given an initial sequence [math]\displaystyle{ {\mathcal S} }[/math] that is known to contain an admissible [math]\displaystyle{ k_0 }[/math]-tuple, one can apply greedy sieving to the minimal interval containing [math]\displaystyle{ {\mathcal S} }[/math] until an admissible sequence of survivors remains, and then choose the narrowest [math]\displaystyle{ k_0 }[/math]-tuple it contains. The sieving methods above can be viewed as the special case where [math]\displaystyle{ {\mathcal S} }[/math] is the set of integers in some interval. The main difference is that the choice of [math]\displaystyle{ {\mathcal S} }[/math] affects when ties occur and how they are broken with greedy sieving. One approach is to take [math]\displaystyle{ {\mathcal S} }[/math] to be the union of two [math]\displaystyle{ k_0 }[/math]-tuples that lie in roughly the same interval (see Iterated merging below).

Iterated merging

Given an admissible [math]\displaystyle{ k_0 }[/math]-tuple [math]\displaystyle{ \mathcal{H}_1 }[/math], one can attempt to improve it using an iterated merging approach suggested by Castryck. One first uses a greedy (or greedy-Schinzel) sieve to construct an admissible [math]\displaystyle{ k_0 }[/math]-tuple [math]\displaystyle{ \mathcal{H}_2 }[/math] in roughly the same interval as [math]\displaystyle{ \mathcal{H}_1 }[/math], then performs a randomized greedy sieve using the seed set [math]\displaystyle{ \mathcal{S} = \mathcal{H}_1 \cup \mathcal{H}_2 }[/math] to obtain an admissible [math]\displaystyle{ k_0 }[/math]-tuple [math]\displaystyle{ \mathcal{H}_3 }[/math]. If [math]\displaystyle{ \mathcal{H}_3 }[/math] is narrower than [math]\displaystyle{ \mathcal{H}_2 }[/math], replace [math]\displaystyle{ \mathcal{H}_2 }[/math] with [math]\displaystyle{ \mathcal{H}_3 }[/math], otherwise try again with a new [math]\displaystyle{ \mathcal{H}_3 }[/math]. Eventually the diameter of [math]\displaystyle{ \mathcal{H}_2 }[/math] will become less than or equal to that of [math]\displaystyle{ \mathcal{H}_1 }[/math]. As long as [math]\displaystyle{ \mathcal{H}_1\ne \mathcal{H}_2 }[/math], one can continue to attempt to improve [math]\displaystyle{ \mathcal{H}_2 }[/math], but in practice one stops after some number of retries.

As described by Sutherland, one can then replace [math]\displaystyle{ \mathcal{H}_1 }[/math] with [math]\displaystyle{ \mathcal{H}_2 }[/math] and begin the process anew, yielding a randomized algorithm that can be run indefinitely. Key parameters to this algorithm are the choice of the interval used when constructing [math]\displaystyle{ \mathcal{H}_2 }[/math], which is typically made wider than the minimal interval containing [math]\displaystyle{ \mathcal{H}_1 }[/math] by a small factor [math]\displaystyle{ \delta }[/math] on each side (Sutherland suggests [math]\displaystyle{ \delta = 0.0025 }[/math]), and the number of failed attempts allowed while attempting to improve [math]\displaystyle{ \mathcal{H}_2 }[/math].

Eventually this process will tend to converge to a particular [math]\displaystyle{ \mathcal{H}_1 }[/math] that it cannot improve (or more generally, a set of similar [math]\displaystyle{ \mathcal{H}_1 }[/math]'s with the same diameter). Interleaving iterated merging with the local optimizations described below often allows the algorithm to make further progress.

Iterated merging can be viewed as a form of simulated annealing. The set [math]\displaystyle{ \mathcal{S} }[/math] initially contains at least two admissible [math]\displaystyle{ k_0 }[/math]-tuples (typically many more), and as the algorithm proceeds the set [math]\displaystyle{ \mathcal{S} }[/math] converges toward [math]\displaystyle{ \mathcal{H}_1 }[/math] and the number of admissible [math]\displaystyle{ k_0 }[/math]-tuples it contains declines. One can regard the cardinality of the difference between [math]\displaystyle{ \mathcal{S} }[/math] and [math]\displaystyle{ \mathcal{H}_1 }[/math] as a measure of the "temperature" of a gradually cooling system, since the number of choices available to the algorithm declines as this cardinality is reduced (more precisely, one may consider the entropy of the possible sequence of tie-breaking choices available for a given [math]\displaystyle{ \mathcal{S} }[/math]).

Local optimizations

Let [math]\displaystyle{ \mathcal H = \{h_1,\ldots, h_{k_0}\} }[/math] be an admissible [math]\displaystyle{ k_0 }[/math]-tuple with endpoints [math]\displaystyle{ h_1 }[/math] and [math]\displaystyle{ h_{k_0} }[/math], and let [math]\displaystyle{ \mathcal I }[/math] be the interval [math]\displaystyle{ [h_1,h_{k_0}] }[/math]. If there exists an integer [math]\displaystyle{ h\in\mathcal I }[/math] such that removing one of [math]\displaystyle{ \mathcal H }[/math]'s endpoints and inserting [math]\displaystyle{ h }[/math] yields an admissible [math]\displaystyle{ k_0 }[/math]-tuple [math]\displaystyle{ \mathcal H' }[/math], then call [math]\displaystyle{ \mathcal H }[/math] contractible, and if not, say that [math]\displaystyle{ \mathcal H }[/math] is non-contractible. Note that [math]\displaystyle{ \mathcal H' }[/math] necessarily has smaller diameter than [math]\displaystyle{ \mathcal H }[/math]. Any of the sieving methods described above may produce admissible [math]\displaystyle{ k_0 }[/math]-tuples that are contractible, so it is worth testing for contractibility as a post-processing step after sieving and replacing [math]\displaystyle{ \mathcal H }[/math] by [math]\displaystyle{ \mathcal H' }[/math] if this test succeeds.
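The contractibility test can be sketched as a brute-force search in Python. The name `contract_once` is hypothetical, and the sketch only considers integers strictly inside the interval that are not already in the tuple, since inserting an existing element or the removed endpoint would not give a new [math]\displaystyle{ k_0 }[/math]-tuple.

```python
def is_admissible(h):
    """True if h misses a residue class mod p for every prime p <= len(h)."""
    for p in range(2, len(h) + 1):
        is_prime = all(p % d for d in range(2, int(p ** 0.5) + 1))
        if is_prime and len({x % p for x in h}) == p:
            return False
    return True


def contract_once(h):
    """Return a narrower admissible tuple obtained by dropping one endpoint
    of h and inserting an interior integer, or None if h is non-contractible."""
    h = sorted(h)
    members = set(h)
    for drop in (h[0], h[-1]):
        rest = [x for x in h if x != drop]
        for cand in range(h[0] + 1, h[-1]):
            if cand in members:
                continue
            new = sorted(rest + [cand])
            if is_admissible(new):
                return new
    return None
```

For example, the admissible tuple 0, 2, 8 is contractible (dropping 0 and inserting 4 gives 2, 4, 8), whereas 0, 2, 6 is not. Repeating `contract_once` until it returns `None` yields a non-contractible tuple.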

We can also shift [math]\displaystyle{ \mathcal H }[/math] to the left by removing its right end point [math]\displaystyle{ h_{k_0} }[/math] and replacing it with the greatest integer [math]\displaystyle{ h_0 \lt h_1 }[/math] that yields an admissible [math]\displaystyle{ k_0 }[/math]-tuple [math]\displaystyle{ \mathcal H' }[/math], and we can similarly shift [math]\displaystyle{ \mathcal H }[/math] to the right. The diameter of [math]\displaystyle{ \mathcal H' }[/math] need not be less than [math]\displaystyle{ \mathcal H }[/math], but if it is, it provides a useful replacement. More generally, by shifting [math]\displaystyle{ \mathcal H }[/math] repeatedly we can produce a sequence of admissible [math]\displaystyle{ k_0 }[/math]-tuples that lie successively further to the left or right. In general the diameter of these tuples may grow as we do so, but it will also occasionally decline, and we may be able to find a shifted [math]\displaystyle{ \mathcal H' }[/math] with smaller diameter than [math]\displaystyle{ \mathcal H }[/math].
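A single left shift can be sketched in Python as follows (the right shift is symmetric). The name `shift_left` is hypothetical. The loop terminates because, by the Chinese remainder theorem, suitable replacement points occur arbitrarily far to the left.

```python
def is_admissible(h):
    """True if h misses a residue class mod p for every prime p <= len(h)."""
    for p in range(2, len(h) + 1):
        is_prime = all(p % d for d in range(2, int(p ** 0.5) + 1))
        if is_prime and len({x % p for x in h}) == p:
            return False
    return True


def shift_left(h):
    """Remove the right endpoint of h and prepend the greatest integer
    below the left endpoint that keeps the tuple admissible."""
    h = sorted(h)
    rest = h[:-1]
    cand = h[0] - 1
    while not is_admissible([cand] + rest):
        cand -= 1
    return [cand] + rest
```

Shifting 0, 2, 6 to the left gives -4, 0, 2, which happens to have the same diameter. In general one would compare diameters over a sequence of repeated shifts and keep the narrowest tuple seen.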

A more sophisticated local optimization involves a process of "adjustment" proposed by Savitt. Let [math]\displaystyle{ \mathcal H }[/math] be an admissible [math]\displaystyle{ k_0 }[/math]-tuple. For a prime [math]\displaystyle{ p }[/math] and an integer [math]\displaystyle{ a }[/math], let [math]\displaystyle{ [a;p] }[/math] denote the residue class [math]\displaystyle{ a\bmod p }[/math], i.e. the set of integers [math]\displaystyle{ \{ x : x \equiv a \bmod p\} }[/math]. Call [math]\displaystyle{ [a;p] }[/math] occupied if it contains an element of [math]\displaystyle{ \mathcal H }[/math].

Suppose that [math]\displaystyle{ [a;p] }[/math] and [math]\displaystyle{ [b;q] }[/math] are occupied residue classes, for some distinct primes [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math], and that [math]\displaystyle{ [a';p] }[/math] and [math]\displaystyle{ [b';q] }[/math] are unoccupied. Let [math]\displaystyle{ \mathcal U }[/math] be the intersection of [math]\displaystyle{ \mathcal H }[/math] with [math]\displaystyle{ [a;p] \cup [b;q] }[/math], and let [math]\displaystyle{ \mathcal V }[/math] be a subset of the integers that lie in the intersection of the interval [math]\displaystyle{ \mathcal I }[/math] containing [math]\displaystyle{ \mathcal H }[/math] and the set [math]\displaystyle{ [a';p] \cup [b';q] }[/math] such that the set [math]\displaystyle{ \mathcal H' }[/math] formed by removing the elements of [math]\displaystyle{ \mathcal U }[/math] from [math]\displaystyle{ \mathcal H }[/math] and adding the elements of [math]\displaystyle{ \mathcal V }[/math] is admissible. A necessary (and often sufficient) condition for an integer [math]\displaystyle{ v }[/math] to lie in [math]\displaystyle{ \mathcal V }[/math] is that [math]\displaystyle{ v }[/math] must not lie in a residue class [math]\displaystyle{ [c;r] }[/math] that is the unique unoccupied residue class modulo [math]\displaystyle{ r }[/math] for any prime [math]\displaystyle{ r }[/math] other than [math]\displaystyle{ p }[/math] or [math]\displaystyle{ q }[/math].

The admissible set [math]\displaystyle{ \mathcal H' }[/math] lies in the interval [math]\displaystyle{ \mathcal I }[/math] containing [math]\displaystyle{ \mathcal H }[/math], so its diameter is no greater than that of [math]\displaystyle{ \mathcal H }[/math]; its cardinality, however, may differ. If it happens that [math]\displaystyle{ \mathcal H' }[/math] contains more elements than [math]\displaystyle{ \mathcal H }[/math], then by eliminating points at either end of [math]\displaystyle{ \mathcal H' }[/math] we obtain an admissible [math]\displaystyle{ k_0 }[/math]-tuple that is narrower than [math]\displaystyle{ \mathcal H }[/math], and we may "adjust" [math]\displaystyle{ \mathcal H }[/math] by replacing it with this narrower tuple. The process of adjustment can often be applied repeatedly, yielding a sequence of successively narrower admissible [math]\displaystyle{ k_0 }[/math]-tuples.

Further refinements

Lower bounds

There is a substantial amount of literature on bounding the quantity [math]\displaystyle{ \pi(x+y)-\pi(x) }[/math], the number of primes in a shifted interval [math]\displaystyle{ [x+1,x+y] }[/math], where [math]\displaystyle{ x,y }[/math] are natural numbers. As a general rule, whenever a bound of the form

[math]\displaystyle{ \pi(x+y) - \pi(x) \leq F(y) }[/math] (*)

is established for some function [math]\displaystyle{ F(y) }[/math] of [math]\displaystyle{ y }[/math], the method of proof also gives a bound of the form

[math]\displaystyle{ k_0 \leq F( H(k_0)+1 ). }[/math] (**)

Indeed, if one assumes the prime tuples conjecture, any admissible [math]\displaystyle{ k_0 }[/math]-tuple of diameter [math]\displaystyle{ H }[/math] can be translated into an interval of the form [math]\displaystyle{ [x+1,x+H+1] }[/math] for some [math]\displaystyle{ x }[/math]. In the opposite direction, all known bounds of the form (*) proceed by using the fact that for [math]\displaystyle{ x\gt y }[/math], the set of primes between [math]\displaystyle{ x+1 }[/math] and [math]\displaystyle{ x+y }[/math] is admissible, so the method of proof of (*) invariably gives (**) as well.

Examples of lower bounds are as follows:

Brun-Titchmarsh inequality

The Brun-Titchmarsh theorem gives

[math]\displaystyle{ \pi(x+y) - \pi(x) \leq (1 + o(1)) \frac{2y}{\log y} }[/math]

which then gives the lower bound

[math]\displaystyle{ H(k_0) \geq (\frac{1}{2}-o(1)) k_0 \log k_0 }[/math].

Montgomery and Vaughan deleted the o(1) error from the Brun-Titchmarsh theorem [MV1973, Corollary 2], giving the more precise inequality

[math]\displaystyle{ k_0 \leq 2 \frac{H(k_0)+1}{\log (H(k_0)+1)}. }[/math]
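To turn the Montgomery-Vaughan form of the inequality into a numerical lower bound, one can search for the least integer [math]\displaystyle{ H }[/math] that satisfies it. The Python sketch below does this by bisection; the function name is hypothetical, and floating-point logarithms are assumed to be accurate enough for the sizes of interest.

```python
import math


def brun_titchmarsh_lower_bound(k0):
    """Least integer H >= 2 with k0 <= 2 (H + 1) / log(H + 1).

    The map t -> t / log t is increasing for t > e, so the predicate
    below is monotone in H for H >= 2 and bisection applies."""
    def ok(h):
        return k0 * math.log(h + 1) <= 2 * (h + 1)

    lo = 2
    if ok(lo):
        return lo
    hi = 4
    while not ok(hi):          # grow until the inequality holds
        lo, hi = hi, 2 * hi
    while hi - lo > 1:         # invariant: ok(hi) and not ok(lo)
        mid = (lo + hi) // 2
        if ok(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

Any admissible [math]\displaystyle{ k_0 }[/math]-tuple must then have diameter at least the returned value. For [math]\displaystyle{ k_0 = 34{,}429 }[/math] the result should be close to the Brun-Titchmarsh entry in the benchmark table.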

First Montgomery-Vaughan large sieve inequality

The first Montgomery-Vaughan large sieve inequality [MV1973, Theorem 1] gives

[math]\displaystyle{ k_0 (\sum_{q \leq Q} \frac{\mu^2(q)}{\phi(q)}) \leq H(k_0)+1 + Q^2 }[/math]

for any [math]\displaystyle{ Q \gt 1 }[/math], which is a parameter that one can optimise over (the optimal value is comparable to [math]\displaystyle{ H(k_0)^{1/2} }[/math]).
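Rearranged, the inequality reads [math]\displaystyle{ H(k_0) \geq k_0 \sum_{q \leq Q} \mu^2(q)/\phi(q) - 1 - Q^2 }[/math], and since the sum only changes at integers it suffices to try integer [math]\displaystyle{ Q }[/math]. The Python sketch below evaluates this for a range of [math]\displaystyle{ Q }[/math]; the function names and the default search range are assumptions of the sketch, and any [math]\displaystyle{ Q }[/math] gives a valid (if not optimal) bound.

```python
import math


def mu2_over_phi_prefix(qmax):
    """prefix[Q] = sum over squarefree q <= Q of 1 / phi(q)."""
    phi = list(range(qmax + 1))
    squarefree = [True] * (qmax + 1)
    for p in range(2, qmax + 1):
        if phi[p] == p:                       # p untouched so far: p is prime
            for n in range(p, qmax + 1, p):
                phi[n] -= phi[n] // p         # multiply phi[n] by (1 - 1/p)
            for n in range(p * p, qmax + 1, p * p):
                squarefree[n] = False
    prefix = [0.0] * (qmax + 1)
    for q in range(1, qmax + 1):
        prefix[q] = prefix[q - 1] + (1.0 / phi[q] if squarefree[q] else 0.0)
    return prefix


def first_mv_lower_bound(k0, qmax=None):
    """Best lower bound on H(k0) from the first Montgomery-Vaughan
    inequality over integers 2 <= Q <= qmax."""
    if qmax is None:
        qmax = 2 * math.isqrt(k0) + 2         # heuristic range around sqrt(k0)
    prefix = mu2_over_phi_prefix(qmax)
    return max(math.ceil(k0 * prefix[q] - 1 - q * q)
               for q in range(2, qmax + 1))
```

Floating-point summation is used for simplicity; exact rational arithmetic (for example with `fractions.Fraction`) could be substituted if the last digit matters.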

Second Montgomery-Vaughan large sieve inequality

The second Montgomery-Vaughan large sieve inequality [MV1973, Corollary 1] gives

[math]\displaystyle{ k_0 \leq (\sum_{q \leq z} (H(k_0)+1+cqz)^{-1} \mu(q)^2 \prod_{p|q} \frac{1}{p-1})^{-1} }[/math]

for any [math]\displaystyle{ z \gt 1 }[/math], which is a parameter similar to [math]\displaystyle{ Q }[/math] in the previous inequality, and [math]\displaystyle{ c }[/math] is an absolute constant. In the original paper of Montgomery and Vaughan, [math]\displaystyle{ c }[/math] was taken to be [math]\displaystyle{ 3/2 }[/math]; this was then reduced to [math]\displaystyle{ \sqrt{22}/\pi }[/math] [B1995, p.162] and then to [math]\displaystyle{ 3.2/\pi }[/math] [M1978]. It is conjectured that [math]\displaystyle{ c }[/math] can in fact be taken to be [math]\displaystyle{ 1 }[/math].

Benchmarks

Efforts to fill in the blank fields in this table are very welcome.

[math]\displaystyle{ k_0 }[/math]	3,500,000	181,000	34,429	26,024	23,283	10,719	5,000	2,000	1,000	672
Upper bounds
Zhang sieve	59,093,364	2,486,370	411,932	303,558	268,536	114,806	49,578	17,766	8,212	5,216
Hensley-Richards sieve	57,554,086	2,422,558	402,790	297,454	262,794	112,868	48,634	17,726	8,258	5,314
Asymmetric Hensley-Richards			401,700	297,076	262,566	112,646	48,484
Schinzel sieve
Greedy-Schinzel sieve						108,694	46,968	17,054	7,854	5,030
Best known tuple	57,554,086	2,345,896	386,532	285,210	252,804	108,462	46,824	16,984	7,808	5,026*, 4,998
Predictions
[math]\displaystyle{ k_0 \log k_0 + k_0 }[/math]	56,238,957	2,372,232	394,096	290,604	257,405	110,119	47,586	17,202	7,907	5,046
Lower bounds
MV with [math]\displaystyle{ c=1 }[/math] (conjectural)			234,642	172,924
MV with [math]\displaystyle{ c=3.2/\pi }[/math]			234,322	172,719
MV with [math]\displaystyle{ c=\sqrt{22}/\pi }[/math]			227,078	167,860
Second Montgomery-Vaughan			226,987	167,793
Brun-Titchmarsh			211,046	155,555
First Montgomery-Vaughan			196,729	145,711

* The best known tuple for [math]\displaystyle{ k_0 = 672 }[/math] has width 4998, due to Engelsma; 5026 is the width of the best known tuple that was found by the methods that yielded the entries to the left of 5026 in the "best known tuple" row.
