Selberg sieve variational problem

Let [math]\displaystyle{ M_k }[/math] be the quantity

[math]\displaystyle{ \displaystyle M_k := \sup_F \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)} }[/math]

where [math]\displaystyle{ F }[/math] ranges over square-integrable functions on the simplex

[math]\displaystyle{ \displaystyle {\mathcal R}_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\ldots+t_k \leq 1 \} }[/math]

with [math]\displaystyle{ I_k, J_k^{(m)} }[/math] being the quadratic forms

[math]\displaystyle{ \displaystyle I_k(F) := \int_{{\mathcal R}_k} F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k }[/math]

and

[math]\displaystyle{ \displaystyle J_k^{(m)}(F) := \int_{{\mathcal R}_{k-1}} (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2 dt_1 \ldots dt_{m-1} dt_{m+1} \ldots dt_k. }[/math]

It is known that [math]\displaystyle{ DHL[k,m+1] }[/math] holds whenever [math]\displaystyle{ EH[\theta] }[/math] holds and [math]\displaystyle{ M_k \gt \frac{2m}{\theta} }[/math]. Thus for instance, [math]\displaystyle{ M_k \gt 2 }[/math] implies [math]\displaystyle{ DHL[k,2] }[/math] on the Elliott-Halberstam conjecture, and [math]\displaystyle{ M_k\gt 4 }[/math] implies [math]\displaystyle{ DHL[k,2] }[/math] unconditionally.

Upper bounds

We have the upper bound

[math]\displaystyle{ \displaystyle M_k \leq \frac{k}{k-1} \log k }[/math] (1)

that is proven as follows.

The key estimate is

[math]\displaystyle{ \displaystyle \left(\int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)\ dt_1\right)^2 \leq \frac{\log k}{k-1} \int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)^2 (1 - t_1-\ldots-t_k+ kt_1)\ dt_1. }[/math] (2)

Assuming this estimate, we may integrate in [math]\displaystyle{ t_2,\ldots,t_k }[/math] to conclude that

[math]\displaystyle{ \displaystyle J_k^{(1)}(F) \leq \frac{\log k}{k-1} \int F^2 (1-t_1-\ldots-t_k+kt_1)\ dt_1 \ldots dt_k }[/math]

which symmetrises to

[math]\displaystyle{ \sum_{m=1}^k J_k^{(m)}(F) \leq k \frac{\log k}{k-1} \int F^2\ dt_1 \ldots dt_k }[/math]

giving the desired upper bound (1).

It remains to prove (2). By Cauchy-Schwarz, it suffices to show that

[math]\displaystyle{ \displaystyle \int_0^{1-t_2-\ldots-t_k} \frac{dt_1}{1 - t_1-\ldots-t_k+ kt_1} \leq \frac{\log k}{k-1}. }[/math]

But writing [math]\displaystyle{ s = t_2+\ldots+t_k }[/math], the left-hand side evaluates to

[math]\displaystyle{ \frac{1}{k-1} (\log k(1-s) - \log (1-s) ) = \frac{\log k}{k-1} }[/math]

as required.
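As a numerical sanity check, the following minimal sketch evaluates the right-hand side of (1) and approximates the integral from the Cauchy-Schwarz step by a midpoint rule. The function names, the sample value s = 0.3, and the grid size are illustrative assumptions, not part of the original argument.

```python
import math


def upper_bound(k: int) -> float:
    """Right-hand side of (1): k/(k-1) * log k."""
    return k / (k - 1) * math.log(k)


def key_integral(k: int, s: float, n: int = 20_000) -> float:
    """Midpoint-rule approximation of int_0^{1-s} dt / (1 - s + (k-1) t)."""
    width = (1.0 - s) / n
    return sum(width / (1.0 - s + (k - 1) * (i + 0.5) * width) for i in range(n))


if __name__ == "__main__":
    for k in (3, 4, 5, 10, 59):
        # The last column should be close to 1 if the integral equals log k / (k-1).
        ratio = key_integral(k, 0.3) * (k - 1) / math.log(k)
        print(k, round(upper_bound(k), 5), ratio)
```

The integral does not depend on s, which is why a single sample value of s suffices for the check.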

Lower bounds

We will need some parameters [math]\displaystyle{ c, T, \tau \gt 0 }[/math] and [math]\displaystyle{ a \gt 1 }[/math] to be chosen later (in practice we take [math]\displaystyle{ c }[/math] close to [math]\displaystyle{ 1/\log k }[/math], [math]\displaystyle{ T }[/math] a small multiple of [math]\displaystyle{ c }[/math], and [math]\displaystyle{ \tau }[/math] a small multiple of [math]\displaystyle{ c/k }[/math]).

For any symmetric function F on the simplex [math]\displaystyle{ {\mathcal R}_k }[/math], one has

[math]\displaystyle{ J_k^{(1)}(F) \leq \frac{M_k}{k} I_k(F) }[/math]

and so by scaling, if F is a symmetric function on the dilated simplex [math]\displaystyle{ r \cdot {\mathcal R}_k }[/math], one has

[math]\displaystyle{ J_k^{(1)}(F) \leq \frac{r M_k}{k} I_k(F) }[/math]

after adjusting the definition of the functionals [math]\displaystyle{ I_k, J_k^{(1)} }[/math] suitably for this rescaled simplex.

Now let us apply this inequality for r in the interval [math]\displaystyle{ [1,1+\tau] }[/math] and to truncated tensor product functions

[math]\displaystyle{ F(t_1,\ldots,t_k) = 1_{t_1+\ldots+t_k\leq r} \prod_{i=1}^k m_2^{-1/2} g(t_i) }[/math]

for some bounded measurable [math]\displaystyle{ g: [0,T] \to {\mathbf R} }[/math], not identically zero, with [math]\displaystyle{ m_2 := \int_0^T g(t)^2\ dt }[/math]. We have the probabilistic interpretations

[math]\displaystyle{ J_k^{(1)}(F) = m_2^{-1} {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 }[/math]

and

[math]\displaystyle{ I_k(F) = m_2^{-1} {\mathbf E} \int_{[0,r - S_{k-1}]} g(t)^2\ dt }[/math]

[math]\displaystyle{ = {\mathbf P} (S_k \leq r) }[/math]

where [math]\displaystyle{ S_{k-1} := X_1 + \ldots + X_{k-1} }[/math], [math]\displaystyle{ S_k := X_1 + \ldots + X_k }[/math] and [math]\displaystyle{ X_1,\ldots,X_k }[/math] are iid random variables in [0,T] with law [math]\displaystyle{ m_2^{-1} g(t)^2\ dt }[/math], and we adopt the convention that [math]\displaystyle{ \displaystyle \int_{[a,b]} f }[/math] vanishes when [math]\displaystyle{ b \lt a }[/math]. We thus have

[math]\displaystyle{ {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 \leq \frac{r M_k}{k} m_2 {\mathbf P} ( S_k \leq r ) }[/math] (*)

for any r.

We now introduce the random function [math]\displaystyle{ h = h_r }[/math] by

[math]\displaystyle{ h(t) := \frac{1}{r - S_{k-1} + (k-1) t} 1_{S_{k-1} \lt r}. }[/math]

Observe that if [math]\displaystyle{ S_{k-1} \lt r }[/math], then

[math]\displaystyle{ \int_{[0, r-S_{k-1}]} h(t)\ dt = \frac{\log k}{k-1} }[/math]

and hence by the Lagrange identity (which quantifies the defect in Cauchy-Schwarz)

[math]\displaystyle{ ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 = \frac{\log k}{k-1} \int_{[0, r - S_{k-1}]} \frac{g(t)^2}{h(t)}\ dt - \frac{1}{2} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt. }[/math]
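The identity above has an exact discrete analogue, in which each integral is replaced by a Riemann sum on a common grid. The sketch below checks that analogue in floating point. The sample choices of g, k, r, c and of the value s_prev standing in for [math]\displaystyle{ S_{k-1} }[/math] are illustrative assumptions, and the first term uses the discretised integral of h in place of [math]\displaystyle{ \frac{\log k}{k-1} }[/math].

```python
def lagrange_gap(g_vals, h_vals, width):
    """Return LHS minus RHS of the discretised identity (should be ~0)."""
    int_g = width * sum(g_vals)
    int_h = width * sum(h_vals)
    int_g2_over_h = width * sum(g * g / h for g, h in zip(g_vals, h_vals))
    pairs = list(zip(g_vals, h_vals))
    # Double sum of (g(s) h(t) - g(t) h(s))^2 / (h(s) h(t)) over the grid.
    defect = 0.5 * width * width * sum(
        (gs * ht - gt * hs) ** 2 / (hs * ht)
        for gs, hs in pairs
        for gt, ht in pairs
    )
    return int_g ** 2 - (int_h * int_g2_over_h - defect)


def sample_gap(k=5, r=1.0, s_prev=0.4, c=0.5, n=200):
    """Gap for the sample g(t) = 1/(c+(k-1)t) and h(t) = 1/(r-s_prev+(k-1)t)."""
    length = r - s_prev
    width = length / n
    grid = [(i + 0.5) * width for i in range(n)]
    g_vals = [1.0 / (c + (k - 1) * t) for t in grid]
    h_vals = [1.0 / (length + (k - 1) * t) for t in grid]
    return lagrange_gap(g_vals, h_vals, width)


if __name__ == "__main__":
    print(sample_gap())
```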

We also note that (using the iid nature of the [math]\displaystyle{ X_i }[/math] to symmetrise)

[math]\displaystyle{ {\mathbf E} \int_{[0, r - S_{k-1}]} g(t)^2/h(t)\ dt = m_2 {\mathbf E} 1_{S_k \leq r} / h( X_k ) }[/math]

[math]\displaystyle{ = m_2 {\mathbf E} 1_{S_k \leq r} (r - X_1 - \ldots - X_k + k X_k ) }[/math]

[math]\displaystyle{ = m_2 r {\mathbf E} 1_{S_k \leq r} }[/math]

[math]\displaystyle{ = m_2 r {\mathbf P}( S_k \leq r ). }[/math]

Inserting these bounds into (*) and rearranging, we conclude that

[math]\displaystyle{ r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{2m_2} {\mathbf E} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt }[/math]

where [math]\displaystyle{ \Delta_k := \frac{k}{k-1} \log k - M_k }[/math] is the defect from the upper bound. Splitting the integrand into regions where s or t is larger than or less than T, we obtain

[math]\displaystyle{ r \Delta_k {\mathbf P} ( S_k \leq r ) \leq Y_1 + Y_2 }[/math]

where

[math]\displaystyle{ Y_1 := \frac{k}{m_2} {\mathbf E} \int_{[0,T]} \int_{[T,r-S_{k-1}]} \frac{g(t)^2}{h(t)} h(s)\ ds dt }[/math]

and

[math]\displaystyle{ Y_2 := \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt. }[/math]

We now focus on [math]\displaystyle{ Y_1 }[/math]. It is only non-zero when [math]\displaystyle{ S_{k-1} \leq r-T }[/math]. Bounding [math]\displaystyle{ h(s) \leq \frac{1}{(k-1)s} }[/math], we see that

[math]\displaystyle{ Y_1 \leq \frac{k}{(k-1) m_2} {\mathbf E} \int_0^T \frac{g(t)^2}{h(t)}\ dt \times \log_+ \frac{r-S_{k-1}}{T} }[/math]

where [math]\displaystyle{ \log_+(x) }[/math] is equal to [math]\displaystyle{ \log x }[/math] when [math]\displaystyle{ x \geq 1 }[/math] and zero otherwise. We can rewrite this as

[math]\displaystyle{ Y_1 \leq \frac{k}{k-1} {\mathbf E} 1_{S_k \leq r} \frac{1}{h(X_k)} \log_+ \frac{r-S_{k-1}}{T}. }[/math]

We write [math]\displaystyle{ \frac{1}{h(X_k)} = r-S_k + kX_k }[/math] and [math]\displaystyle{ \frac{r-S_{k-1}}{T} = \frac{r-S_k}{T} + \frac{X_k}{T} }[/math]. Using the bound [math]\displaystyle{ \log_+(x+y) \leq \log_+(x) + \log_+(1+y) }[/math] we have

[math]\displaystyle{ \log_+ \frac{r-S_{k-1}}{T} \leq \log_+ \frac{r-S_{k}}{T} + \log(1 + \frac{X_k}{T}) }[/math]

and thus (bounding [math]\displaystyle{ \log(1+\frac{X_k}{T}) \leq \frac{X_k}{T} }[/math] in the cross term)

[math]\displaystyle{ Y_1 \leq \frac{k}{k-1} {\mathbf E} \left[ (r-S_k + kX_k) \log_+ \frac{r-S_{k}}{T} + (r-S_k)_+ \frac{X_k}{T} + k X_k \log(1+\frac{X_k}{T}) \right]. }[/math]

Symmetrising, we conclude that

[math]\displaystyle{ Y_1 \leq \frac{k}{k-1} (Z_1 + Z_2 + Z_3) }[/math]

where

[math]\displaystyle{ Z_1 := {\mathbf E} r \log_+ \frac{r-S_{k}}{T} }[/math]

[math]\displaystyle{ Z_2 := {\mathbf E} (r-S_k)_+ \frac{S_k}{kT} }[/math]

[math]\displaystyle{ Z_3 := m_2^{-1} \int_0^T kt \log(1 + \frac{t}{T}) g(t)^2\ dt. }[/math]

For [math]\displaystyle{ Z_2 }[/math], which is a tiny term, we use the crude bound

[math]\displaystyle{ Z_2 \leq \frac{r^2}{4kT}. }[/math]

For [math]\displaystyle{ Z_1 }[/math], we use the bound

[math]\displaystyle{ \log_+ x \leq \frac{(x+2a\log a - a)^2}{4a^2 \log a} }[/math]

which can be verified because the LHS is concave for [math]\displaystyle{ x \geq 1 }[/math], while the RHS is convex and is tangent to the LHS at x=a. We then have

[math]\displaystyle{ \log_+ \frac{r-S_{k}}{T} \leq \frac{(r-S_k+2aT\log a-aT)^2}{4a^2 T^2\log a} }[/math]

and thus

[math]\displaystyle{ Z_1 \leq r (\frac{(r-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2 \log a} ) }[/math]

where

[math]\displaystyle{ \mu := m_2^{-1} \int_0^T t g(t)^2\ dt }[/math]

[math]\displaystyle{ \sigma^2 := m_2^{-1} \int_0^T t^2 g(t)^2\ dt - \mu^2. }[/math]

A good choice for [math]\displaystyle{ a=a[r] }[/math] here is [math]\displaystyle{ a = \frac{r-k\mu}{T} }[/math], in which case the formula simplifies a bit to

[math]\displaystyle{ Z_1 \leq r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4a^2 T^2 \log a}) }[/math]
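The quadratic majorant of [math]\displaystyle{ \log_+ }[/math] and the simplification of the [math]\displaystyle{ Z_1 }[/math] bound under the choice [math]\displaystyle{ a = \frac{r-k\mu}{T} }[/math] can both be checked numerically. The following is a minimal sketch; the function names, the test grid and any sample parameter values passed in are illustrative assumptions.

```python
import math


def log_plus(x: float) -> float:
    """log x for x >= 1, zero otherwise."""
    return math.log(x) if x >= 1 else 0.0


def quad_majorant(x: float, a: float) -> float:
    """(x + 2a log a - a)^2 / (4 a^2 log a), defined for a > 1."""
    return (x + 2 * a * math.log(a) - a) ** 2 / (4 * a * a * math.log(a))


def min_gap(a: float, x_max: float = 200.0, n: int = 20_000) -> float:
    """Smallest value of majorant minus log_plus on a grid over [0, x_max]."""
    return min(
        quad_majorant(x, a) - log_plus(x)
        for x in (x_max * i / n for i in range(n + 1))
    )


def z1_general(r, k, mu, sigma2, T, a):
    """Bound on Z_1 for a general parameter a > 1."""
    shift = 2 * a * T * math.log(a) - a * T
    return r * ((r - k * mu + shift) ** 2 + k * sigma2) / (4 * a * a * T * T * math.log(a))


def z1_simplified(r, k, mu, sigma2, T):
    """Bound on Z_1 after substituting a = (r - k mu) / T."""
    a = (r - k * mu) / T
    return r * (math.log(a) + k * sigma2 / (4 * a * a * T * T * math.log(a)))
```

The majorant touches [math]\displaystyle{ \log_+ }[/math] at x = a, so `min_gap` should return a value that is zero up to rounding and never significantly negative.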

Thus far, our arguments have been valid for arbitrary functions [math]\displaystyle{ g }[/math]. We now specialise to functions of the form

[math]\displaystyle{ g(t) := \frac{1}{c+(k-1)t}. }[/math]
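For this choice of g the quantities [math]\displaystyle{ m_2, \mu, \sigma^2 }[/math] have elementary closed forms, obtained by substituting [math]\displaystyle{ u = c + (k-1)t }[/math]. The sketch below evaluates them and cross-checks against a midpoint rule. The closed forms are a direct computation rather than something stated in the text, and the function names and sample parameters are illustrative assumptions.

```python
import math


def moments(k, c, T):
    """Closed forms for m_2, mu, sigma^2 when g(t) = 1/(c + (k-1) t) on [0, T]."""
    U = c + (k - 1) * T  # value of u = c + (k-1) t at t = T
    m2 = (1.0 / c - 1.0 / U) / (k - 1)
    first = (math.log(U / c) + c / U - 1.0) / (k - 1) ** 2  # int t g^2
    second = ((U - c) - 2.0 * c * math.log(U / c) + c - c * c / U) / (k - 1) ** 3  # int t^2 g^2
    mu = first / m2
    return m2, mu, second / m2 - mu * mu


def moments_numeric(k, c, T, n=100_000):
    """Same three quantities via a midpoint rule, for comparison."""
    width = T / n
    m2 = first = second = 0.0
    for i in range(n):
        t = (i + 0.5) * width
        weight = width / (c + (k - 1) * t) ** 2
        m2 += weight
        first += t * weight
        second += t * t * weight
    mu = first / m2
    return m2, mu, second / m2 - mu * mu


if __name__ == "__main__":
    k = 50
    c = 1.0 / math.log(k)
    print(moments(k, c, 0.5 * c))
```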

Note the identity

[math]\displaystyle{ \displaystyle g(t) - h(t) = (r - S_{k-1} - c) g(t) h(t) }[/math]

on [math]\displaystyle{ [0, \min(r-S_{k-1},T)] }[/math]. Thus

[math]\displaystyle{ Y_2 = \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{((g-h)(s) h(t)-(g-h)(t) h(s))^2}{h(s) h(t)}\ ds dt }[/math]

[math]\displaystyle{ = \frac{k}{2 m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} (g(s)-g(t))^2 h(s) h(t)\ ds dt. }[/math]

Bounding [math]\displaystyle{ (g(s)-g(t))^2 \leq g(s)^2+g(t)^2 }[/math] and using symmetry, we conclude

[math]\displaystyle{ Y_2 \leq \frac{k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h(s) h(t)\ ds dt. }[/math]

Since [math]\displaystyle{ \int_0^{r-S_{k-1}} h(t)\ dt = \frac{\log k}{k-1} }[/math], we conclude that

[math]\displaystyle{ Y_2 \leq \frac{k}{k-1} Z_4 }[/math]

where [math]\displaystyle{ Z_4 = Z_4[r] }[/math] is the quantity

[math]\displaystyle{ Z_4 := \frac{\log k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h_r(s)\ ds. }[/math]

Putting all this together, we have

[math]\displaystyle{ r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{k-1} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4a^2 T^2 \log a}) + \frac{r^2}{4kT} + Z_3 + Z_4[r] ). }[/math]

At this point we encounter a technical problem that [math]\displaystyle{ Z_4 }[/math] diverges logarithmically (up to a cap of [math]\displaystyle{ \log k }[/math]) as [math]\displaystyle{ S_{k-1} }[/math] approaches r. To deal with this issue we average in r, and specifically over the interval [math]\displaystyle{ [1,1+\tau] }[/math]. Using the slightly crude bound

[math]\displaystyle{ \int_0^1 (1+u\tau) 1_{x \gt 1+u\tau}\ du \leq \frac{1+\tau/2}{(1-k\mu)^2} (x-k\mu)^2 }[/math]

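for all x, applied with [math]\displaystyle{ x = S_k }[/math] and combined with the variance identity [math]\displaystyle{ {\mathbf E} (S_k - k\mu)^2 = k \sigma^2 }[/math], we obtain

[math]\displaystyle{ \int_0^1 (1+u\tau) {\mathbf P} ( S_k \leq 1+u\tau )\ du \geq (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1-k\mu)^2}) }[/math]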

provided that [math]\displaystyle{ k \mu \lt 1 }[/math], and hence

[math]\displaystyle{ \Delta_k (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1-k\mu)^2}) \leq \frac{k}{k-1} ( \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4a^2 T^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + \int_0^1 Z_4[1+u\tau]\ du ). }[/math]

If [math]\displaystyle{ \tau \leq c }[/math], we may bound

[math]\displaystyle{ \int_0^1 Z_4[1+u\tau]\ du \leq \frac{\log k}{m_2} {\mathbf E} ((1 - S_{k-1})^2 + c^2) \int_{[0,T]} g(s)^2 (\int_0^1 h_{1+u\tau}(s) 1_{1+u\tau-S_{k-1} \geq s}\ du)\ ds. }[/math]

Observe that

[math]\displaystyle{ \int_0^1 h_{1+u\tau}(s) 1_{1+u\tau-S_{k-1} \geq s}\ du = \int_0^1 \frac{du}{1-S_{k-1}+u\tau+(k-1)s} 1_{1-S_{k-1}+u\tau \geq s} }[/math]

[math]\displaystyle{ = \frac{1}{\tau} \int_{[\max(ks, 1-S_{k-1}+(k-1)s), 1-S_{k-1}+\tau+(k-1)s]} \frac{du}{u} }[/math]

[math]\displaystyle{ \leq \frac{1}{\tau} \log \frac{ks+\tau}{ks} }[/math]

and so

[math]\displaystyle{ \int_0^1 Z_4[1+u\tau]\ du \leq W \frac{\log k}{\tau} {\mathbf E} ((1 - S_{k-1})^2 + c^2) }[/math]

[math]\displaystyle{ = W \frac{\log k}{\tau} ((1 - (k-1)\mu)^2 + (k-1) \sigma^2 + c^2) }[/math]

where

[math]\displaystyle{ W := m_2^{-1} \int_{[0,T]} g(s)^2 \log(1+\frac{\tau}{ks})\ ds. }[/math]
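The averaging step that produces the factor [math]\displaystyle{ \log(1+\frac{\tau}{ks}) }[/math] can be checked directly. The sketch below computes the average of [math]\displaystyle{ h_{1+u\tau}(s) }[/math] over u both from the closed form and by a midpoint rule, and compares it with the logarithmic bound. The function names and the sample values (with s_prev standing in for [math]\displaystyle{ S_{k-1} }[/math]) are illustrative assumptions.

```python
import math


def averaged_h_exact(k, tau, s_prev, s):
    """Closed form of int_0^1 h_{1+u tau}(s) 1_{1+u tau - s_prev >= s} du, for s > 0."""
    lo = max(k * s, 1.0 - s_prev + (k - 1) * s)
    hi = 1.0 - s_prev + tau + (k - 1) * s
    return math.log(hi / lo) / tau if hi > lo else 0.0


def averaged_h_numeric(k, tau, s_prev, s, n=20_000):
    """Midpoint-rule approximation of the same average over u in [0, 1]."""
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        room = 1.0 + u * tau - s_prev  # r - S_{k-1} with r = 1 + u*tau
        if room >= s:
            total += 1.0 / (room + (k - 1) * s)
    return total / n


def log_bound(k, tau, s):
    """The upper bound (1/tau) * log((k s + tau) / (k s))."""
    return math.log((k * s + tau) / (k * s)) / tau
```

Equality in the bound occurs when s equals 1 minus s_prev, which is why the estimate loses little near the diagonal.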

We thus arrive at the final bound

[math]\displaystyle{ \Delta_k \leq \frac{k}{k-1} \frac{ \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4a^2 T^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + W \frac{\log k}{\tau} ((1 - (k-1)\mu)^2 + (k-1) \sigma^2 + c^2)}{(1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1-k\mu)^2})} }[/math]

provided that [math]\displaystyle{ k \mu \lt 1 }[/math] and the denominator is positive.
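To see how the pieces of the final bound fit together, the following sketch evaluates its right-hand side by midpoint quadrature for [math]\displaystyle{ g(t) = \frac{1}{c+(k-1)t} }[/math], with [math]\displaystyle{ a = \frac{r-k\mu}{T} }[/math] as chosen earlier. The function names, grid sizes and parameter values are illustrative assumptions and are not tuned; for moderate k the resulting number may well be too large to be useful.

```python
import math


def midpoint(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / n
    return width * sum(f(lo + (i + 0.5) * width) for i in range(n))


def delta_bound(k, c, T, tau, n=20_000):
    """Right-hand side of the final bound on Delta_k (assumes tau <= c)."""
    if tau > c:
        raise ValueError("the Z_4 estimate assumes tau <= c")
    g2 = lambda t: 1.0 / (c + (k - 1) * t) ** 2
    m2 = midpoint(g2, 0.0, T, n)
    mu = midpoint(lambda t: t * g2(t), 0.0, T, n) / m2
    sigma2 = midpoint(lambda t: t * t * g2(t), 0.0, T, n) / m2 - mu * mu
    if k * mu >= 1:
        raise ValueError("need k * mu < 1")

    def z1_plus_z2(r):
        a = (r - k * mu) / T
        if a <= 1:
            raise ValueError("need a = (r - k mu)/T > 1")
        z1 = r * (math.log(a) + k * sigma2 / (4 * a * a * T * T * math.log(a)))
        return z1 + r * r / (4 * k * T)

    averaged = midpoint(z1_plus_z2, 1.0, 1.0 + tau, 2_000) / tau
    z3 = midpoint(lambda t: k * t * math.log(1 + t / T) * g2(t), 0.0, T, n) / m2
    w = midpoint(lambda s: g2(s) * math.log(1 + tau / (k * s)), 0.0, T, n) / m2
    z4 = w * math.log(k) / tau * ((1 - (k - 1) * mu) ** 2 + (k - 1) * sigma2 + c * c)
    denom = (1 + tau / 2) * (1 - k * sigma2 / (1 - k * mu) ** 2)
    if denom <= 0:
        raise ValueError("denominator must be positive")
    return k / (k - 1) * (averaged + z3 + z4) / denom


if __name__ == "__main__":
    k = 1000
    c = 1.0 / math.log(k)
    print(delta_bound(k, c, 0.1 * c, c))
```

The midpoint rule never samples s = 0, so the logarithmic singularity in the integrand of W causes no division by zero, although it does limit the accuracy of that term.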

If we set [math]\displaystyle{ c = \tau := 1/\log k }[/math], with [math]\displaystyle{ T }[/math] a small multiple of c, then [math]\displaystyle{ \mu \approx \frac{1}{k}(1 - \frac{A}{\log k}) }[/math] for a large absolute constant A, and [math]\displaystyle{ \sigma^2 }[/math] is a small multiple of [math]\displaystyle{ \frac{1}{k \log^2 k} }[/math]. This makes the denominator comparable to 1; if we then set [math]\displaystyle{ a \sim c/T }[/math], then one can check that all the terms in the numerator are O(1), finally giving the bound

[math]\displaystyle{ \Delta_k = O(1) }[/math]

and thus we have the lower bound

[math]\displaystyle{ M_k \geq\frac{k}{k-1} \log k - O(1) = \log k - O(1) }[/math].

World records

[math]\displaystyle{ k }[/math]	[math]\displaystyle{ M_k }[/math] lower	[math]\displaystyle{ M_k }[/math] upper	[math]\displaystyle{ M'_k }[/math] lower	[math]\displaystyle{ M'_k }[/math] upper	[math]\displaystyle{ M''_k }[/math] lower	[math]\displaystyle{ M''_k }[/math] upper
2	1.38593...	1.38593...	2	2	2	2
3	1.646	1.648	1.842	2.080	1.914	2.38593...
4	1.845	1.848	1.937	2.198	-	2.648
5	2.001162	2.011797	2.059	2.311	-	2.848
10	2.53	2.55842	-	-	-	-
20	3.05	3.1534	-	-	-	-
30	3.34	3.51848	-	-	-	-
40	3.52	3.783466	-	-	-	-
50	3.66	3.99186	-	-	-	-
59	4.06	4.1478398	-	-	-	-

For k>2, all upper bounds on [math]\displaystyle{ M_k }[/math] come from (1). Upper bounds on [math]\displaystyle{ M'_k }[/math] come from the inequality [math]\displaystyle{ M'_k \leq \frac{k}{k-1} M_{k-1} }[/math] that follows from an averaging argument, and upper bounds on [math]\displaystyle{ M''_k }[/math] (on EH, using the prism [math]\displaystyle{ \{ t_1+\ldots+t_{k-1},t_k \leq 1\} }[/math] as the domain) come from the inequality [math]\displaystyle{ M''_k \leq M_{k-1} + 1 }[/math] by comparing [math]\displaystyle{ M''_k }[/math] with a variational problem on the prism (details here).

For k=2, [math]\displaystyle{ M'_2=M''_2 = 2 }[/math] can be computed exactly by taking F to be the indicator function of the unit square (for the lower bound), and by using Cauchy-Schwarz (for the upper bound). [math]\displaystyle{ M_2=\frac{1}{1-W(1/e)} \approx 1.38593 }[/math], where W is the Lambert W function, can be computed exactly as the solution to the equation [math]\displaystyle{ 2-\frac{1}{x} + \log(1-\frac{1}{x}) = 0 }[/math].
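The value of [math]\displaystyle{ M_2 }[/math] can be reproduced with a few lines of code. The sketch below computes the Lambert W function by Newton's method (a standard technique chosen here for illustration, with an assumed starting point and iteration count) and checks that the result solves the stated equation.

```python
import math


def lambert_w(y: float, iterations: int = 50) -> float:
    """Principal branch of W for y > 0: solve w * exp(w) = y by Newton's method."""
    w = 0.0 if y < 1 else math.log(y)
    for _ in range(iterations):
        ew = math.exp(w)
        w -= (w * ew - y) / (ew * (w + 1))
    return w


M2 = 1.0 / (1.0 - lambert_w(1.0 / math.e))
# Residual of the defining equation 2 - 1/x + log(1 - 1/x) = 0 at x = M2.
residual = 2 - 1 / M2 + math.log(1 - 1 / M2)

if __name__ == "__main__":
    print(M2, residual)
```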
