Selberg sieve variational problem

From Polymath1Wiki

(Difference between revisions)
Jump to: navigation, search
(World records)
(World records)
Line 338: Line 338:
| [ 1.937]
| [ 1.937]
| 2.198
| 2.198
| [ 1.951]
| 2.648
| 2.648

Revision as of 16:56, 23 December 2013

Let Mk be the quantity

\displaystyle M_k := \sup_F \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)}

where F ranges over square-integrable functions on the simplex

\displaystyle {\mathcal R}_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\ldots+t_k \leq 1 \}

with I_k, J_k^{(m)} being the quadratic forms

\displaystyle I_k(F) := \int_{{\mathcal R}_k} F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k


\displaystyle J_k^{(m)}(F) := \int_{{\mathcal R}_{k-1}} (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2 dt_1 \ldots dt_{m-1} dt_{m+1} \ldots dt_k.

It is known that DHL[k,m + 1] holds whenever EH[θ] holds and M_k > \frac{2m}{\theta}. Thus for instance, Mk > 2 implies DHL[k,2] on the Elliott-Halberstam conjecture, and Mk > 4 implies DHL[k,2] unconditionally.


Upper bounds

We have the upper bound

\displaystyle M_k \leq \frac{k}{k-1} \log k (1)

that is proven as follows.

The key estimate is

 \displaystyle \int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)\ dt_1)^2 \leq \frac{\log k}{k-1} \int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)^2 (1 - t_1-\ldots-t_k+ kt_1)\ dt_1.. (2)

Assuming this estimate, we may integrate in t_2,\ldots,t_k to conclude that

\displaystyle J_k^{(1)}(F) \leq \frac{\log k}{k-1} \int F^2 (1-t_1-\ldots-t_k+kt_1)\ dt_1 \ldots dt_k

which symmetrises to

\sum_{m=1}^k J_k^{(m)}(F) \leq k \frac{\log k}{k-1} \int F^2\ dt_1 \ldots dt_k

giving the desired upper bound (1).

It remains to prove (2). By Cauchy-Schwarz, it suffices to show that

\displaystyle \int_0^{1-t_2-\ldots-t_k} \frac{dt_1}{1 - t_1-\ldots-t_k+ kt_1} \leq \frac{\log k}{k-1}.

But writing s = t_2+\ldots+t_k, the left-hand side evaluates to

\frac{1}{k-1} (\log k(1-s) - \log (1-s) ) = \frac{\log k}{k-1}

as required.

Lower bounds

We will need some parameters c,T,τ > 0 and a > 1 to be chosen later (in practice we take c close to 1 / logk, T a small multiple of c, and τ a small multiple of c/k.

For any symmetric function F on the simplex {\mathcal R}_k, one has

J_k^{(1)}(F) \leq \frac{M_k}{k} I_k(F)

and so by scaling, if F is a symmetric function on the dilated simplex r \cdot {\mathcal R}_k, one has

J_k^{(1)}(F) \leq \frac{r M_k}{k} I_k(F)

after adjusting the definition of the functionals I_k, J_k^{(1)} suitably for this rescaled simplex.

Now let us apply this inequality r in the interval [1,1 + τ] and to truncated tensor product functions

F(t_1,\ldots,t_k) = 1_{t_1+\ldots+t_k\leq r} \prod_{i=1}^k m_2^{-1/2} g(t_i)

for some bounded measurable g: [0,T] \to {\mathbf R}, not identically zero, with m_2 := \int_0^T g(t)^2\ dt. We have the probabilistic interpretations

J_k^{(1)}(F) := m_2^{-1} {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2


I(F) := m_2^{-1} {\mathbf E} \int_{[0,r - S_{k-1}]} g(t)^2\ dt
 = {\mathbf P} (S_k \leq r)

where S_{k-1} := X_1 + \ldots X_{k-1}, S_k := X_1 + \ldots + X_k and X_1,\ldots,X_k are iid random variables in [0,T] with law m_2^{-1} g(t)^2\ dt, and we adopt the convention that \displaystyle \int_{[a,b]} f vanishes when b < a. We thus have

 {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 \leq \frac{r M_k}{k} m_2 {\mathbf P} ( S_k \leq r ) (*)

for any r.

We now introduce the random function h = hr by

 h(t) := \frac{1}{r - S_{k-1} + (k-1) t} 1_{S_{k-1} < r}.

Observe that if Sk − 1 < r, then

 \int_{[0, r-S_{k-1}]} h(t)\ dt = \frac{\log k}{k-1}

and hence by the Legendre identity

( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 = \frac{\log k}{k-1} \int_{[0, r - S_{k-1}]} \frac{g(t)^2}{h(t)}\ dt - \frac{1}{2} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt.

We also note that (using the iid nature of the Xi to symmetrise)

 {\mathbf E} \int_{[0, r - S_{k-1}]} g(t)^2/h(t)\ dt = m_2 {\mathbf E} 1_{S_k \leq r} / h( X_k )
 = m_2 {\mathbf E} 1_{S_k \leq r} (1 - X_1 - \ldots - X_k + k X_k )
 = m_2 {\mathbf E} 1_{S_k \leq r}
 = m_2 {\mathbf P}( S_k \leq r ).

Inserting these bounds into (*) and rearranging, we conclude that

 r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{2m_2} {\mathbf E} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt

where \Delta_k := \frac{k}{k-1} \log k - M_k is the defect from the upper bound. Splitting the integrand into regions where s or t is larger than or less than T, we obtain

 r \Delta_k {\mathbf P} ( S_k \leq r ) \leq Y_1 + Y_2


Y_1 := \frac{k}{m_2} {\mathbf E} \int_{[0,T]} \int_{[T,r-S_{k-1}]} \frac{g(t)^2}{h(t)} h(s)\ ds dt


Y_2 := \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt.

We now focus on Y1. It is only non-zero when S_{k-1} \leq r-T. Bounding h(s) \leq \frac{1}{(k-1)s}, we see that

Y_1 \leq \frac{k}{(k-1) m_2} {\mathbf E} \int_0^T \frac{g(t)^2}{h(t)}\ dt \times \log_+ \frac{r-S_{k-1}}{T}

where log + (x) is equal to logx when x \geq 1 and zero otherwise. We can rewrite this as

Y_1 \leq \frac{k}{k-1} {\mathbf E} 1_{S_k \leq r} \frac{1}{h(X_k)} \log_+ \frac{r-S_{k-1}}{T}.

We write \frac{1}{h(X_k)} = r-S_k + kX_k and \frac{r-S_{k-1}}{T} = \frac{r-S_k}{T} + \frac{X_k}{T}. Using the bound \log_+(x+y) \leq \log_+(x) + \log_+(1+y) we have

\log_+ \frac{r-S_{k-1}}{T} \leq \log_+ \frac{r-S_{k}}{T} + \log(1 + \frac{X_k}{T})

and thus (bounding \log(1+\frac{X_k}{T}) \leq \frac{X_k}{T}).

Y_1 \leq \frac{k}{k-1} {\mathbf E} (r-S_k + kX_k) \log_+ \frac{r-S_{k}}{T} + (r-S_k)_+ \frac{X_k}{T} + k X_k \log(1+\frac{X_k}{T}).

Symmetrising, we conclude that

Y_1 \leq \frac{k}{k-1} (Z_1 + Z_2 + Z_3)


Z_1 := {\mathbf E} r \log_+ \frac{r-S_{k}}{T}
Z_2 := {\mathbf E} (r-S_k)_+ \frac{S_k}{kT}
Z_3 := m_2^{-1} \int_0^T kt \log(1 + \frac{t}{T}) g(t)^2\ dt.

For Z2, which is a tiny term, we use the crude bound

Z_2 \leq \frac{r^2}{4kT}.

For Z1, we use the bound

\log_+ x \leq \frac{(x+2a\log a - a)^2}{4a^2 \log a}

valid for any a > 1, which can be verified because the LHS is concave for x \geq 1, while the RHS is convex and is tangent to the LHS as x=a. We then have

\log_+ \frac{r-S_{k}}{T} \leq \frac{(r-S_k+2aT\log a-aT)^2}{4a^2 T^2\log a}

and thus

Z_1 \leq r (\frac{(r-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2 \log a} )


 \mu := m_2^{-1} \int_0^T t g(t)^2\ dt
 \sigma^2 := m_2^{-1} \int_0^T t^2 g(t)^2\ dt - \mu^2.

A good choice for a = a[r] here is a = \frac{r-k\mu}{T} (assuming 1-k\mu \geq T), in which case the formula simplifies to

Z_1 \leq r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a})

Thus far, our arguments have been valid for arbitrary functions g. We now specialise to functions of the form

 g(t) := \frac{1}{c+(k-1)t}.

Note the identity

\displaystyle g(t) - h(t) = (r - S_{k-1} - c) g(t) h(t)

on [0,min(rSk − 1,T)]. Thus

Y_2 = \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{((g-h)(s) h(t)-(g-h)(t) h(s))^2}{h(s) h(t)}\ ds dt
= \frac{k}{2 m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} (g(s)-g(t))^2 h(s) h(t)\ ds dt.

Bounding (g(s)-g(t))^2 \leq g(s)^2+g(t)^2 and using symmetry, we conclude

Y_2 \leq \frac{k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h(s) h(t)\ ds dt.

Since \int_0^{r-S_{k-1}} h(t)\ dt = \frac{\log k}{k-1}, we conclude that

Y_2 \leq \frac{k}{k-1} Z_4

where Z4 = Z4[r] is the quantity

Z_4 := \frac{\log k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h_r(s)\ ds.

Putting all this together, we have

 r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{k-1} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a}) + \frac{r^2}{4kT} + Z_3 + Z_4[r] ).

At this point we encounter a technical problem that Z4 diverges logarithmically (up to a cap of logk) as Sk − 1 approaches r. To deal with this issue we average in r, and specifically over the interval [1,1 + τ]. One can calculate that

 \int_0^1 (1+u\tau) 1_{x > 1+u\tau}\ du \leq (1+\tau/2) \frac{(x-k\mu)^2}{(1+\tau-k\mu)^2}

for all x if we have 1-k\mu \geq \tau (because the two expressions touch at x = 1 + τ, with the RHS being convex with slope at least (1 + τ / 2) / τ there. and the LHS lying underneath this tangent line). Assuming this, we conclude that

 \int_0^1 (1+u\tau) {\mathbf P} ( S_k \leq 1+u\tau )\ du \leq (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1+\tau-k\mu)^2})

provided that kμ < 1 − τ, and hence

 \Delta_k (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1+\tau-k\mu)^2}) \leq \frac{k}{k-1} ( \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4a^2 T^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + \int_0^1 Z_4[1+u\tau]\ du ).

Now we deal with the Z4 integral. We split into two contributions, depending on whether 1+u\tau-S_{k-1} \leq 2c or not. If 1+u\tau-S_{k-1}\leq 2c, then we may bound

 (1+u\tau-S_{k-1}-c)^2 \leq c^2

when 1+u\tau-S_{k-1} \geq 0, so this portion of \int_0^1 Z_4[1+u\tau]\ du may be bounded by

 \frac{\log k}{m_2} c^2 {\mathbf E} \int_{[0,T]} g(s)^2 (\int_0^1 h_{1+u\tau}(s) 1_{1+a\tau-S_{k-1} \geq s}\ du)\ ds.

Observe that

\int_0^1 h_{1+u\tau}(s) 1_{1+u\tau-S_{k-1} \geq s}\ du = \int_0^1 \frac{du}{1-S_{k-1}+u\tau+(k-1)s} 1_{1-S_{k-1}+u\tau \geq s}
 = \frac{1}{\tau} \int_{[\max(ks, 1-S_{k-1}+(k-1)s), 1-S_{k-1}+\tau+(k-1)s]} \frac{du}{u}
 \leq \frac{1}{\tau} \log \frac{ks+\tau}{ks}

and so this portion of \int_0^1 Z_4[1+a\tau]\ da is bounded by



W := m_2^{-1} \int_{[0,T)]} g(s)^2 \log(1+\frac{\tau}{ks})\ ds.
X := \frac{\log k}{\tau} c^2

As for the portion when 1 + uτ − Sk − 1 > 2c, we bound h_{1+u\tau}(s) \leq \frac{1}{2c+(k-1)t}, and so this portion of \int_0^1 Z_4[1+u\tau]\ du may be bounded by

\int_0^1 \frac{\log k}{c} {\mathbf E} (1 + u\tau - S_{k-1} - c)^2 V\ du
= VU


V := \frac{c}{m_2} \int_0^T \frac{g(t)^2}{2c + (k-1)t}\ dt.
U := \frac{\log k}{c} \int_0^1 (1 + u\tau - (k-1)\mu - c)^2+ (k-1)\sigma^2\ du

We thus arrive at the final bound

 \Delta_k \leq \frac{k}{k-1} \frac{ \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + W X + VU}{ (1+\tau/2)(1 - \frac{k\sigma^2}{(1+\tau-k\mu)^2}}

provided that kμ < 1 − τ and the denominator is positive.

Asymptotic analysis

We work in the asymptotic regime k \to \infty. Setting

c:= \frac{1}{\log k} + \frac{\alpha}{\log^2 k}
T:= \frac{\beta}{\log k}
\tau := \frac{\gamma}{\log k}

for some absolute constants \alpha \in {\mathbf R} and β,γ > 0, one calculates

1-k\mu= \frac{\alpha+\log\beta- 1+ o(1)}{\log k}
k\sigma^2 = \frac{\beta+o(1)}{\log^2 k}

and so the constraint 1 − kμ > τ becomes

\alpha + \log \beta - 1 \geq \gamma.

With r =1 + \frac{u\gamma}{\log k} for 0 \leq u \leq 1, one has

 a = \frac{\alpha+\log \beta-1+u \gamma +o(1)}{\beta}
 Z_1 \leq \log a + \frac{1}{4\beta a \log^2 a}

and one also calculates

Z2,Z3 = o(1)
W = \int_0^\infty \frac{1}{(1+t)^2} \log(1+\frac{\gamma}{t})\ dt +o(1)
 = \frac{\gamma \log(\gamma)}{\gamma-1} + o(1)
X = \frac{1}{\gamma} + o(1)
 V = \int_0^1\frac{dt}{(1+t)^2 (2+t)} + o(1) = 1 - \log(2) + o(1)
 U = \int_0^1 (u\gamma  +\alpha+\log \beta - 2)^2\ du + \beta + o(1)
 = \frac{(\gamma+\alpha+\log\beta -2)^3 - (\alpha+\log\beta-2)^3}{3}+ \beta + o(1)

and hence

\Delta \leq \frac{ \int_0^1 (\log a(u) + \frac{1}{4\beta a(u) \log^2 a(u)})\ du + \frac{\log(\gamma)}{\gamma-1} + \frac{1-\log(2)}{3} ((\gamma+\alpha+\log\beta -2)^3 - (\alpha+\log\beta-2)^3+3\beta) }{1 - \frac{\beta}{(\alpha+\log \beta-1+\gamma)^2} }+ o(1)

assuming that α + logβ − 1 > γ and

 1 - \frac{\beta}{(\alpha+\log \beta-1+\gamma)^2} > 0,

and where a(u):=\frac{\alpha+\log \beta-1+u \gamma}{\beta}.

In particular, by setting α,β,γ as absolute constants obeying these constraints, we have \Delta \leq O(1), and so

Mk = logk + O(1).

If we set c = τ: = 1 / logk, with T a small multiple of c, then \mu \approx \frac{1}{k}(1 - \frac{A}{\log k}) for a large absolute constant A, and σ2 is a small multiple of \frac{1}{k \log^2 k}. This makes the denominator comparable to 1; one can check that all the terms in the numerator are O(1), finally giving the bound

Δk = O(1)

and thus we have the lower bound

M_k \geq\frac{k}{k-1} \log k - O(1) = \log k - O(1).

More general variational problems

It appears that for the purposes of establish DHL type theorems, one can increase the range of F in which one is taking suprema over (and extending the range of integration in the definition of J_k^{(m)}(F) accordingly). Firstly, one can enlarge the simplex {\mathcal R}_k to the larger region

{\mathcal R}'_k = \{ (t_1,\ldots,t_k) \in [0,1]^k: t_1+\ldots+t_k \leq 1 + \min(t_1,\ldots,t_k) \}

provided that one works with a generalisation of EH[θ] which controls more general Dirichlet convolutions than the von Mangoldt function (a precise assertion in this regard may be found in BFI). In fact one should be able to work in any larger region R for which

R + R \subset \{ (t_1,\ldots,t_k) \in [0,2/\theta]^k: t_1+\ldots+t_k \leq 2 + \max(t_1,\ldots,t_k) \} \cup \frac{2}{\theta} \cdot {\mathcal R}_k

provided that all the marginal distributions of F are supported on {\mathcal R}_{k-1}, thus (assuming F is symmetric)

\int_0^\infty F(t_1,\ldots,t_{k-1},t_k)\ dt_k = 0 when t_1+\ldots+t_{k-1} > 1.

For instance, one can take R = \frac{1}{\theta} \cdot {\mathcal R}_k, or one can take R = \{ (t_1,\ldots,t_k) \in [0,1/\theta]^k: t_1 +\ldots +t_{k-1} \leq 1 \} (although the latter option breaks the symmetry for F). Perhaps other choices are also possible.

World records

k Mk M'k M''k
Lower Upper Lower Upper Lower Upper
2 1.38593... 1.38593... 2 2 2 2
3 1.646 1.648 1.842 2.080 1.914 2.38593...
4 1.845 1.848 1.937 2.198 1.951 2.648
5 2.001162 2.011797 2.059 2.311 2.848
10 2.53 2.55842 2.7466
20 3.05 3.1534 3.2716
30 3.34 3.51848 3.6079
40 3.52 3.783467 3.8564
50 3.66 3.99186 4.0540
55 4.0051689 4.0816 4.1396
59 4.06 4.1478399 4.2030

For k>2, all upper bounds on Mk come from (1). Upper bounds on M'k come from the inequality M'_k \leq \frac{k}{k-1} M_{k-1} that follows from an averaging argument, and upper bounds on M''k (on EH, using the prism \{ t_1+\ldots+t_{k-1},t_k \leq 1\} as the domain) come from the inequality M''_k \leq M_{k-1} + 1 by comparing M''k with a variational problem on the prism (details here).

For k=2, M'2 = M''2 = 2 can be computed exactly by taking F to be the indicator function of the unit square (for the lower bound), and by using Cauchy-Schwarz (for the upper bound). M_2=\frac{1}{1-W(1/e)} \approx 1.38593 can be computed exactly as the solution to the equation 2-\frac{1}{x} + \log(1-\frac{1}{x}) = 0.

Personal tools