Selberg sieve variational problem

Revision as of 16:38, 19 December 2013

Let [math]M_k[/math] be the quantity

[math]\displaystyle M_k := \sup_F \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)}[/math]

where [math]F[/math] ranges over square-integrable functions on the simplex

[math]\displaystyle {\mathcal R}_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\ldots+t_k \leq 1 \}[/math]

with [math]I_k, J_k^{(m)}[/math] being the quadratic forms

[math]\displaystyle I_k(F) := \int_{{\mathcal R}_k} F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k[/math]

and

[math]\displaystyle J_k^{(m)}(F) := \int_{{\mathcal R}_{k-1}} (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2 dt_1 \ldots dt_{m-1} dt_{m+1} \ldots dt_k.[/math]

It is known that [math]DHL[k,m+1][/math] holds whenever [math]EH[\theta][/math] holds and [math]M_k \gt \frac{2m}{\theta}[/math]. Thus for instance, [math]M_k \gt 2[/math] implies [math]DHL[k,2][/math] on the Elliott-Halberstam conjecture, and [math]M_k\gt4[/math] implies [math]DHL[k,2][/math] unconditionally.

Upper bounds

We have the upper bound

[math]\displaystyle M_k \leq \frac{k}{k-1} \log k[/math] (1)

that is proven as follows.

The key estimate is

[math] \displaystyle (\int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)\ dt_1)^2 \leq \frac{\log k}{k-1} \int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)^2 (1 - t_1-\ldots-t_k+ kt_1)\ dt_1.[/math] (2)

Assuming this estimate, we may integrate in [math]t_2,\ldots,t_k[/math] to conclude that

[math]\displaystyle J_k^{(1)}(F) \leq \frac{\log k}{k-1} \int F^2 (1-t_1-\ldots-t_k+kt_1)\ dt_1 \ldots dt_k[/math]

which symmetrises to

[math]\sum_{m=1}^k J_k^{(m)}(F) \leq k \frac{\log k}{k-1} \int F^2\ dt_1 \ldots dt_k[/math]

giving the desired upper bound (1).

It remains to prove (2). By Cauchy-Schwarz, it suffices to show that

[math]\displaystyle \int_0^{1-t_2-\ldots-t_k} \frac{dt_1}{1 - t_1-\ldots-t_k+ kt_1} \leq \frac{\log k}{k-1}.[/math]

But writing [math]s = t_2+\ldots+t_k[/math], the left-hand side evaluates to

[math]\frac{1}{k-1} (\log(k(1-s)) - \log(1-s) ) = \frac{\log k}{k-1}[/math]

as required.
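As a quick numerical sanity check of this evaluation, one can compare a midpoint-rule approximation of the left-hand side against [math]\log k / (k-1)[/math] for a few sample values of k and s (the sample values, grid size and tolerance below are arbitrary choices):

```python
import math

# Midpoint-rule check (grid size arbitrary) that
#   \int_0^{1-s} dt / (1 - s + (k-1) t)  =  (log k) / (k - 1).
def lhs(k, s, n=50000):
    h = (1.0 - s) / n
    return sum(h / (1.0 - s + (k - 1) * (i + 0.5) * h) for i in range(n))

for k in (2, 5, 50):
    for s in (0.0, 0.3, 0.9):
        assert abs(lhs(k, s) - math.log(k) / (k - 1)) < 1e-6
```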

Lower bounds

We will need some parameters [math]c, T, \tau \gt 0[/math] and [math]a \gt 1[/math] to be chosen later (in practice we take c close to [math]1/\log k[/math], T a small multiple of c, [math]\tau[/math] a small multiple of c/k, and [math]a[/math] a large absolute constant).

For any symmetric function F on the simplex [math]{\mathcal R}_k[/math], one has

[math]J_k^{(1)}(F) \leq \frac{M_k}{k} I_k(F)[/math]

and so by scaling, if F is a symmetric function on the dilated simplex [math]r \cdot {\mathcal R}_k[/math], one has

[math]J_k^{(1)}(F) \leq \frac{r M_k}{k} I_k(F)[/math]

after adjusting the definition of the functionals [math]I_k, J_k^{(1)}[/math] suitably for this rescaled simplex.

Now let us apply this inequality, with r in the interval [math][1,1+\tau][/math], to truncated tensor product functions

[math]F(t_1,\ldots,t_k) = 1_{t_1+\ldots+t_k\leq r} \prod_{i=1}^k m_2^{-1/2} g(t_i)[/math]

for some bounded measurable [math]g: [0,T] \to {\mathbf R}[/math], not identically zero, with [math]m_2 := \int_0^T g(t)^2\ dt[/math]. We have the probabilistic interpretations

[math]J_k^{(1)}(F) = m_2^{-1} {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2[/math]

and

[math]I_k(F) = m_2^{-1} {\mathbf E} \int_{[0,r - S_{k-1}]} g(t)^2\ dt[/math]
[math] = {\mathbf P} (S_k \leq r)[/math]

where [math]S_{k-1} := X_1 + \ldots + X_{k-1}[/math], [math]S_k := X_1 + \ldots + X_k[/math] and [math]X_1,\ldots,X_k[/math] are iid random variables in [0,T] with law [math]m_2^{-1} g(t)^2\ dt[/math], and we adopt the convention that [math]\displaystyle \int_{[a,b]} f[/math] vanishes when [math]b \lt a[/math]. We thus have

[math] {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 \leq \frac{r M_k}{k} m_2 {\mathbf P} ( S_k \leq r ) [/math] (*)

for any r.

We now introduce the random function [math]h = h_r[/math] by

[math] h(t) := \frac{1}{r - S_{k-1} + (k-1) t} 1_{S_{k-1} \lt r}.[/math]

Observe that if [math]S_{k-1} \lt r[/math], then

[math] \int_{[0, r-S_{k-1}]} h(t)\ dt = \frac{\log k}{k-1}[/math]

and hence by the Legendre identity

[math]( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 = \frac{\log k}{k-1} \int_{[0, r - S_{k-1}]} \frac{g(t)^2}{h(t)}\ dt - \frac{1}{2} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt.[/math]

We also note that (using the iid nature of the [math]X_i[/math] to symmetrise)

[math] {\mathbf E} \int_{[0, r - S_{k-1}]} g(t)^2/h(t)\ dt = m_2 {\mathbf E} 1_{S_k \leq r} / h( X_k ) [/math]
[math] = m_2 {\mathbf E} 1_{S_k \leq r} (r - X_1 - \ldots - X_k + k X_k ) [/math]
[math] = r m_2 {\mathbf E} 1_{S_k \leq r} [/math]
[math] = r m_2 {\mathbf P}( S_k \leq r ).[/math]
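The symmetrisation step here (replacing [math]k X_k[/math] by [math]S_k[/math] inside the expectation) can be checked exactly by enumeration, substituting a small hypothetical discrete law for the continuous one; the atoms, weights, and the values of k and r below are arbitrary choices:

```python
from fractions import Fraction as F
from itertools import product

# Exact enumeration check of the symmetrisation identity
#   E[ 1_{S_k <= r} (r - S_k + k X_k) ] = r P(S_k <= r)
# for iid X_i, using a small discrete stand-in law (atoms/weights arbitrary).
atoms = [(F(0), F(1, 2)), (F(1, 4), F(1, 3)), (F(1, 2), F(1, 6))]
k, r = 3, F(3, 4)

lhs = F(0)
prob = F(0)
for outcome in product(atoms, repeat=k):
    xs = [x for x, _ in outcome]
    p = F(1)
    for _, w in outcome:
        p *= w
    s = sum(xs)
    if s <= r:
        lhs += p * (r - s + k * xs[-1])
        prob += p

assert lhs == r * prob        # exact equality in rational arithmetic
assert 0 < prob < 1           # the event is non-trivial
```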

Inserting these bounds into (*) and rearranging, we conclude that

[math] r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{2m_2} {\mathbf E} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt[/math]

where [math]\Delta_k := \frac{k}{k-1} \log k - M_k[/math] is the defect from the upper bound. Splitting the integrand into regions where s or t is larger than or less than T, we obtain

[math] r \Delta_k {\mathbf P} ( S_k \leq r ) \leq Y_1 + Y_2[/math]

where

[math]Y_1 := \frac{k}{m_2} {\mathbf E} \int_{[0,T]} \int_{[T,r-S_{k-1}]} \frac{g(t)^2}{h(t)} h(s)\ ds dt[/math]

and

[math]Y_2 := \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt.[/math]

We now focus on [math]Y_1[/math]. It is only non-zero when [math]S_{k-1} \leq r-T[/math]. Bounding [math]h(s) \leq \frac{1}{(k-1)s}[/math], we see that

[math]Y_1 \leq \frac{k}{(k-1) m_2} {\mathbf E} \int_0^T \frac{g(t)^2}{h(t)}\ dt \times \log_+ \frac{r-S_{k-1}}{T}[/math]

where [math]\log_+(x)[/math] is equal to [math]\log x[/math] when [math]x \geq 1[/math] and zero otherwise. We can rewrite this as

[math]Y_1 \leq \frac{k}{k-1} {\mathbf E} 1_{S_k \leq r} \frac{1}{h(X_k)} \log_+ \frac{r-S_{k-1}}{T}.[/math]

We write [math]\frac{1}{h(X_k)} = r-S_k + kX_k[/math] and [math]\frac{r-S_{k-1}}{T} = \frac{r-S_k}{T} + \frac{X_k}{T}[/math]. Using the bound [math]\log_+(x+y) \leq \log_+(x) + \log_+(1+y)[/math] we have

[math]\log_+ \frac{r-S_{k-1}}{T} \leq \log_+ \frac{r-S_{k}}{T} + \log(1 + \frac{X_k}{T})[/math]

and thus (bounding [math]\log(1+\frac{X_k}{T}) \leq \frac{X_k}{T}[/math] in the cross term)

[math]Y_1 \leq \frac{k}{k-1} {\mathbf E} [ (r-S_k + kX_k) \log_+ \frac{r-S_{k}}{T} + (r-S_k)_+ \frac{X_k}{T} + k X_k \log(1+\frac{X_k}{T}) ][/math].
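The elementary inequality [math]\log_+(x+y) \leq \log_+(x) + \log(1+y)[/math] used above can be spot-checked on a grid (grid and tolerance are arbitrary choices):

```python
import math

def log_plus(x):
    # log_+(x) = log x for x >= 1, and 0 otherwise
    return math.log(x) if x >= 1.0 else 0.0

# Grid spot-check of  log_+(x+y) <= log_+(x) + log(1+y)  for x, y >= 0.
for i in range(401):
    x = 0.025 * i          # x in [0, 10]
    for j in range(401):
        y = 0.025 * j      # y in [0, 10]
        assert log_plus(x + y) <= log_plus(x) + math.log(1.0 + y) + 1e-12
```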

Symmetrising, we conclude that

[math]Y_1 \leq \frac{k}{k-1} (Z_1 + Z_2 + Z_3)[/math]

where

[math]Z_1 := {\mathbf E} r \log_+ \frac{r-S_{k}}{T}[/math]
[math]Z_2 := {\mathbf E} (r-S_k)_+ \frac{S_k}{kT}[/math]
[math]Z_3 := m_2^{-1} \int_0^T kt \log(1 + \frac{t}{T}) g(t)^2\ dt.[/math]

For [math]Z_2[/math], which is a tiny term, we use the crude bound

[math]Z_2 \leq \frac{r^2}{4kT}.[/math]
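This crude bound just reflects the elementary inequality [math](r-s)_+ s \leq r^2/4[/math]; a grid spot-check (sample values arbitrary):

```python
# Z_2 = E (r - S_k)_+ S_k / (kT) is bounded via (r-s)_+ s <= r^2/4;
# grid spot-check of that elementary inequality (grid is an arbitrary choice).
for ri in range(1, 21):
    r = 0.1 * ri
    for si in range(501):
        s = 0.01 * si
        assert max(r - s, 0.0) * s <= r * r / 4.0 + 1e-12
```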

For [math]Z_1[/math], we use the bound

[math]\log_+ x \leq \frac{(x+2a\log a - a)^2}{4a^2 \log a}[/math]

which can be verified because the LHS is concave for [math]x \geq 1[/math], while the RHS is convex and is tangent to the LHS at x=a. We then have

[math]\log_+ \frac{r-S_{k}}{T} \leq \frac{(r-S_k+2aT\log a-aT)^2}{4a^2 T^2\log a}[/math]

and thus

[math]Z_1 \leq r \frac{(r-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2 \log a}[/math]

where

[math] \mu := m_2^{-1} \int_0^T t g(t)^2\ dt [/math]
[math] \sigma^2 := m_2^{-1} \int_0^T t^2 g(t)^2\ dt - \mu^2. [/math]
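The tangency claim can be checked numerically: the parabola [math]x \mapsto (x+2a\log a - a)^2/(4a^2 \log a)[/math] has the same value ([math]\log a[/math]) and slope ([math]1/a[/math]) as [math]\log x[/math] at x=a, and dominates [math]\log_+[/math]. The sample values of a, grid, and tolerances below are arbitrary choices:

```python
import math

def parabola(x, a):
    # (x + 2a log a - a)^2 / (4 a^2 log a)
    return (x + 2.0 * a * math.log(a) - a) ** 2 / (4.0 * a * a * math.log(a))

def log_plus(x):
    return math.log(x) if x >= 1.0 else 0.0

for a in (2.0, 5.0, 20.0):
    # tangency at x = a: same value (log a) and same slope (1/a)
    assert abs(parabola(a, a) - math.log(a)) < 1e-9
    eps = 1e-6
    slope = (parabola(a + eps, a) - parabola(a - eps, a)) / (2.0 * eps)
    assert abs(slope - 1.0 / a) < 1e-6
    # domination of log_+ on a grid
    for i in range(5001):
        x = 0.02 * i       # x in [0, 100]
        assert log_plus(x) <= parabola(x, a) + 1e-12
```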

Thus far, our arguments have been valid for arbitrary functions [math]g[/math]. We now specialise to functions of the form

[math] g(t) := \frac{1}{c+(k-1)t}.[/math]

Note the identity

[math]\displaystyle g(t) - h(t) = (r - S_{k-1} - c) g(t) h(t)[/math]

on [math][0, \min(r-S_{k-1},T)][/math]. Thus

[math]Y_2 = \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{((g-h)(s) h(t)-(g-h)(t) h(s))^2}{h(s) h(t)}\ ds dt[/math]
[math]= \frac{k}{2 m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} (g(s)-g(t))^2 h(s) h(t)\ ds dt.[/math]
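The identity can be spot-checked numerically; all parameter values below (and the stand-in values for [math]S_{k-1}[/math]) are hypothetical choices:

```python
# Spot-check of the identity  g(t) - h(t) = (r - S_{k-1} - c) g(t) h(t)
# on [0, min(r - S_{k-1}, T)]; all parameter values below are hypothetical.
k, c, r, T = 10, 0.5, 1.2, 0.1
for S in (0.0, 0.4, 0.9):              # stand-ins for S_{k-1} < r
    top = min(r - S, T)
    for i in range(1001):
        t = top * i / 1000.0
        g = 1.0 / (c + (k - 1) * t)
        h = 1.0 / (r - S + (k - 1) * t)
        assert abs((g - h) - (r - S - c) * g * h) < 1e-12
```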

Bounding [math](g(s)-g(t))^2 \leq g(s)^2+g(t)^2[/math] and using symmetry, we conclude

[math]Y_2 \leq \frac{k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h(s) h(t)\ ds dt.[/math]

Since [math]\int_0^{r-S_{k-1}} h(t)\ dt = \frac{\log k}{k-1}[/math], we conclude that

[math]Y_2 \leq \frac{k}{k-1} Z_4[/math]

where [math]Z_4 = Z_4[r][/math] is the quantity

[math]Z_4 := \frac{\log k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h_r(s)\ ds.[/math]

Putting all this together, we have

[math] r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{k-1} (r \frac{(r-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2\log a} + \frac{r^2}{4kT} + Z_3 + Z_4[r] ).[/math]

At this point we encounter a technical problem that [math]Z_4[/math] diverges logarithmically (up to a cap of [math]\log k[/math]) as [math]S_{k-1}[/math] approaches r. To deal with this issue we average in r, and specifically over the interval [math][1,1+\tau][/math]. Using the slightly crude bound

[math] \int_0^1 (1+b\tau) 1_{x \gt 1+b\tau}\ db \leq \frac{1+\tau/2}{(1-k\mu)^2} (x-k\mu)^2[/math]

for all x, we conclude that

[math] \int_0^1 (1+b\tau) {\mathbf P} ( S_k \leq 1+b\tau )\ db \geq (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1-k\mu)^2})[/math]

provided that [math]k \mu \lt 1[/math], and hence

[math] \Delta_k (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1-k\mu)^2}) \leq \frac{k}{k-1} ( (1+\frac{\tau}{2}) \frac{(1+\tau-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2 \log a} + \frac{1+\tau+\frac{\tau^2}{3}}{4kT} + Z_3 + \int_0^1 Z_4[1+b\tau]\ db ).[/math]
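The averaged-indicator bound used in this step can be spot-checked via its closed form; the values of [math]\tau[/math] and [math]k\mu[/math] below are hypothetical (subject to [math]k\mu \lt 1[/math]), and the grid is arbitrary:

```python
# Spot-check of  \int_0^1 (1 + b*tau) 1_{x > 1 + b*tau} db
#            <=  (1 + tau/2) (x - k*mu)^2 / (1 - k*mu)^2
# using the closed form of the left-hand side.
tau, kmu = 0.1, 0.6          # hypothetical values with k*mu < 1

def lhs(x):
    if x <= 1.0:
        return 0.0
    beta = min((x - 1.0) / tau, 1.0)
    return beta + tau * beta * beta / 2.0

def rhs(x):
    return (1.0 + tau / 2.0) * (x - kmu) ** 2 / (1.0 - kmu) ** 2

for i in range(4001):
    x = 0.001 * i            # x in [0, 4]
    assert lhs(x) <= rhs(x) + 1e-12
```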

If [math]\tau \leq c[/math], we may bound

[math]\int_0^1 Z_4[1+b\tau]\ db \leq \frac{\log k}{m_2} {\mathbf E} ((1 - S_{k-1})^2 + c^2) \int_{[0,T]} g(s)^2 (\int_0^1 h_{1+b\tau}(s) 1_{1+b\tau-S_{k-1} \geq s}\ db)\ ds.[/math]

Observe that

[math]\int_0^1 h_{1+b\tau}(s) 1_{1+b\tau-S_{k-1} \geq s}\ db = \int_0^1 \frac{db}{1-S_{k-1}+b\tau+(k-1)s} 1_{1-S_{k-1}+b\tau \geq s}[/math]
[math] = \frac{1}{\tau} \int_{[\max(ks, 1-S_{k-1}+(k-1)s), 1-S_{k-1}+\tau+(k-1)s]} \frac{du}{u}[/math]
[math] \leq \frac{1}{\tau} \log \frac{ks+\tau}{ks}[/math]

and so

[math]\int_0^1 Z_4[1+b\tau]\ db \leq W \frac{\log k}{\tau} {\mathbf E} ((1 - S_{k-1})^2 + c^2)[/math]
[math]= W \frac{\log k}{\tau} ((1 - (k-1)\mu)^2 + (k-1) \sigma^2 + c^2)[/math]

where

[math]W := m_2^{-1} \int_{[0,T]} g(s)^2 \log(1+\frac{\tau}{ks})\ ds.[/math]
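The averaging computation above can be spot-checked by a midpoint-rule approximation; the values of k, [math]\tau[/math], s, and of [math]1-S_{k-1}[/math] below are arbitrary stand-ins:

```python
import math

# Midpoint-rule spot-check of
#   \int_0^1 1_{x + b*tau >= s} / (x + b*tau + (k-1) s) db
#     <= (1/tau) log(1 + tau/(k s)),
# where x stands for 1 - S_{k-1}; all sample values are arbitrary.
k, tau = 10, 0.05

def avg_h(x, s, n=20000):
    total = 0.0
    for i in range(n):
        b = (i + 0.5) / n
        if x + b * tau >= s:
            total += 1.0 / (x + b * tau + (k - 1) * s)
    return total / n

for x in (-0.02, 0.0, 0.05, 0.5):
    for s in (0.001, 0.01, 0.1):
        assert avg_h(x, s) <= math.log(1.0 + tau / (k * s)) / tau + 1e-2
```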

We thus arrive at the final bound

[math] \Delta_k \leq \frac{k}{k-1} \frac{ (1+\frac{\tau}{2}) \frac{(1+\tau-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2 \log a} + \frac{1+\tau+\frac{\tau^2}{3}}{4kT} + Z_3 + W \frac{\log k}{\tau} ((1 - (k-1)\mu)^2 + (k-1) \sigma^2 + c^2)}{(1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1-k\mu)^2})}[/math]

provided that [math]k \mu \lt 1[/math] and the denominator is positive.

If we set [math]c = \tau := 1/\log k[/math], with [math]T[/math] a small multiple of c, then [math]\mu \approx \frac{1}{k}(1 - \frac{A}{\log k})[/math] for a large absolute constant A, and [math]\sigma^2[/math] is a small multiple of [math]\frac{1}{k \log^2 k}[/math]. This makes the denominator comparable to 1; if we then set [math]a \sim c/T[/math], then one can check that all the terms in the numerator are O(1), finally giving the bound

[math]\Delta_k = O(1)[/math]

and thus we have the lower bound

[math]M_k \geq\frac{k}{k-1} \log k - O(1) = \log k - O(1)[/math].
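As an illustrative numerical check of these parameter choices, one can compute [math]\mu[/math] and [math]\sigma^2[/math] for a hypothetical concrete instantiation (the specific values [math]k = 10^4[/math], [math]c = \tau = 1/\log k[/math], [math]T = c/10[/math] below are our choices, not fixed by the text):

```python
import math

# Hypothetical instantiation: k = 10^4, c = tau = 1/log k, T = c/10,
# with g(t) = 1/(c + (k-1)t) as in the text.
k = 10 ** 4
c = 1.0 / math.log(k)
T = c / 10.0

def moment(p, n=200000):
    # midpoint rule for \int_0^T t^p g(t)^2 dt
    h = T / n
    return sum(h * ((i + 0.5) * h) ** p / (c + (k - 1) * (i + 0.5) * h) ** 2
               for i in range(n))

m2 = moment(0)
mu = moment(1) / m2
sigma2 = moment(2) / m2 - mu * mu

assert 0.0 < k * mu < 1.0                    # needed for the averaging argument
assert k * sigma2 < (1.0 - k * mu) ** 2      # the final denominator is positive
assert k * sigma2 * math.log(k) ** 2 < 1.0   # sigma^2 = O(1/(k log^2 k))
```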

More general variational problems

It appears that for the purposes of establishing DHL type theorems, one can enlarge the class of F over which one takes the supremum (extending the range of integration in the definition of [math]J_k^{(m)}(F)[/math] accordingly). Firstly, one can enlarge the simplex [math]{\mathcal R}_k[/math] to the larger region

[math]{\mathcal R}'_k = \{ (t_1,\ldots,t_k) \in [0,1]^k: t_1+\ldots+t_k \leq 1 + \min(t_1,\ldots,t_k) \}[/math]

provided that one works with a generalisation of [math]EH[\theta][/math] which controls more general Dirichlet convolutions than the von Mangoldt function (a precise assertion in this regard may be found in BFI). In fact one should be able to work in any larger region [math]R[/math] for which

[math]R + R \subset \{ (t_1,\ldots,t_k) \in [0,2/\theta]^k: t_1+\ldots+t_k \leq 2 + \max(t_1,\ldots,t_k) \} \cup \frac{2}{\theta} \cdot {\mathcal R}_k[/math]

provided that all the marginal distributions of F are supported on [math]{\mathcal R}_{k-1}[/math], thus (assuming F is symmetric)

[math]\int_0^\infty F(t_1,\ldots,t_{k-1},t_k)\ dt_k = 0 [/math] when [math]t_1+\ldots+t_{k-1} \gt 1.[/math]

For instance, one can take [math]R = \frac{1}{\theta} \cdot {\mathcal R}_k[/math], or one can take [math]R = \{ (t_1,\ldots,t_k) \in [0,1/\theta]^k: t_1 +\ldots +t_{k-1} \leq 1 \}[/math] (although the latter option breaks the symmetry for F). Perhaps other choices are also possible.

World records

k    M_k Lower   M_k Upper   M'_k Lower   M'_k Upper   M''_k Lower   M''_k Upper
2    1.38593...  1.38593...  2            2            2             2
3    1.646       1.648       1.842        2.080        1.917         2.38593...
4    1.845       1.848       1.937        2.198                      2.648
5    2.001162    2.011797    2.059        2.311                      2.848
10   2.53        2.55842
20   3.05        3.1534
30   3.34        3.51848
40   3.52        3.783466
50   3.66        3.99186
59   4.06        4.1478392

For k>2, all upper bounds on [math]M_k[/math] come from (1). Upper bounds on [math]M'_k[/math] come from the inequality [math]M'_k \leq \frac{k}{k-1} M_{k-1}[/math] that follows from an averaging argument, and upper bounds on [math]M''_k[/math] (on EH, using the prism [math]\{ t_1+\ldots+t_{k-1},t_k \leq 1\}[/math] as the domain) come from the inequality [math]M''_k \leq M_{k-1} + 1[/math] by comparing [math]M''_k[/math] with a variational problem on the prism (details here).
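The tabulated upper bounds can be reproduced from these inequalities; a spot-check against the listed decimals (the comparison values are copied from the table, and the tolerance is an arbitrary choice):

```python
import math

# Upper bounds on M_k in the table come from (1): M_k <= (k/(k-1)) log k.
def m_upper(k):
    return k * math.log(k) / (k - 1)

for k, listed in [(10, 2.55842), (20, 3.1534), (30, 3.51848), (50, 3.99186)]:
    assert abs(m_upper(k) - listed) < 1e-4

# Upper bounds on M'_k come from M'_k <= (k/(k-1)) M_{k-1}; e.g. for k = 3, 4,
# using M_2 ~ 1.38593... and the listed M_3 upper bound 1.648:
assert (3 / 2) * 1.38594 <= 2.080
assert (4 / 3) * 1.648 <= 2.198
```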

For k=2, [math]M'_2=M''_2 = 2[/math] can be computed exactly by taking F to be the indicator function of the unit square (for the lower bound), and by using Cauchy-Schwarz (for the upper bound). [math]M_2=\frac{1}{1-W(1/e)} \approx 1.385933[/math] can be computed exactly as the solution to the equation [math]2-\frac{1}{x} + \log(1-\frac{1}{x}) = 0[/math].
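The value of [math]M_2[/math] can be reproduced by solving the stated equation by bisection and checking consistency with the Lambert-W form (the bracketing interval and tolerances below are arbitrary choices):

```python
import math

# Solve 2 - 1/x + log(1 - 1/x) = 0 by bisection, and check consistency with
# M_2 = 1/(1 - W(1/e)), i.e. w = 1 - 1/x should satisfy w e^w = 1/e.
def f(x):
    return 2.0 - 1.0 / x + math.log(1.0 - 1.0 / x)

lo, hi = 1.2, 1.6            # f(lo) < 0 < f(hi), and f is increasing here
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if f(mid) < 0.0:
        lo = mid
    else:
        hi = mid
x = 0.5 * (lo + hi)

assert abs(x - 1.38593) < 1e-4          # matches the tabulated 1.38593...
w = 1.0 - 1.0 / x
assert abs(w * math.exp(w) - math.exp(-1.0)) < 1e-9
```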