Selberg sieve variational problem: Difference between revisions

From Polymath Wiki
Jump to navigationJump to search
No edit summary
 
(114 intermediate revisions by 2 users not shown)
Line 34: Line 34:
== Lower bounds ==
== Lower bounds ==


...
We will need some parameters <math>c, T, \tau > 0</math> and <math>a > 1</math> to be chosen later (in practice we take c close to <math>1/\log k</math>, T a small multiple of c, and <math>\tau</math> a small multiple of c/k.
 
For any symmetric function F on the simplex <math>{\mathcal R}_k</math>, one has
 
:<math>J_k^{(1)}(F) \leq \frac{M_k}{k} I_k(F)</math>
 
and so by scaling, if F is a symmetric function on the dilated simplex <math>r \cdot {\mathcal R}_k</math>, one has
 
:<math>J_k^{(1)}(F) \leq \frac{r M_k}{k} I_k(F)</math>
 
after adjusting the definition of the functionals <math>I_k, J_k^{(1)}</math> suitably for this rescaled simplex.
 
Now let us apply this inequality r in the interval <math>[1,1+\tau]</math> and to truncated tensor product functions
 
:<math>F(t_1,\ldots,t_k) = 1_{t_1+\ldots+t_k\leq r} \prod_{i=1}^k m_2^{-1/2} g(t_i)</math>
 
for some bounded measurable <math>g: [0,T] \to {\mathbf R}</math>, not identically zero, with <math>m_2 := \int_0^T g(t)^2\ dt</math>.  We have the probabilistic interpretations
 
:<math>J_k^{(1)}(F) := m_2^{-1} {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2</math>
 
and
 
:<math>I(F) := m_2^{-1} {\mathbf E} \int_{[0,r - S_{k-1}]} g(t)^2\ dt</math>
:<math> = {\mathbf P} (S_k \leq r)</math>
 
where <math>S_{k-1} := X_1 + \ldots X_{k-1}</math>, <math>S_k := X_1 + \ldots + X_k</math> and <math>X_1,\ldots,X_k</math> are iid random variables in [0,T] with law <math>m_2^{-1} g(t)^2\ dt</math>, and we adopt the convention that <math>\displaystyle \int_{[a,b]} f</math> vanishes when <math>b < a</math>.  We thus have
 
:<math> {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 \leq \frac{r M_k}{k} m_2 {\mathbf P} ( S_k \leq r ) </math> (*)
 
for any r.
 
We now introduce the random function <math>h = h_r</math> by
 
:<math> h(t) := \frac{1}{r - S_{k-1} + (k-1) t} 1_{S_{k-1} < r}.</math>
 
Observe that if <math>S_{k-1} < r</math>, then
 
:<math> \int_{[0, r-S_{k-1}]} h(t)\ dt = \frac{\log k}{k-1}</math>
 
and hence by the Legendre identity
 
:<math>( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 = \frac{\log k}{k-1} \int_{[0, r - S_{k-1}]} \frac{g(t)^2}{h(t)}\ dt - \frac{1}{2} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt.</math>
 
We also note that (using the iid nature of the <math>X_i</math> to symmetrise)
 
:<math> {\mathbf E} \int_{[0, r - S_{k-1}]} g(t)^2/h(t)\ dt = m_2 {\mathbf E} 1_{S_k \leq r} / h( X_k ) </math>
:<math> = m_2 {\mathbf E} 1_{S_k \leq r} (1 - X_1 - \ldots - X_k + k X_k ) </math>
:<math> = m_2 {\mathbf E} 1_{S_k \leq r} </math>
:<math> = m_2 {\mathbf P}( S_k \leq r ).</math>
 
Inserting these bounds into (*) and rearranging, we conclude that
 
:<math> r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{2m_2} {\mathbf E} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt</math>
 
where <math>\Delta_k := \frac{k}{k-1} \log k - M_k</math> is the defect from the upper bound.  Splitting the integrand into regions where s or t is larger than or less than T, we obtain
 
:<math> r \Delta_k {\mathbf P} ( S_k \leq r ) \leq Y_1 + Y_2</math>
 
where
 
:<math>Y_1 := \frac{k}{m_2} {\mathbf E} \int_{[0,T]} \int_{[T,r-S_{k-1}]} \frac{g(t)^2}{h(t)} h(s)\ ds dt</math>
 
and
 
:<math>Y_2 := \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt.</math>
 
We now focus on <math>Y_1</math>.  It is only non-zero when <math>S_{k-1} \leq r-T</math>.  Bounding <math>h(s) \leq \frac{1}{(k-1)s}</math>, we see that
 
:<math>Y_1 \leq \frac{k}{(k-1) m_2} {\mathbf E} \int_0^T \frac{g(t)^2}{h(t)}\ dt \times \log_+ \frac{r-S_{k-1}}{T}</math>
 
where <math>\log_+(x)</math> is equal to <math>\log x</math> when <math>x \geq 1</math> and zero otherwise.  We can rewrite this as
 
:<math>Y_1 \leq \frac{k}{k-1} {\mathbf E} 1_{S_k \leq r} \frac{1}{h(X_k)} \log_+ \frac{r-S_{k-1}}{T}.</math>
 
We write <math>\frac{1}{h(X_k)} = r-S_k + kX_k</math> and <math>\frac{r-S_{k-1}}{T} = \frac{r-S_k}{T} + \frac{X_k}{T}</math>.  Using the bound <math>\log_+(x+y) \leq \log_+(x) + \log_+(1+y)</math> we have
 
<math>\log_+ \frac{r-S_{k-1}}{T} \leq \log_+ \frac{r-S_{k}}{T} + \log(1 + \frac{X_k}{T})</math>
 
and thus (bounding <math>\log(1+\frac{X_k}{T}) \leq \frac{X_k}{T}</math>).
 
:<math>Y_1 \leq \frac{k}{k-1} {\mathbf E} (r-S_k + kX_k) \log_+ \frac{r-S_{k}}{T} + (r-S_k)_+ \frac{X_k}{T} + k X_k \log(1+\frac{X_k}{T})</math>.
 
Symmetrising, we conclude that
 
:<math>Y_1 \leq \frac{k}{k-1} (Z_1 + Z_2 + Z_3)</math>
 
where
 
:<math>Z_1 := {\mathbf E} r \log_+ \frac{r-S_{k}}{T}</math>
 
:<math>Z_2 := {\mathbf E} (r-S_k)_+ \frac{S_k}{kT}</math>
 
:<math>Z_3 := m_2^{-1} \int_0^T kt \log(1 + \frac{t}{T}) g(t)^2\ dt.</math>
 
For <math>Z_2</math>, which is a tiny term, we use the crude bound
 
:<math>Z_2 \leq \frac{r^2}{4kT}.</math>
 
For <math>Z_1</math>, we use the bound
 
:<math>\log_+ x \leq \frac{(x+2a\log a - a)^2}{4a^2 \log a}</math>
 
valid for any <math>a>1</math>, which can be verified because the LHS is concave for <math>x \geq 1</math>, while the RHS is convex and is tangent to the LHS as x=a.  We then have
 
:<math>\log_+ \frac{r-S_{k}}{T} \leq \frac{(r-S_k+2aT\log a-aT)^2}{4a^2 T^2\log a}</math>
 
and thus
 
:<math>Z_1 \leq r (\frac{(r-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2 \log a} )</math>
 
where
 
:<math> \mu := m_2^{-1} \int_0^T t g(t)^2\ dt </math>
:<math> \sigma^2 := m_2^{-1} \int_0^T t^2 g(t)^2\ dt - \mu^2. </math>
 
A good choice for <math>a=a[r]</math> here is <math>a = \frac{r-k\mu}{T}</math> (assuming <math>1-k\mu \geq T</math>), in which case the formula simplifies to
 
:<math>Z_1 \leq r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a})</math>
 
Thus far, our arguments have been valid for arbitrary functions <math>g</math>.  We now specialise to functions of the form
 
:<math> g(t) := \frac{1}{c+(k-1)t}.</math>
 
Note the identity
 
:<math>\displaystyle g(t) - h(t) = (r - S_{k-1} - c) g(t) h(t)</math>
 
on <math>[0, \min(r-S_{k-1},T)]</math>.  Thus
 
:<math>Y_2 = \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{((g-h)(s) h(t)-(g-h)(t) h(s))^2}{h(s) h(t)}\ ds dt</math>
 
:<math>= \frac{k}{2 m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} (g(s)-g(t))^2 h(s) h(t)\ ds dt.</math>
 
Bounding <math>(g(s)-g(t))^2 \leq g(s)^2+g(t)^2</math> and using symmetry, we conclude
 
:<math>Y_2 \leq \frac{k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h(s) h(t)\ ds dt.</math>
 
Since <math>\int_0^{r-S_{k-1}} h(t)\ dt = \frac{\log k}{k-1}</math>, we conclude that
 
:<math>Y_2 \leq \frac{k}{k-1} Z_4</math>
 
where <math>Z_4 = Z_4[r]</math> is the quantity
 
:<math>Z_4 := \frac{\log k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h_r(s)\ ds.</math>
 
Putting all this together, we have
 
:<math> r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{k-1} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a}) + \frac{r^2}{4kT} + Z_3 + Z_4[r] ).</math>
 
At this point we encounter a technical problem that <math>Z_4</math> diverges logarithmically (up to a cap of <math>\log k</math>) as <math>S_{k-1}</math> approaches r.  To deal with this issue we average in r, and specifically over the interval <math>[1,1+\tau]</math>.  One can calculate that
 
:<math> \int_0^1 (1+u\tau) 1_{x > 1+u\tau}\ du \leq (1+\tau/2) \frac{(x-k\mu)^2}{(1+\tau-k\mu)^2}</math>
 
for all x if we have <math>1-k\mu \geq \tau</math> (because the two expressions touch at <math>x=1+\tau</math>, with the RHS being convex with slope at least <math>(1+\tau/2)/\tau</math> there. and the LHS lying underneath this tangent line).  Assuming this, we conclude that
 
:<math> \int_0^1 (1+u\tau) {\mathbf P} ( S_k \leq 1+u\tau )\ du \leq (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1+\tau-k\mu)^2})</math>
 
provided that <math>k \mu < 1 - \tau</math>, and hence
 
:<math> \Delta_k (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1+\tau-k\mu)^2}) \leq \frac{k}{k-1} ( \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4a^2 T^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + \int_0^1 Z_4[1+u\tau]\ du ).</math>
 
Now we deal with the <math>Z_4</math> integral. We split into two contributions, depending on whether <math>1+u\tau-S_{k-1} \leq 2c</math> or not.  If <math>1+u\tau-S_{k-1}\leq 2c</math>, then we may bound
:<math> (1+u\tau-S_{k-1}-c)^2 \leq c^2</math>
 
when <math>1+u\tau-S_{k-1} \geq 0</math>, so this portion of <math>\int_0^1 Z_4[1+u\tau]\ du</math> may be bounded by
 
:<math> \frac{\log k}{m_2} c^2 {\mathbf E} \int_{[0,T]} g(s)^2 (\int_0^1 h_{1+u\tau}(s) 1_{1+a\tau-S_{k-1} \geq s}\ du)\ ds.</math>
 
Observe that
 
:<math>\int_0^1 h_{1+u\tau}(s) 1_{1+u\tau-S_{k-1} \geq s}\ du = \int_0^1 \frac{du}{1-S_{k-1}+u\tau+(k-1)s} 1_{1-S_{k-1}+u\tau \geq s}</math>
:<math> = \frac{1}{\tau} \int_{[\max(ks, 1-S_{k-1}+(k-1)s), 1-S_{k-1}+\tau+(k-1)s]} \frac{du}{u}</math>
:<math> \leq \frac{1}{\tau} \log \frac{ks+\tau}{ks}</math>
 
and so this portion of <math>\int_0^1 Z_4[1+a\tau]\ da</math> is bounded by
:<math>W X</math>
 
where
 
:<math>W := m_2^{-1} \int_{[0,T)]} g(s)^2 \log(1+\frac{\tau}{ks})\ ds.</math>
:<math>X := \frac{\log k}{\tau} c^2</math>
 
As for the portion when <math>1+u\tau-S_{k-1} > 2c</math>, we bound <math>h_{1+u\tau}(s) \leq \frac{1}{2c+(k-1)t}</math>, and so this portion of <math>\int_0^1 Z_4[1+u\tau]\ du</math> may be bounded by
 
:<math>\int_0^1 \frac{\log k}{c} {\mathbf E} (1 + u\tau - S_{k-1} - c)^2 V\ du</math>
:<math>= V U</math>
 
where
 
:<math>V := \frac{c}{m_2} \int_0^T \frac{g(t)^2}{2c + (k-1)t}\ dt.</math>
:<math>U := \frac{\log k}{c} \int_0^1 (1 + u\tau - (k-1)\mu - c)^2+ (k-1)\sigma^2\ du</math>
 
We thus arrive at the final bound
 
:<math> \Delta_k \leq \frac{k}{k-1} \frac{ \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + W X + VU}{ (1+\tau/2)(1 - \frac{k\sigma^2}{(1+\tau-k\mu)^2})}</math>
 
provided that <math>k \mu < 1 - \tau</math> and the denominator is positive.
 
=== Asymptotic analysis ===
 
We work in the asymptotic regime <math>k \to \infty</math>.  Setting
 
:<math>c:= \frac{1}{\log k} + \frac{\alpha}{\log^2 k}</math>
:<math>T:= \frac{\beta}{\log k}</math>
:<math>\tau := \frac{\gamma}{\log k}</math>
 
for some absolute constants <math>\alpha \in {\mathbf R}</math> and <math>\beta,\gamma > 0</math>, one calculates
 
:<math>1-k\mu= \frac{\alpha+\log\beta- 1+ o(1)}{\log k}</math>
:<math>k\sigma^2 = \frac{\beta+o(1)}{\log^2 k}</math>
 
and so the constraint <math>1-k\mu > \tau</math> becomes
 
:<math>\alpha + \log \beta + \gamma \leq 1.</math>
 
With <math>r =1 + \frac{u\gamma}{\log k}</math> for <math>0 \leq u \leq 1</math>, one has
 
:<math> a = \frac{1-\alpha-\log \beta+u \gamma +o(1)}{\beta}</math>
:<math> Z_1 \leq \log a + \frac{1}{4\beta a \log^2 a}</math>
 
and one also calculates
:<math> Z_2, Z_3 = o(1)</math>
:<math>W = \int_0^\infty \frac{1}{(1+t)^2} \log(1+\frac{\gamma}{t})\ dt +o(1)</math>
:<math> = \frac{\gamma \log(\gamma)}{\gamma-1} + o(1)</math>
:<math>X = \frac{1}{\gamma} + o(1)</math>
:<math> V = \int_0^1\frac{dt}{(1+t)^2 (2+t)} + o(1) = 1 - \log(2) + o(1)</math>
:<math> U = \int_0^1 (u\gamma  -\alpha-\log \beta)^2\ du + \beta + o(1)</math>
:<math> = \frac{(\gamma-\alpha-\log\beta)^3 + (\alpha+\log\beta)^3}{3}+ \beta + o(1)</math>
 
and hence
 
:<math>\Delta \leq \frac{ \int_0^1 (\log a(u) + \frac{1}{4\beta a(u) \log^2 a(u)})\ du + \frac{\log(\gamma)}{\gamma-1} + \frac{1-\log(2)}{3} ((\gamma-\alpha-\log\beta)^3 + (\alpha+\log\beta)^3+3\beta) }{1 - \frac{\beta}{(1-\alpha-\log \beta-\gamma)^2} }+ o(1)</math>
 
assuming that <math>\alpha+\log \beta + \gamma< 1</math> and
:<math> 1 - \frac{\beta}{(1-\alpha-\log \beta-\gamma)^2} > 0,</math>
and where <math>a(u):=\frac{1-\alpha-\log \beta+u \gamma}{\beta}</math>.
 
In particular, by setting <math>\alpha,\beta,\gamma</math> as absolute constants obeying these constraints, we have <math>\Delta \leq O(1)</math>, and so
 
:<math> M_k = \log k+O(1).</math>
 
== More general variational problems ==
 
It appears that for the purposes of establish DHL type theorems, one can increase the range of F in which one is taking suprema over (and extending the range of integration in the definition of <math>J_k^{(m)}(F)</math> accordingly).  Firstly, one can enlarge the simplex <math>{\mathcal R}_k</math> to the larger region
 
:<math>{\mathcal R}'_k = \{ (t_1,\ldots,t_k) \in [0,1]^k: t_1+\ldots+t_k \leq 1 + \min(t_1,\ldots,t_k) \}</math>
 
provided that one works with a generalisation of <math>EH[\theta]</math> which controls more general Dirichlet convolutions than the von Mangoldt function (a precise assertion in this regard may be found in BFI).  In fact (as shown [http://terrytao.wordpress.com/2013/12/20/polymath8b-iv-enlarging-the-sieve-support-more-efficient-numerics-and-explicit-asymptotics/ here]) one can work in any larger region <math>R</math> for which
 
:<math>R + R \subset \{ (t_1,\ldots,t_k) \in [0,2/\theta]^k: t_1+\ldots+t_k \leq 2 + \max(t_1,\ldots,t_k) \} \cup \frac{2}{\theta} \cdot {\mathcal R}_k</math>
 
provided that all the marginal distributions of F are supported on <math>{\mathcal R}_{k-1}</math>, thus (assuming F is symmetric)
 
:<math>\int_0^\infty F(t_1,\ldots,t_{k-1},t_k)\ dt_k = 0 </math> when <math>t_1+\ldots+t_{k-1} > 1.</math>
 
For instance, one can take <math>R = \frac{1}{\theta} \cdot {\mathcal R}_k</math>, or one can take <math>R = \{ (t_1,\ldots,t_k) \in [0,1/\theta]^k: t_1 +\ldots +t_{k-1} \leq 1 \}</math> (although the latter option breaks the symmetry for F).  See [http://terrytao.wordpress.com/2014/01/08/polymath8b-v-stretching-the-sieve-support-further/ this blog post] for more discussion.
 
If the marginal distributions of F are supported in <math>(1+\varepsilon) \cdot {\mathcal R}_{k-1}</math> instead of <math>{\mathcal R}_{k-1}</math>, one still has a usable lower bound in which <math>J_k^{(m)}(F)</math> is replaced by the slightly smaller quantity <math>J_{k,\varepsilon}^{(m)}(F)</math>; see [http://terrytao.wordpress.com/2014/01/08/polymath8b-v-stretching-the-sieve-support-further/ this blog post] for more discussion.


== World records ==
== World records ==


 
{| class="wikitable" border=1
{| class="wikitable"
! <math>k</math>
! <math>k</math>
! Lower bound
! colspan="3" | <math>M_k</math>
! Upper bound
! colspan="2" | <math>M'_k</math>
! colspan="2" | <math>M''_k</math> (prism)
! colspan="2" | <math>M''_k</math> (symmetric)
! colspan="2" | <math>M''_k</math> (non-convex)
! colspan="2" | <math>M_{k,\varepsilon,1/2}</math>
|-
!
! 1D
! Lower  
! Upper  
! Lower
! Upper
! Lower
! Upper
! Lower
! Upper
! Lower
! Upper
! Lower
! Upper
|-
| 2
| 1.383
| [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257360 1.38593...]
| [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257360 1.38593...]
| 2
| 2
| 2
| 2
| 2
| 2
|
|
| [http://terrytao.wordpress.com/2014/01/08/polymath8b-v-stretching-the-sieve-support-further/#comment-263557 1.690608]
| 2
|-
| 3
| 1.635
| [http://users.ugent.be/~ibogaert/KrylovMk/ 1.64644]
| 1.648
| [http://terrytao.wordpress.com/2014/01/08/polymath8b-v-stretching-the-sieve-support-further/#comment-263186 1.8429]
| 2.080
| [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257276 1.914]
| 2.38593...
| [http://terrytao.wordpress.com/2014/01/08/polymath8b-v-stretching-the-sieve-support-further/#comment-264262 1.902626]
| 3
| [http://terrytao.wordpress.com/2014/01/17/polymath8b-vi-a-low-dimensional-variational-problem/#comment-267087 1.847]
| 3
| [http://terrytao.wordpress.com/2014/04/14/polymath8b-x-writing-the-paper-and-chasing-down-loose-ends/#comment-326155 1.91726]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 2.4142]
|-
|-
| 4
| 4
| [http://terrytao.wordpress.com/2013/11/22/polymath8b-ii-optimising-the-variational-problem-and-the-sieve/#comment-253082 1.845]
| 1.820
| [http://users.ugent.be/~ibogaert/KrylovMk/ 1.845407]
| 1.848
| 1.848
| [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257276 1.937]
| 2.198
| [http://terrytao.wordpress.com/2013/12/20/polymath8b-iv-enlarging-the-sieve-support-more-efficient-numerics-and-explicit-asymptotics/#comment-258581 1.951]
| 2.648
| [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257276 1.937]
| 4
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-273973 2.05411]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 2.5946]
|-
|-
| 5
| 5
| [http://arxiv.org/abs/1311.4600 2.001162]  
| 1.965
| [http://users.ugent.be/~ibogaert/KrylovMk/ 2.007145]
| 2.011797
| 2.011797
| [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257360 2.059]
| 2.311
|
| 2.848
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-272523 2.20264]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 2.7466]
|-
|-
| 10
| 10
| [http://terrytao.wordpress.com/2013/11/19/polymath8b-bounded-intervals-with-many-primes-after-maynard/#comment-251896 2.53]
| 2.409
| [http://users.ugent.be/~ibogaert/KrylovMk/ 2.54547]
| 2.55842
| 2.55842
|
| 2.7466
|
|
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-272462 2.68235]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 3.2716]
|-  
|-  
| 20
| 20
| [http://terrytao.wordpress.com/2013/11/19/polymath8b-bounded-intervals-with-many-primes-after-maynard/#comment-251896 3.05]
| 2.810
| [http://users.ugent.be/~ibogaert/KrylovMk/ 3.12756]
| 3.1534
| 3.1534
|
| 3.2716
|
|
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-272462 3.21470]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 3.8564]
|-
|-
| 30
| 30
| [http://terrytao.wordpress.com/2013/11/19/polymath8b-bounded-intervals-with-many-primes-after-maynard/#comment-251896 3.34]
| 3.015
| [http://users.ugent.be/~ibogaert/KrylovMk/ 3.48313]
| 3.51848
| 3.51848
|
| 3.6079
|
|
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-272462 3.51943]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 4.2182]
|-
|-
| 40
| 40
| [http://terrytao.wordpress.com/2013/11/19/polymath8b-bounded-intervals-with-many-primes-after-maynard/#comment-251896 3.52]
| 3.145
| 3.793466
| [http://users.ugent.be/~ibogaert/KrylovMk/ 3.73919]
| 3.783467
|
| 3.8564
|
|
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-272462 3.71480]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 4.4815]
|-
|-
| 50
| 50
| [http://terrytao.wordpress.com/2013/11/19/polymath8b-bounded-intervals-with-many-primes-after-maynard/#comment-251896 3.66]
| 3.236
| [http://users.ugent.be/~ibogaert/KrylovMk/ 3.93586]
| 3.99186
| 3.99186
|
| 4.0540
|
|
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-297456 4.0043]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 4.6889]
|-
| 51
| 3.244
| [http://users.ugent.be/~ibogaert/KrylovMk/ 3.95304]
| 4.0105
|
| 4.0717
|
|
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-272506 4.011910]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 4.7075]
|-
| 53
| 3.259
| [http://users.ugent.be/~ibogaert/KrylovMk/ 3.986213]
| 4.0466
|
| 4.1062
|
|
|
|
|
|
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271862 4.000161]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271610 4.744]
|-
| 54
| 3.266
| [http://users.ugent.be/~ibogaert/KrylovMk/ 4.00223]
| 4.0642
|
| 4.1230
|-
| 55
| 3.273
| [http://users.ugent.be/~ibogaert/KrylovMk/ 4.01788]
| 4.0816
|
| 4.1396
|-
|-
| 59
| 59
| [http://terrytao.wordpress.com/2013/11/22/polymath8b-ii-optimising-the-variational-problem-and-the-sieve/#comment-255681 3.95608]
| 3.299
| 4.1479398
| [http://users.ugent.be/~ibogaert/KrylovMk/ 4.07704]
| 4.1478399
|
| 4.2030
|}
|}


All upper bounds come from (1).
 
{| class="wikitable" border=1
! <math>k</math>
! colspan="1" | <math>M_{k,\varepsilon,1}</math>
! colspan="1" | <math>M'_{k,\varepsilon,1}</math> (sym)
! colspan="1" | <math>M''_{k,\varepsilon,1}</math> (prism)
! colspan="1" | <math>M''_{k,\varepsilon,1}</math> (sym)
! colspan="1" | <math>M''_{k,\varepsilon,1}</math> (non-convex)
! colspan="1" | <math>\hat M_{k,\varepsilon,1}</math>
! colspan="1" | <math>\hat M'_{k,\varepsilon,1}</math> (sym)
! colspan="1" | <math>\hat M''_{k,\varepsilon,1}</math> (sym)
! colspan="1" | <math>\hat M''_{k,\varepsilon,1}</math> (prism)
|-
| 2
| [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257360 1.38593...]
| 2
| 2
| 2
| 2
| [http://terrytao.wordpress.com/2014/01/08/polymath8b-v-stretching-the-sieve-support-further/#comment-263557 1.690608]
| 2
|-
| 3
| [http://terrytao.wordpress.com/2013/12/20/polymath8b-iv-enlarging-the-sieve-support-more-efficient-numerics-and-explicit-asymptotics/#comment-262398 1.8615]
| [http://terrytao.wordpress.com/2014/01/17/polymath8b-vi-a-low-dimensional-variational-problem/#comment-266257 1.956713]
| [http://terrytao.wordpress.com/2014/01/08/polymath8b-v-stretching-the-sieve-support-further/#comment-265050 1.936708]
| [http://terrytao.wordpress.com/2014/01/17/polymath8b-vi-a-low-dimensional-variational-problem/#comment-266724 1.962998]
| [http://terrytao.wordpress.com/2014/01/17/polymath8b-vi-a-low-dimensional-variational-problem/#comments 1.9400]
| [http://terrytao.wordpress.com/2014/04/14/polymath8b-x-writing-the-paper-and-chasing-down-loose-ends/#comment-326155 1.91726]
| [http://terrytao.wordpress.com/2014/01/17/polymath8b-vi-a-low-dimensional-variational-problem/#comment-268443 1.992]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-270725 2.0012]
| [http://terrytao.wordpress.com/2014/02/09/polymath8b-viii-time-to-start-writing-up-the-results/#comment-271881 1.965]
|-
| 4
| [http://terrytao.wordpress.com/2013/12/20/polymath8b-iv-enlarging-the-sieve-support-more-efficient-numerics-and-explicit-asymptotics/#comment-262403 2.0023]
| [http://terrytao.wordpress.com/2013/12/20/polymath8b-iv-enlarging-the-sieve-support-more-efficient-numerics-and-explicit-asymptotics/#comment-262403 2.0023]
| [http://terrytao.wordpress.com/2013/12/20/polymath8b-iv-enlarging-the-sieve-support-more-efficient-numerics-and-explicit-asymptotics/#comment-262403 2.0023]
| [http://terrytao.wordpress.com/2013/12/20/polymath8b-iv-enlarging-the-sieve-support-more-efficient-numerics-and-explicit-asymptotics/#comment-262403 2.0023]
|
| [http://terrytao.wordpress.com/2014/02/21/polymath8b-ix-large-quadratic-programs/#comment-273973 2.05411]
|}
 
 
 
For k>2, all upper bounds on <math>M_k</math> come from (1).  Upper bounds on <math>M'_k</math> come from the inequality <math>M'_k \leq \frac{k}{k-1} M_{k-1}</math> that follows from an averaging argument, and upper bounds on <math>M''_k</math> (on EH, using the prism <math>\{ t_1+\ldots+t_{k-1},t_k \leq 1\}</math> as the domain) come from the inequality <math>M''_k \leq M_{k-1} + 1</math> by comparing <math>M''_k</math> with a variational problem on the prism (details [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257013 here]).  The 1D bound <math> 4k(k-1)/j_{k-2}^2</math> is the optimal value for <math>M_k</math> when the underlying function <math>F</math> is restricted to be of the "one-dimensional" form <math>F(t_1,\ldots,t_k) = f(t_1+\ldots+t_k)</math>.
 
For k=2, <math>M'_2=M''_2 = 2</math> can be computed exactly by taking F to be the indicator function of the unit square (for the lower bound), and by using Cauchy-Schwarz (for the upper bound).  <math>M_2=\frac{1}{1-W(1/e)} \approx 1.38593</math> [http://terrytao.wordpress.com/2013/12/08/polymath8b-iii-numerical-optimisation-of-the-variational-problem-and-a-search-for-new-sieves/#comment-257360 can be computed exactly] as the solution to the equation <math>2-\frac{1}{x} + \log(1-\frac{1}{x}) = 0</math>.
 
The quantity <math>M_{k,\varepsilon,\theta}</math> is defined as the supremum of <math> \sum_{m=1}^k J_{k,\varepsilon}^{(m)}(F) / I_k(F)</math> where F is now supported on <math>[0,1/\theta]^k \cap (1+\varepsilon) \cdot {\mathcal R}_k</math>, and <math>J_{k,\varepsilon}^{(m)}</math> is defined as <math>J_k^{(m)}</math> but now restricted to <math>(1-\varepsilon) \cdot {\mathcal R}_k</math>, and <math>0 \leq \varepsilon \leq \max(1/(k-1),1/\theta-1)</math> is a parameter to be optimised in.  The quantity <math>M'_{k,\varepsilon,\theta}</math> is defined similarly, but with F now supported on <math>[0,1/\theta]^k \cap (1+\varepsilon) \cdot {\mathcal R}'_k \cap \max(\frac{k}{k-1},\frac{1}{\theta}) \cdot {\mathcal R}_k</math>.  Finally, <math>M''_{k,\varepsilon,\theta}</math> is also defined similarly, but with F supported on a region R as above, with all marginals supported on <math>(1+\varepsilon) \cdot {\mathcal R}_{k-1}</math>.
 
The quantities <math>\hat M_{k,\varepsilon,\theta}, \hat M'_{k,\varepsilon,\theta}, \hat M''_{k,\varepsilon,\theta}</math> are defined similarly to <math>M_{k,\varepsilon,\theta}, M'_{k,\varepsilon,\theta}, M''_{k,\varepsilon,\theta}</math>, but with the truncation of <math>R</math> to <math>[0,1/\theta]^k</math> removed.
 
For k=3, we also have the non-convex candidate <math>R = \{ x+y,x+z \leq 1 \} \cup \{ x+y,y+z\leq 1 \} \cup \{ x+z,y+z \leq 1 \}</math>.
 
The crude upper bound of <math>k</math> for any of the <math>M_k</math> type quantities comes from the parity problem obstruction that each separate event "<math>n+h_i</math> prime" can occur with probability at most 1/2.
 
Here are some [[notes on polytope decomposition]] for the k=3 case.
 
{| class="wikitable" border=1
|-
! Quantity
! Polytope constraints
! Vanishing marginals?
! Epsilon trick?
! Need GEH?
|-
| <math>M_k</math>
| <math>t_1+\dots+t_k \leq 1</math>
| No
| No
| No
|-
| <math>M'_k</math>
| <math>t_1+\dots+t_k - t_j \leq 1</math> for all <math>j</math>
| No
| No
| Yes
|-
| <math>M''_k</math> (prism)
| <math>t_1+\dots+t_{k-1} \leq 1</math>
<math>t_k \leq 1/\theta</math>
| Yes
| No
| Yes
|-
| <math>M''_k</math> (symmetric)
| <math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>
<math>t_j \leq 1/\theta</math> for all <math>j</math>
| Yes
| No
| Yes
|-
| <math>M''_3</math> (nonconvex)
<math>(k,\theta)=(3,1)</math>
| <math>t_1+t_2,t_1+t_3 \leq 1</math> OR
<math>t_1+t_2,t_2+t_3 \leq 1</math> OR


== More general variational problems ==
<math>t_1+t_3,t_1+t_3 \leq 1</math>
| Yes
| No
| Yes
|-
| <math>M''_3</math> (nonconvex II)
<math>(k,\theta)=(3,1)</math>
| <math>2t_1+t_2+t_3,t_1+2t_2+t_3 \leq 2</math> OR
<math>2t_1+t_2+t_3,t_1+t_2+2t_3 \leq 2</math> OR


It appears that for the purposes of establish DHL type theorems, one can increase the range of F in which one is taking suprema over.  Firstly, one can enlarge the simplex <math>{\mathcal R}_k</math> to the larger region
<math>t_1+2t_2+t_3,t_1+t_2+2t_3 \leq 2</math>; AND ALSO


:<math>{\mathcal R}'_k = \{ (t_1,\ldots,t_k) \in [0,1]^k: t_1+\ldots+t_k \leq 1 + \min(t_1,\ldots,t_k) \}</math>
<math>t_1,t_2,t_3 \leq 1</math>
| Yes
| No
| Yes
|-
| <math>\hat M''_k</math> (symmetric)
| <math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>
| Yes
| No
| Yes!
|-
| <math>\hat M''_k</math> (prism)
| <math>t_1+\dots+t_{k-1} \leq 1</math>


provided that one works with a generalisation of <math>EH[\theta]</math> which controls more general Dirichlet convolutions than the von Mangoldt function (a precise assertion in this regard may be found in BFI).  In fact one should be able to work in the even larger region
<math>t_k \leq \frac{2}{\theta}</math>
| Yes
| No
| Yes!
|-
| <math>M_{k,\varepsilon,\theta}</math>
| <math>t_1+\dots+t_k \leq 1+\varepsilon</math>
<math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>


:<math>{\mathcal R}''_{k,\theta} = \{ (t_1,\ldots,t_k) \in [0,1/\theta]^k: t_1+\ldots+t_k \leq 1 + \max(t_1,\ldots,t_k) \} \cup \frac{1}{\theta} \cdot {\mathcal R}_k</math>
<math>t_j \leq \frac{1}{\theta}</math> for all <math>j</math>
| No
| Yes
| Yes
|-
| <math>M'_{k,\varepsilon,\theta}</math> (symmetric)
| <math>t_1+\dots+t_k-t_j \leq 1+\varepsilon</math> for all <math>j</math>
<math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>


provided that all the marginal distributions of F are supported on <math>{\mathcal R}_{k-1}</math>, thus (assuming F is symmetric)
<math>t_j \leq \frac{1}{\theta}</math> for all <math>j</math>
| No
| Yes
| Yes
|-
| <math>M''_{k,\varepsilon,\theta}</math> (prism)
| <math>t_1+\dots+t_{k-1} \leq 1</math>
<math>t_k \leq \frac{1}{\theta}</math>
| Yes
| Yes
| Yes
|-
| <math>M''_{k,\varepsilon,\theta}</math> (symmetric)
| <math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>
<math>t_j \leq \frac{1}{\theta}</math> for all <math>j</math>
| Yes
| Yes
| Yes
|-
| <math>M''_{3,\varepsilon,1}</math> (non-convex)
<math>(k,\theta)=(3,1)</math>
| <math>t_1+t_2,t_1+t_3 \leq 1</math> OR
<math>t_1+t_2,t_2+t_3 \leq 1</math> OR


:<math>\int_0^\infty F(t_1,\ldots,t_{k-1},t_k)\ dt_k = 0 </math> when <math>t_1+\ldots+t_{k-1} > 1.</math>
<math>t_1+t_3,t_1+t_3 \leq 1</math>
| Yes
| Yes
| Yes
|-
| <math>\hat M_{k,\varepsilon,\theta}</math>
| <math>t_1+\dots+t_k \leq 1+\varepsilon</math>
<math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>
| No
| Yes
| Yes!
|-
| <math>\hat M'_{k,\varepsilon,\theta}</math> (symmetric)
| <math>t_1+\dots+t_k-t_j \leq 1+\varepsilon</math> for all <math>j</math>
<math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>
| No
| Yes
| Yes!
|-
| <math>\hat M''_{k,\varepsilon,\theta}</math> (symmetric)
| <math>t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta})</math>
| Yes
| Yes
| Yes!
|}

Latest revision as of 19:30, 23 October 2014

Let [math]\displaystyle{ M_k }[/math] be the quantity

[math]\displaystyle{ \displaystyle M_k := \sup_F \frac{\sum_{m=1}^k J_k^{(m)}(F)}{I_k(F)} }[/math]

where [math]\displaystyle{ F }[/math] ranges over square-integrable functions on the simplex

[math]\displaystyle{ \displaystyle {\mathcal R}_k := \{ (t_1,\ldots,t_k) \in [0,+\infty)^k: t_1+\ldots+t_k \leq 1 \} }[/math]

with [math]\displaystyle{ I_k, J_k^{(m)} }[/math] being the quadratic forms

[math]\displaystyle{ \displaystyle I_k(F) := \int_{{\mathcal R}_k} F(t_1,\ldots,t_k)^2\ dt_1 \ldots dt_k }[/math]

and

[math]\displaystyle{ \displaystyle J_k^{(m)}(F) := \int_{{\mathcal R}_{k-1}} (\int_0^{1-\sum_{i \neq m} t_i} F(t_1,\ldots,t_k)\ dt_m)^2 dt_1 \ldots dt_{m-1} dt_{m+1} \ldots dt_k. }[/math]

It is known that [math]\displaystyle{ DHL[k,m+1] }[/math] holds whenever [math]\displaystyle{ EH[\theta] }[/math] holds and [math]\displaystyle{ M_k \gt \frac{2m}{\theta} }[/math]. Thus for instance, [math]\displaystyle{ M_k \gt 2 }[/math] implies [math]\displaystyle{ DHL[k,2] }[/math] on the Elliott-Halberstam conjecture, and [math]\displaystyle{ M_k\gt 4 }[/math] implies [math]\displaystyle{ DHL[k,2] }[/math] unconditionally.

Upper bounds

We have the upper bound

[math]\displaystyle{ \displaystyle M_k \leq \frac{k}{k-1} \log k }[/math] (1)

that is proven as follows.

The key estimate is

[math]\displaystyle{ \displaystyle \int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)\ dt_1)^2 \leq \frac{\log k}{k-1} \int_0^{1-t_2-\ldots-t_k} F(t_1,\ldots,t_k)^2 (1 - t_1-\ldots-t_k+ kt_1)\ dt_1. }[/math]. (2)

Assuming this estimate, we may integrate in [math]\displaystyle{ t_2,\ldots,t_k }[/math] to conclude that

[math]\displaystyle{ \displaystyle J_k^{(1)}(F) \leq \frac{\log k}{k-1} \int F^2 (1-t_1-\ldots-t_k+kt_1)\ dt_1 \ldots dt_k }[/math]

which symmetrises to

[math]\displaystyle{ \sum_{m=1}^k J_k^{(m)}(F) \leq k \frac{\log k}{k-1} \int F^2\ dt_1 \ldots dt_k }[/math]

giving the desired upper bound (1).

It remains to prove (2). By Cauchy-Schwarz, it suffices to show that

[math]\displaystyle{ \displaystyle \int_0^{1-t_2-\ldots-t_k} \frac{dt_1}{1 - t_1-\ldots-t_k+ kt_1} \leq \frac{\log k}{k-1}. }[/math]

But writing [math]\displaystyle{ s = t_2+\ldots+t_k }[/math], the left-hand side evaluates to

[math]\displaystyle{ \frac{1}{k-1} (\log k(1-s) - \log (1-s) ) = \frac{\log k}{k-1} }[/math]

as required.

Lower bounds

We will need some parameters [math]\displaystyle{ c, T, \tau \gt 0 }[/math] and [math]\displaystyle{ a \gt 1 }[/math] to be chosen later (in practice we take c close to [math]\displaystyle{ 1/\log k }[/math], T a small multiple of c, and [math]\displaystyle{ \tau }[/math] a small multiple of c/k.

For any symmetric function F on the simplex [math]\displaystyle{ {\mathcal R}_k }[/math], one has

[math]\displaystyle{ J_k^{(1)}(F) \leq \frac{M_k}{k} I_k(F) }[/math]

and so by scaling, if F is a symmetric function on the dilated simplex [math]\displaystyle{ r \cdot {\mathcal R}_k }[/math], one has

[math]\displaystyle{ J_k^{(1)}(F) \leq \frac{r M_k}{k} I_k(F) }[/math]

after adjusting the definition of the functionals [math]\displaystyle{ I_k, J_k^{(1)} }[/math] suitably for this rescaled simplex.

Now let us apply this inequality r in the interval [math]\displaystyle{ [1,1+\tau] }[/math] and to truncated tensor product functions

[math]\displaystyle{ F(t_1,\ldots,t_k) = 1_{t_1+\ldots+t_k\leq r} \prod_{i=1}^k m_2^{-1/2} g(t_i) }[/math]

for some bounded measurable [math]\displaystyle{ g: [0,T] \to {\mathbf R} }[/math], not identically zero, with [math]\displaystyle{ m_2 := \int_0^T g(t)^2\ dt }[/math]. We have the probabilistic interpretations

[math]\displaystyle{ J_k^{(1)}(F) := m_2^{-1} {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 }[/math]

and

[math]\displaystyle{ I(F) := m_2^{-1} {\mathbf E} \int_{[0,r - S_{k-1}]} g(t)^2\ dt }[/math]
[math]\displaystyle{ = {\mathbf P} (S_k \leq r) }[/math]

where [math]\displaystyle{ S_{k-1} := X_1 + \ldots X_{k-1} }[/math], [math]\displaystyle{ S_k := X_1 + \ldots + X_k }[/math] and [math]\displaystyle{ X_1,\ldots,X_k }[/math] are iid random variables in [0,T] with law [math]\displaystyle{ m_2^{-1} g(t)^2\ dt }[/math], and we adopt the convention that [math]\displaystyle{ \displaystyle \int_{[a,b]} f }[/math] vanishes when [math]\displaystyle{ b \lt a }[/math]. We thus have

[math]\displaystyle{ {\mathbf E} ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 \leq \frac{r M_k}{k} m_2 {\mathbf P} ( S_k \leq r ) }[/math] (*)

for any r.

We now introduce the random function [math]\displaystyle{ h = h_r }[/math] by

[math]\displaystyle{ h(t) := \frac{1}{r - S_{k-1} + (k-1) t} 1_{S_{k-1} \lt r}. }[/math]

Observe that if [math]\displaystyle{ S_{k-1} \lt r }[/math], then

[math]\displaystyle{ \int_{[0, r-S_{k-1}]} h(t)\ dt = \frac{\log k}{k-1} }[/math]

and hence by the Legendre identity

[math]\displaystyle{ ( \int_{[0, r - S_{k-1}]} g(t)\ dt)^2 = \frac{\log k}{k-1} \int_{[0, r - S_{k-1}]} \frac{g(t)^2}{h(t)}\ dt - \frac{1}{2} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt. }[/math]

We also note that (using the iid nature of the [math]\displaystyle{ X_i }[/math] to symmetrise)

[math]\displaystyle{ {\mathbf E} \int_{[0, r - S_{k-1}]} g(t)^2/h(t)\ dt = m_2 {\mathbf E} 1_{S_k \leq r} / h( X_k ) }[/math]
[math]\displaystyle{ = m_2 {\mathbf E} 1_{S_k \leq r} (1 - X_1 - \ldots - X_k + k X_k ) }[/math]
[math]\displaystyle{ = m_2 {\mathbf E} 1_{S_k \leq r} }[/math]
[math]\displaystyle{ = m_2 {\mathbf P}( S_k \leq r ). }[/math]

Inserting these bounds into (*) and rearranging, we conclude that

[math]\displaystyle{ r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{2m_2} {\mathbf E} \int_{[0,r-S_{k-1}]} \int_{[0,r-S_{k-1}]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt }[/math]

where [math]\displaystyle{ \Delta_k := \frac{k}{k-1} \log k - M_k }[/math] is the defect from the upper bound. Splitting the integrand into regions where s or t is larger than or less than T, we obtain

[math]\displaystyle{ r \Delta_k {\mathbf P} ( S_k \leq r ) \leq Y_1 + Y_2 }[/math]

where

[math]\displaystyle{ Y_1 := \frac{k}{m_2} {\mathbf E} \int_{[0,T]} \int_{[T,r-S_{k-1}]} \frac{g(t)^2}{h(t)} h(s)\ ds dt }[/math]

and

[math]\displaystyle{ Y_2 := \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{(g(s) h(t)-g(t) h(s))^2}{h(s) h(t)}\ ds dt. }[/math]

We now focus on [math]\displaystyle{ Y_1 }[/math]. It is only non-zero when [math]\displaystyle{ S_{k-1} \leq r-T }[/math]. Bounding [math]\displaystyle{ h(s) \leq \frac{1}{(k-1)s} }[/math], we see that

[math]\displaystyle{ Y_1 \leq \frac{k}{(k-1) m_2} {\mathbf E} \int_0^T \frac{g(t)^2}{h(t)}\ dt \times \log_+ \frac{r-S_{k-1}}{T} }[/math]

where [math]\displaystyle{ \log_+(x) }[/math] is equal to [math]\displaystyle{ \log x }[/math] when [math]\displaystyle{ x \geq 1 }[/math] and zero otherwise. We can rewrite this as

[math]\displaystyle{ Y_1 \leq \frac{k}{k-1} {\mathbf E} 1_{S_k \leq r} \frac{1}{h(X_k)} \log_+ \frac{r-S_{k-1}}{T}. }[/math]

We write [math]\displaystyle{ \frac{1}{h(X_k)} = r-S_k + kX_k }[/math] and [math]\displaystyle{ \frac{r-S_{k-1}}{T} = \frac{r-S_k}{T} + \frac{X_k}{T} }[/math]. Using the bound [math]\displaystyle{ \log_+(x+y) \leq \log_+(x) + \log_+(1+y) }[/math] we have

[math]\displaystyle{ \log_+ \frac{r-S_{k-1}}{T} \leq \log_+ \frac{r-S_{k}}{T} + \log(1 + \frac{X_k}{T}) }[/math]

and thus (bounding [math]\displaystyle{ \log(1+\frac{X_k}{T}) \leq \frac{X_k}{T} }[/math]).

[math]\displaystyle{ Y_1 \leq \frac{k}{k-1} {\mathbf E} (r-S_k + kX_k) \log_+ \frac{r-S_{k}}{T} + (r-S_k)_+ \frac{X_k}{T} + k X_k \log(1+\frac{X_k}{T}) }[/math].

Symmetrising, we conclude that

[math]\displaystyle{ Y_1 \leq \frac{k}{k-1} (Z_1 + Z_2 + Z_3) }[/math]

where

[math]\displaystyle{ Z_1 := {\mathbf E} r \log_+ \frac{r-S_{k}}{T} }[/math]
[math]\displaystyle{ Z_2 := {\mathbf E} (r-S_k)_+ \frac{S_k}{kT} }[/math]
[math]\displaystyle{ Z_3 := m_2^{-1} \int_0^T kt \log(1 + \frac{t}{T}) g(t)^2\ dt. }[/math]

For [math]\displaystyle{ Z_2 }[/math], which is a tiny term, we use the crude bound

[math]\displaystyle{ Z_2 \leq \frac{r^2}{4kT}. }[/math]

For [math]\displaystyle{ Z_1 }[/math], we use the bound

[math]\displaystyle{ \log_+ x \leq \frac{(x+2a\log a - a)^2}{4a^2 \log a} }[/math]

valid for any [math]\displaystyle{ a\gt 1 }[/math], which can be verified because the LHS is concave for [math]\displaystyle{ x \geq 1 }[/math], while the RHS is convex and is tangent to the LHS as x=a. We then have

[math]\displaystyle{ \log_+ \frac{r-S_{k}}{T} \leq \frac{(r-S_k+2aT\log a-aT)^2}{4a^2 T^2\log a} }[/math]

and thus

[math]\displaystyle{ Z_1 \leq r (\frac{(r-k\mu+2aT\log a-aT)^2 + k \sigma^2}{4a^2 T^2 \log a} ) }[/math]

where

[math]\displaystyle{ \mu := m_2^{-1} \int_0^T t g(t)^2\ dt }[/math]
[math]\displaystyle{ \sigma^2 := m_2^{-1} \int_0^T t^2 g(t)^2\ dt - \mu^2. }[/math]

A good choice for [math]\displaystyle{ a=a[r] }[/math] here is [math]\displaystyle{ a = \frac{r-k\mu}{T} }[/math] (assuming [math]\displaystyle{ 1-k\mu \geq T }[/math]), in which case the formula simplifies to

[math]\displaystyle{ Z_1 \leq r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a}) }[/math]


Thus far, our arguments have been valid for arbitrary functions [math]\displaystyle{ g }[/math]. We now specialise to functions of the form

[math]\displaystyle{ g(t) := \frac{1}{c+(k-1)t}. }[/math]

Note the identity

[math]\displaystyle{ \displaystyle g(t) - h(t) = (r - S_{k-1} - c) g(t) h(t) }[/math]

on [math]\displaystyle{ [0, \min(r-S_{k-1},T)] }[/math]. Thus

[math]\displaystyle{ Y_2 = \frac{k}{2 m_2} {\mathbf E} \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} \frac{((g-h)(s) h(t)-(g-h)(t) h(s))^2}{h(s) h(t)}\ ds dt }[/math]
[math]\displaystyle{ = \frac{k}{2 m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} (g(s)-g(t))^2 h(s) h(t)\ ds dt. }[/math]

Bounding [math]\displaystyle{ (g(s)-g(t))^2 \leq g(s)^2+g(t)^2 }[/math] and using symmetry, we conclude

[math]\displaystyle{ Y_2 \leq \frac{k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h(s) h(t)\ ds dt. }[/math]

Since [math]\displaystyle{ \int_0^{r-S_{k-1}} h(t)\ dt = \frac{\log k}{k-1} }[/math], we conclude that

[math]\displaystyle{ Y_2 \leq \frac{k}{k-1} Z_4 }[/math]

where [math]\displaystyle{ Z_4 = Z_4[r] }[/math] is the quantity

[math]\displaystyle{ Z_4 := \frac{\log k}{m_2} {\mathbf E} (r - S_{k-1} - c)^2 \int_{[0,\min(r-S_{k-1},T)]} g(s)^2 h_r(s)\ ds. }[/math]

Putting all this together, we have

[math]\displaystyle{ r \Delta_k {\mathbf P} ( S_k \leq r ) \leq \frac{k}{k-1} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a}) + \frac{r^2}{4kT} + Z_3 + Z_4[r] ). }[/math]

At this point we encounter a technical problem that [math]\displaystyle{ Z_4 }[/math] diverges logarithmically (up to a cap of [math]\displaystyle{ \log k }[/math]) as [math]\displaystyle{ S_{k-1} }[/math] approaches r. To deal with this issue we average in r, and specifically over the interval [math]\displaystyle{ [1,1+\tau] }[/math]. One can calculate that

[math]\displaystyle{ \int_0^1 (1+u\tau) 1_{x \gt 1+u\tau}\ du \leq (1+\tau/2) \frac{(x-k\mu)^2}{(1+\tau-k\mu)^2} }[/math]

for all x if we have [math]\displaystyle{ 1-k\mu \geq \tau }[/math] (because the two expressions touch at [math]\displaystyle{ x=1+\tau }[/math], with the RHS being convex with slope at least [math]\displaystyle{ (1+\tau/2)/\tau }[/math] there. and the LHS lying underneath this tangent line). Assuming this, we conclude that

[math]\displaystyle{ \int_0^1 (1+u\tau) {\mathbf P} ( S_k \leq 1+u\tau )\ du \leq (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1+\tau-k\mu)^2}) }[/math]

provided that [math]\displaystyle{ k \mu \lt 1 - \tau }[/math], and hence

[math]\displaystyle{ \Delta_k (1 + \frac{\tau}{2}) (1 - \frac{k \sigma^2}{(1+\tau-k\mu)^2}) \leq \frac{k}{k-1} ( \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4a^2 T^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + \int_0^1 Z_4[1+u\tau]\ du ). }[/math]

Now we deal with the [math]\displaystyle{ Z_4 }[/math] integral. We split into two contributions, depending on whether [math]\displaystyle{ 1+u\tau-S_{k-1} \leq 2c }[/math] or not. If [math]\displaystyle{ 1+u\tau-S_{k-1}\leq 2c }[/math], then we may bound

[math]\displaystyle{ (1+u\tau-S_{k-1}-c)^2 \leq c^2 }[/math]

when [math]\displaystyle{ 1+u\tau-S_{k-1} \geq 0 }[/math], so this portion of [math]\displaystyle{ \int_0^1 Z_4[1+u\tau]\ du }[/math] may be bounded by

[math]\displaystyle{ \frac{\log k}{m_2} c^2 {\mathbf E} \int_{[0,T]} g(s)^2 (\int_0^1 h_{1+u\tau}(s) 1_{1+a\tau-S_{k-1} \geq s}\ du)\ ds. }[/math]

Observe that

[math]\displaystyle{ \int_0^1 h_{1+u\tau}(s) 1_{1+u\tau-S_{k-1} \geq s}\ du = \int_0^1 \frac{du}{1-S_{k-1}+u\tau+(k-1)s} 1_{1-S_{k-1}+u\tau \geq s} }[/math]
[math]\displaystyle{ = \frac{1}{\tau} \int_{[\max(ks, 1-S_{k-1}+(k-1)s), 1-S_{k-1}+\tau+(k-1)s]} \frac{du}{u} }[/math]
[math]\displaystyle{ \leq \frac{1}{\tau} \log \frac{ks+\tau}{ks} }[/math]

and so this portion of [math]\displaystyle{ \int_0^1 Z_4[1+a\tau]\ da }[/math] is bounded by

[math]\displaystyle{ W X }[/math]

where

[math]\displaystyle{ W := m_2^{-1} \int_{[0,T)]} g(s)^2 \log(1+\frac{\tau}{ks})\ ds. }[/math]
[math]\displaystyle{ X := \frac{\log k}{\tau} c^2 }[/math]

As for the portion when [math]\displaystyle{ 1+u\tau-S_{k-1} \gt 2c }[/math], we bound [math]\displaystyle{ h_{1+u\tau}(s) \leq \frac{1}{2c+(k-1)t} }[/math], and so this portion of [math]\displaystyle{ \int_0^1 Z_4[1+u\tau]\ du }[/math] may be bounded by

[math]\displaystyle{ \int_0^1 \frac{\log k}{c} {\mathbf E} (1 + u\tau - S_{k-1} - c)^2 V\ du }[/math]
[math]\displaystyle{ = V U }[/math]

where

[math]\displaystyle{ V := \frac{c}{m_2} \int_0^T \frac{g(t)^2}{2c + (k-1)t}\ dt. }[/math]
[math]\displaystyle{ U := \frac{\log k}{c} \int_0^1 (1 + u\tau - (k-1)\mu - c)^2+ (k-1)\sigma^2\ du }[/math]

We thus arrive at the final bound

[math]\displaystyle{ \Delta_k \leq \frac{k}{k-1} \frac{ \frac{1}{\tau} \int_1^{1+\tau} (r (\log \frac{r-k\mu}{T} + \frac{k \sigma^2}{4(r-k\mu)^2 \log a}) + \frac{r^2}{4kT}) dr + Z_3 + W X + VU}{ (1+\tau/2)(1 - \frac{k\sigma^2}{(1+\tau-k\mu)^2})} }[/math]

provided that [math]\displaystyle{ k \mu \lt 1 - \tau }[/math] and the denominator is positive.

Asymptotic analysis

We work in the asymptotic regime [math]\displaystyle{ k \to \infty }[/math]. Setting

[math]\displaystyle{ c:= \frac{1}{\log k} + \frac{\alpha}{\log^2 k} }[/math]
[math]\displaystyle{ T:= \frac{\beta}{\log k} }[/math]
[math]\displaystyle{ \tau := \frac{\gamma}{\log k} }[/math]

for some absolute constants [math]\displaystyle{ \alpha \in {\mathbf R} }[/math] and [math]\displaystyle{ \beta,\gamma \gt 0 }[/math], one calculates

[math]\displaystyle{ 1-k\mu= \frac{\alpha+\log\beta- 1+ o(1)}{\log k} }[/math]
[math]\displaystyle{ k\sigma^2 = \frac{\beta+o(1)}{\log^2 k} }[/math]

and so the constraint [math]\displaystyle{ 1-k\mu \gt \tau }[/math] becomes

[math]\displaystyle{ \alpha + \log \beta + \gamma \leq 1. }[/math]

With [math]\displaystyle{ r =1 + \frac{u\gamma}{\log k} }[/math] for [math]\displaystyle{ 0 \leq u \leq 1 }[/math], one has

[math]\displaystyle{ a = \frac{1-\alpha-\log \beta+u \gamma +o(1)}{\beta} }[/math]
[math]\displaystyle{ Z_1 \leq \log a + \frac{1}{4\beta a \log^2 a} }[/math]

and one also calculates

[math]\displaystyle{ Z_2, Z_3 = o(1) }[/math]
[math]\displaystyle{ W = \int_0^\infty \frac{1}{(1+t)^2} \log(1+\frac{\gamma}{t})\ dt +o(1) }[/math]
[math]\displaystyle{ = \frac{\gamma \log(\gamma)}{\gamma-1} + o(1) }[/math]
[math]\displaystyle{ X = \frac{1}{\gamma} + o(1) }[/math]
[math]\displaystyle{ V = \int_0^1\frac{dt}{(1+t)^2 (2+t)} + o(1) = 1 - \log(2) + o(1) }[/math]
[math]\displaystyle{ U = \int_0^1 (u\gamma -\alpha-\log \beta)^2\ du + \beta + o(1) }[/math]
[math]\displaystyle{ = \frac{(\gamma-\alpha-\log\beta)^3 + (\alpha+\log\beta)^3}{3}+ \beta + o(1) }[/math]

and hence

[math]\displaystyle{ \Delta \leq \frac{ \int_0^1 (\log a(u) + \frac{1}{4\beta a(u) \log^2 a(u)})\ du + \frac{\log(\gamma)}{\gamma-1} + \frac{1-\log(2)}{3} ((\gamma-\alpha-\log\beta)^3 + (\alpha+\log\beta)^3+3\beta) }{1 - \frac{\beta}{(1-\alpha-\log \beta-\gamma)^2} }+ o(1) }[/math]

assuming that [math]\displaystyle{ \alpha+\log \beta + \gamma\lt 1 }[/math] and

[math]\displaystyle{ 1 - \frac{\beta}{(1-\alpha-\log \beta-\gamma)^2} \gt 0, }[/math]

and where [math]\displaystyle{ a(u):=\frac{1-\alpha-\log \beta+u \gamma}{\beta} }[/math].

In particular, by setting [math]\displaystyle{ \alpha,\beta,\gamma }[/math] as absolute constants obeying these constraints, we have [math]\displaystyle{ \Delta \leq O(1) }[/math], and so

[math]\displaystyle{ M_k = \log k+O(1). }[/math]

More general variational problems

It appears that for the purposes of establish DHL type theorems, one can increase the range of F in which one is taking suprema over (and extending the range of integration in the definition of [math]\displaystyle{ J_k^{(m)}(F) }[/math] accordingly). Firstly, one can enlarge the simplex [math]\displaystyle{ {\mathcal R}_k }[/math] to the larger region

[math]\displaystyle{ {\mathcal R}'_k = \{ (t_1,\ldots,t_k) \in [0,1]^k: t_1+\ldots+t_k \leq 1 + \min(t_1,\ldots,t_k) \} }[/math]

provided that one works with a generalisation of [math]\displaystyle{ EH[\theta] }[/math] which controls more general Dirichlet convolutions than the von Mangoldt function (a precise assertion in this regard may be found in BFI). In fact (as shown here) one can work in any larger region [math]\displaystyle{ R }[/math] for which

[math]\displaystyle{ R + R \subset \{ (t_1,\ldots,t_k) \in [0,2/\theta]^k: t_1+\ldots+t_k \leq 2 + \max(t_1,\ldots,t_k) \} \cup \frac{2}{\theta} \cdot {\mathcal R}_k }[/math]

provided that all the marginal distributions of F are supported on [math]\displaystyle{ {\mathcal R}_{k-1} }[/math], thus (assuming F is symmetric)

[math]\displaystyle{ \int_0^\infty F(t_1,\ldots,t_{k-1},t_k)\ dt_k = 0 }[/math] when [math]\displaystyle{ t_1+\ldots+t_{k-1} \gt 1. }[/math]

For instance, one can take [math]\displaystyle{ R = \frac{1}{\theta} \cdot {\mathcal R}_k }[/math], or one can take [math]\displaystyle{ R = \{ (t_1,\ldots,t_k) \in [0,1/\theta]^k: t_1 +\ldots +t_{k-1} \leq 1 \} }[/math] (although the latter option breaks the symmetry for F). See this blog post for more discussion.

If the marginal distributions of F are supported in [math]\displaystyle{ (1+\varepsilon) \cdot {\mathcal R}_{k-1} }[/math] instead of [math]\displaystyle{ {\mathcal R}_{k-1} }[/math], one still has a usable lower bound in which [math]\displaystyle{ J_k^{(m)}(F) }[/math] is replaced by the slightly smaller quantity [math]\displaystyle{ J_{k,\varepsilon}^{(m)}(F) }[/math]; see this blog post for more discussion.

World records

[math]\displaystyle{ k }[/math] [math]\displaystyle{ M_k }[/math] [math]\displaystyle{ M'_k }[/math] [math]\displaystyle{ M''_k }[/math] (prism) [math]\displaystyle{ M''_k }[/math] (symmetric) [math]\displaystyle{ M''_k }[/math] (non-convex) [math]\displaystyle{ M_{k,\varepsilon,1/2} }[/math]
1D Lower Upper Lower Upper Lower Upper Lower Upper Lower Upper Lower Upper
2 1.383 1.38593... 1.38593... 2 2 2 2 2 2 1.690608 2
3 1.635 1.64644 1.648 1.8429 2.080 1.914 2.38593... 1.902626 3 1.847 3 1.91726 2.4142
4 1.820 1.845407 1.848 1.937 2.198 1.951 2.648 1.937 4 2.05411 2.5946
5 1.965 2.007145 2.011797 2.059 2.311 2.848 2.20264 2.7466
10 2.409 2.54547 2.55842 2.7466 2.68235 3.2716
20 2.810 3.12756 3.1534 3.2716 3.21470 3.8564
30 3.015 3.48313 3.51848 3.6079 3.51943 4.2182
40 3.145 3.73919 3.783467 3.8564 3.71480 4.4815
50 3.236 3.93586 3.99186 4.0540 4.0043 4.6889
51 3.244 3.95304 4.0105 4.0717 4.011910 4.7075
53 3.259 3.986213 4.0466 4.1062 4.000161 4.744
54 3.266 4.00223 4.0642 4.1230
55 3.273 4.01788 4.0816 4.1396
59 3.299 4.07704 4.1478399 4.2030


[math]\displaystyle{ k }[/math] [math]\displaystyle{ M_{k,\varepsilon,1} }[/math] [math]\displaystyle{ M'_{k,\varepsilon,1} }[/math] (sym) [math]\displaystyle{ M''_{k,\varepsilon,1} }[/math] (prism) [math]\displaystyle{ M''_{k,\varepsilon,1} }[/math] (sym) [math]\displaystyle{ M''_{k,\varepsilon,1} }[/math] (non-convex) [math]\displaystyle{ \hat M_{k,\varepsilon,1} }[/math] [math]\displaystyle{ \hat M'_{k,\varepsilon,1} }[/math] (sym) [math]\displaystyle{ \hat M''_{k,\varepsilon,1} }[/math] (sym) [math]\displaystyle{ \hat M''_{k,\varepsilon,1} }[/math] (prism)
2 1.38593... 2 2 2 2 1.690608 2
3 1.8615 1.956713 1.936708 1.962998 1.9400 1.91726 1.992 2.0012 1.965
4 2.0023 2.0023 2.0023 2.0023 2.05411


For k>2, all upper bounds on [math]\displaystyle{ M_k }[/math] come from (1). Upper bounds on [math]\displaystyle{ M'_k }[/math] come from the inequality [math]\displaystyle{ M'_k \leq \frac{k}{k-1} M_{k-1} }[/math] that follows from an averaging argument, and upper bounds on [math]\displaystyle{ M''_k }[/math] (on EH, using the prism [math]\displaystyle{ \{ t_1+\ldots+t_{k-1},t_k \leq 1\} }[/math] as the domain) come from the inequality [math]\displaystyle{ M''_k \leq M_{k-1} + 1 }[/math] by comparing [math]\displaystyle{ M''_k }[/math] with a variational problem on the prism (details here). The 1D bound [math]\displaystyle{ 4k(k-1)/j_{k-2}^2 }[/math] is the optimal value for [math]\displaystyle{ M_k }[/math] when the underlying function [math]\displaystyle{ F }[/math] is restricted to be of the "one-dimensional" form [math]\displaystyle{ F(t_1,\ldots,t_k) = f(t_1+\ldots+t_k) }[/math].

For k=2, [math]\displaystyle{ M'_2=M''_2 = 2 }[/math] can be computed exactly by taking F to be the indicator function of the unit square (for the lower bound), and by using Cauchy-Schwarz (for the upper bound). [math]\displaystyle{ M_2=\frac{1}{1-W(1/e)} \approx 1.38593 }[/math] can be computed exactly as the solution to the equation [math]\displaystyle{ 2-\frac{1}{x} + \log(1-\frac{1}{x}) = 0 }[/math].

The quantity [math]\displaystyle{ M_{k,\varepsilon,\theta} }[/math] is defined as the supremum of [math]\displaystyle{ \sum_{m=1}^k J_{k,\varepsilon}^{(m)}(F) / I_k(F) }[/math] where F is now supported on [math]\displaystyle{ [0,1/\theta]^k \cap (1+\varepsilon) \cdot {\mathcal R}_k }[/math], and [math]\displaystyle{ J_{k,\varepsilon}^{(m)} }[/math] is defined as [math]\displaystyle{ J_k^{(m)} }[/math] but now restricted to [math]\displaystyle{ (1-\varepsilon) \cdot {\mathcal R}_k }[/math], and [math]\displaystyle{ 0 \leq \varepsilon \leq \max(1/(k-1),1/\theta-1) }[/math] is a parameter to be optimised in. The quantity [math]\displaystyle{ M'_{k,\varepsilon,\theta} }[/math] is defined similarly, but with F now supported on [math]\displaystyle{ [0,1/\theta]^k \cap (1+\varepsilon) \cdot {\mathcal R}'_k \cap \max(\frac{k}{k-1},\frac{1}{\theta}) \cdot {\mathcal R}_k }[/math]. Finally, [math]\displaystyle{ M''_{k,\varepsilon,\theta} }[/math] is also defined similarly, but with F supported on a region R as above, with all marginals supported on [math]\displaystyle{ (1+\varepsilon) \cdot {\mathcal R}_{k-1} }[/math].

The quantities [math]\displaystyle{ \hat M_{k,\varepsilon,\theta}, \hat M'_{k,\varepsilon,\theta}, \hat M''_{k,\varepsilon,\theta} }[/math] are defined similarly to [math]\displaystyle{ M_{k,\varepsilon,\theta}, M'_{k,\varepsilon,\theta}, M''_{k,\varepsilon,\theta} }[/math], but with the truncation of [math]\displaystyle{ R }[/math] to [math]\displaystyle{ [0,1/\theta]^k }[/math] removed.

For k=3, we also have the non-convex candidate [math]\displaystyle{ R = \{ x+y,x+z \leq 1 \} \cup \{ x+y,y+z\leq 1 \} \cup \{ x+z,y+z \leq 1 \} }[/math].

The crude upper bound of [math]\displaystyle{ k }[/math] for any of the [math]\displaystyle{ M_k }[/math] type quantities comes from the parity problem obstruction that each separate event "[math]\displaystyle{ n+h_i }[/math] prime" can occur with probability at most 1/2.

Here are some notes on polytope decomposition for the k=3 case.

Quantity Polytope constraints Vanishing marginals? Epsilon trick? Need GEH?
[math]\displaystyle{ M_k }[/math] [math]\displaystyle{ t_1+\dots+t_k \leq 1 }[/math] No No No
[math]\displaystyle{ M'_k }[/math] [math]\displaystyle{ t_1+\dots+t_k - t_j \leq 1 }[/math] for all [math]\displaystyle{ j }[/math] No No Yes
[math]\displaystyle{ M''_k }[/math] (prism) [math]\displaystyle{ t_1+\dots+t_{k-1} \leq 1 }[/math]

[math]\displaystyle{ t_k \leq 1/\theta }[/math]

Yes No Yes
[math]\displaystyle{ M''_k }[/math] (symmetric) [math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math]

[math]\displaystyle{ t_j \leq 1/\theta }[/math] for all [math]\displaystyle{ j }[/math]

Yes No Yes
[math]\displaystyle{ M''_3 }[/math] (nonconvex)

[math]\displaystyle{ (k,\theta)=(3,1) }[/math]

[math]\displaystyle{ t_1+t_2,t_1+t_3 \leq 1 }[/math] OR

[math]\displaystyle{ t_1+t_2,t_2+t_3 \leq 1 }[/math] OR

[math]\displaystyle{ t_1+t_3,t_1+t_3 \leq 1 }[/math]

Yes No Yes
[math]\displaystyle{ M''_3 }[/math] (nonconvex II)

[math]\displaystyle{ (k,\theta)=(3,1) }[/math]

[math]\displaystyle{ 2t_1+t_2+t_3,t_1+2t_2+t_3 \leq 2 }[/math] OR

[math]\displaystyle{ 2t_1+t_2+t_3,t_1+t_2+2t_3 \leq 2 }[/math] OR

[math]\displaystyle{ t_1+2t_2+t_3,t_1+t_2+2t_3 \leq 2 }[/math]; AND ALSO

[math]\displaystyle{ t_1,t_2,t_3 \leq 1 }[/math]

Yes No Yes
[math]\displaystyle{ \hat M''_k }[/math] (symmetric) [math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math] Yes No Yes!
[math]\displaystyle{ \hat M''_k }[/math] (prism) [math]\displaystyle{ t_1+\dots+t_{k-1} \leq 1 }[/math]

[math]\displaystyle{ t_k \leq \frac{2}{\theta} }[/math]

Yes No Yes!
[math]\displaystyle{ M_{k,\varepsilon,\theta} }[/math] [math]\displaystyle{ t_1+\dots+t_k \leq 1+\varepsilon }[/math]

[math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math]

[math]\displaystyle{ t_j \leq \frac{1}{\theta} }[/math] for all [math]\displaystyle{ j }[/math]

No Yes Yes
[math]\displaystyle{ M'_{k,\varepsilon,\theta} }[/math] (symmetric) [math]\displaystyle{ t_1+\dots+t_k-t_j \leq 1+\varepsilon }[/math] for all [math]\displaystyle{ j }[/math]

[math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math]

[math]\displaystyle{ t_j \leq \frac{1}{\theta} }[/math] for all [math]\displaystyle{ j }[/math]

No Yes Yes
[math]\displaystyle{ M''_{k,\varepsilon,\theta} }[/math] (prism) [math]\displaystyle{ t_1+\dots+t_{k-1} \leq 1 }[/math]

[math]\displaystyle{ t_k \leq \frac{1}{\theta} }[/math]

Yes Yes Yes
[math]\displaystyle{ M''_{k,\varepsilon,\theta} }[/math] (symmetric) [math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math]

[math]\displaystyle{ t_j \leq \frac{1}{\theta} }[/math] for all [math]\displaystyle{ j }[/math]

Yes Yes Yes
[math]\displaystyle{ M''_{3,\varepsilon,1} }[/math] (non-convex)

[math]\displaystyle{ (k,\theta)=(3,1) }[/math]

[math]\displaystyle{ t_1+t_2,t_1+t_3 \leq 1 }[/math] OR

[math]\displaystyle{ t_1+t_2,t_2+t_3 \leq 1 }[/math] OR

[math]\displaystyle{ t_1+t_3,t_1+t_3 \leq 1 }[/math]

Yes Yes Yes
[math]\displaystyle{ \hat M_{k,\varepsilon,\theta} }[/math] [math]\displaystyle{ t_1+\dots+t_k \leq 1+\varepsilon }[/math]

[math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math]

No Yes Yes!
[math]\displaystyle{ \hat M'_{k,\varepsilon,\theta} }[/math] (symmetric) [math]\displaystyle{ t_1+\dots+t_k-t_j \leq 1+\varepsilon }[/math] for all [math]\displaystyle{ j }[/math]

[math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math]

No Yes Yes!
[math]\displaystyle{ \hat M''_{k,\varepsilon,\theta} }[/math] (symmetric) [math]\displaystyle{ t_1+\dots+t_k \leq \max(\frac{k}{k-1},\frac{1}{\theta}) }[/math] Yes Yes Yes!