Roth's Theorem concerning discrepancy on Arithmetic Progression

From Polymath Wiki
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

Letting [math]\displaystyle{ \scriptstyle \mathcal{AP}(N) }[/math] be the collection of all (not necessarily homogeneous) arithmetic progressions in [math]\displaystyle{ \{1, 2, ..., N\} }[/math], Roth (1964) proved that [math]\displaystyle{ \scriptstyle \delta(\mathcal{AP}(N),{\mathbf x}) \geq c n^{1/4} }[/math], independent of the sequence [math]\displaystyle{ \scriptstyle {\mathbf x} }[/math].

  • Roth, K. F., Remark concerning integer sequences. Acta Arith. 9 1964 257--260. MR0168545 (29 #5806)

In 2000, László Lovász gave another proof using semidefinite optimization.

  • Lovász, László, Integer sequences and semidefinite programming. Dedicated to Professor Kálmán Gyory on the occasion of his 60th birthday.

Publ. Math. Debrecen 56 (2000), no. 3-4, 475--479. MR1765994 (2001k:11020)

Matoušek and Spencer (1996) have proven that the exponent [math]\displaystyle{ 1/4 }[/math] is best possible, although their construction is not constructive.

"Representation of the identity proof of [math]\displaystyle{ \scriptstyle \gg n^{1/8} }[/math]

This is an elaboration of the outline in this post. A further elaboration that achieves Roth's bound of [math]\displaystyle{ n^{1/4} }[/math] can be found here.

We break the proof down into 9 easy pieces.

  • First, we consider APs modulo [math]\displaystyle{ n }[/math] for a large [math]\displaystyle{ n }[/math]. At the end, we'll have proven the existence of an AP with diameter smaller than n, so that if it wraps around it does so only once, and so at least half of the discrepancy is on a non-wrap-around AP. Set [math]\displaystyle{ \omega=e^{2\pi i /n} }[/math]. Let [math]\displaystyle{ \scriptstyle {\mathbf x} }[/math] be a [math]\displaystyle{ \scriptstyle \pm1 }[/math] column vector (our coloring!), and let [math]\displaystyle{ \delta }[/math] be the maximum discrepancy of [math]\displaystyle{ \scriptstyle {\mathbf x} }[/math] along any AP. Summations are over all of [math]\displaystyle{ {\mathbb Z}_n }[/math] unless noted otherwise, or obviously otherwise.
  • Let [math]\displaystyle{ P_i }[/math] be the indicator vector of the [math]\displaystyle{ i }[/math]-th AP. Suppose that we have some [math]\displaystyle{ \lambda_{i,j} }[/math] with
[math]\displaystyle{ I = \sum_{i,j} \lambda_{i,j} P_i P_j^T, }[/math]

where [math]\displaystyle{ I }[/math] is the [math]\displaystyle{ \scriptstyle n\times n }[/math] identity matrix. Then

[math]\displaystyle{ n={\mathbf x}^T {\mathbf x} = {\mathbf x}^T I {\mathbf x} ={\mathbf x}^T\left(\sum \lambda_{i,j} P_i P_j^T\right) {\mathbf x} =\sum_{i,j} \lambda ({\mathbf x}^T P_i) (P_j^T {\mathbf x}) \leq \delta^2 \sum |\lambda_{i,j}|. }[/math]

So, if [math]\displaystyle{ \sum |\lambda_{i,j}| = o(n) }[/math], necessarily [math]\displaystyle{ \delta\to\infty }[/math]. This simplifies the problem considerably: can the identity matrix be written in the form [math]\displaystyle{ \sum \lambda_{i,j} P_i P_j^T }[/math] with small [math]\displaystyle{ \lambda_{i,j} }[/math]? We no longer need concern ourselves with the vector [math]\displaystyle{ \scriptstyle {\mathbf x} }[/math].

  • Set [math]\displaystyle{ P_{d,a} }[/math] to be the indicator vector of the AP with length [math]\displaystyle{ 2m+1 }[/math] (with [math]\displaystyle{ m }[/math] to be determined later), difference [math]\displaystyle{ d\gt 1 }[/math], and center [math]\displaystyle{ a }[/math]. Set [math]\displaystyle{ \omega_r(t)=\omega^{rt} }[/math], a column vector whose [math]\displaystyle{ t }[/math]-th component is [math]\displaystyle{ \omega^{rt} }[/math]. Note that
[math]\displaystyle{ I = \frac1n \sum_r \omega_r \overline{\omega_r}^T. }[/math]

We now proceed to write [math]\displaystyle{ \omega_r }[/math] as a sum of [math]\displaystyle{ P_{d,a} }[/math].

  • We express a sum in two ways:
[math]\displaystyle{ \sum_s P_{d,0}(s) \omega^{r(t-s)} = \omega^{rt} \sum_s P_{d,0}\omega^{-rs}=\omega^{rt} \hat{P}_{d,0}(-r) }[/math]

and

[math]\displaystyle{ \sum_s P_{d,0}(s) \omega^{r(t-s)} = \sum_s P_{d,0}(t-s) \omega^{rs}=\sum_s \omega^{rs}P_{d,s}(t) }[/math]

Thus,

[math]\displaystyle{ \omega_r = \frac{1}{\hat{P}_{d,0}(-r)} \sum_s \omega^{rs} P_{d,s}, }[/math]

the promised sum-of-APs representation of [math]\displaystyle{ \omega_r }[/math], and consequently the promised representation of [math]\displaystyle{ I }[/math] as a sum of [math]\displaystyle{ P_{d,s_1} P_{d,s_2}^T }[/math]. All that remains is to analyze the coefficients of the representation of [math]\displaystyle{ I }[/math].

  • As [math]\displaystyle{ \hat{P}_{d,0}(-r)=\sum_s P_{d,0}\omega^{-rs} }[/math], and [math]\displaystyle{ P_{d,0} }[/math] is symmetric about 0, we see that [math]\displaystyle{ \hat{P}_{d,0}(-r)=\hat{P}_{d,0}(r) }[/math] is real. Moreover, as it is a geometric series, it is routine to find
[math]\displaystyle{ \alpha_r:=\hat{P}_{d,0}(-r)=\frac{\sin\left((2m+1)\pi rd/n\right)}{\sin\left(\pi rd/n\right)} }[/math]

We will want [math]\displaystyle{ \alpha_r }[/math] to be bounded away from 0. To that end for each [math]\displaystyle{ r }[/math], fix now and forever [math]\displaystyle{ d }[/math] in [math]\displaystyle{ [1,\sqrt{n}] }[/math] with [math]\displaystyle{ \|rd/n\|\leq n^{-1/2} }[/math], where [math]\displaystyle{ \|\cdot\| }[/math] means the distance to the nearest integer. This is a classic result of diophantine approximation (Dirichlet's theorem), whose proof is just the pigeonhole principle. Setting [math]\displaystyle{ \scriptstyle m=\frac15 \sqrt n }[/math], we find

[math]\displaystyle{ \alpha_r = \frac{\sin\left((2m+1)\pi rd/n\right)}{\sin\left(\pi rd/n\right)} \geq \frac{2\cdot\left((2m+1)rd/n\right)}{\pi rd/n}\gt m, }[/math]

and more obviously [math]\displaystyle{ \scriptstyle \alpha_r\leq 2m+1 }[/math]. In short [math]\displaystyle{ \scriptstyle \alpha_r }[/math] has order [math]\displaystyle{ \scriptstyle \sqrt{n} }[/math], to within a constant factor.

  • [math]\displaystyle{ I = n^{-1} \sum_r \omega_r \overline{\omega_r}^T = n^{-1} \sum_r \alpha_r^{-2}\left(\sum_x \omega^{rx}P_{d,x}\right)\left(\sum_y \omega^{-ry}P_{d,y}^T\right) = \sum_{r,x,y} n^{-1} \alpha_r^{-2} \omega^{r(x-y)} P_{d,x}P_{d,y}^T }[/math]

It is tempting to try and be done the easy way, but [math]\displaystyle{ \sum_{r,x,y} \left|n^{-1} \alpha_r^{-2} \omega^{r(x-y)} \right| }[/math] is easily recognized as a sum with [math]\displaystyle{ n^3 }[/math] summands, each of which has size [math]\displaystyle{ \approx n^{-2} }[/math], so that the total is not [math]\displaystyle{ o(n) }[/math]. In short, we need to find some cancelation.

  • Set [math]\displaystyle{ R(\Delta)=\{r \colon d=\Delta\} }[/math]. By a pigeonhole argument, [math]\displaystyle{ \scriptstyle |R(\Delta)|\leq 2\sqrt{n} }[/math]. The coefficient of [math]\displaystyle{ P_{\Delta,x}P_{\Delta,y}^T }[/math] is [math]\displaystyle{ \sum_{r\in R(\Delta)} n^{-1}\alpha_r^{-2} \omega^{r(x-y)}. }[/math]

We need to bound

[math]\displaystyle{ \sum_{\Delta=1}^{\sqrt{n}} \sum_{x,y} \left| \sum_{r\in R(\Delta)} n^{-1}\alpha_r^{-2} \omega^{r(x-y)} \right|. }[/math]
  • [math]\displaystyle{ \sum_{x,y} \left| \sum_{r\in R(\Delta)} n^{-1}\alpha_r^{-2} \omega^{r(x-y)} \right| \leq \left(\sum_{x,y} 1^2 \right)^{1/2} \left( \sum_{x,y} \left| \sum_{r\in R(\Delta)} n^{-1}\alpha_r^{-2} \omega^{r(x-y)} \right|^2\right)^{1/2} }[/math]
[math]\displaystyle{ = n \,\cdot\,\left(\sum_{x,y} \sum_{r_1,r_2} n^{-2} \alpha_{r_1}^{-2} \alpha_{r_2}^{-2} \omega^{(r_1-r_2)(x-y)}\right)^{1/2} = n\,\cdot\, \left( \sum_{r} n^{-2}\alpha_r^{-4} n^2\right)^{1/2} }[/math]

where the last equality is by noting that [math]\displaystyle{ \sum_{x,y}\omega^{(r_1-r_2)(x-y)} }[/math] is either [math]\displaystyle{ n^2 }[/math] (if [math]\displaystyle{ \scriptstyle r_1-r_2=0 }[/math]) or 0 (if [math]\displaystyle{ r_1-r_2\not=0 }[/math]). Since [math]\displaystyle{ \alpha_r }[/math] is between [math]\displaystyle{ m }[/math] and [math]\displaystyle{ 2m+1 }[/math] (recall [math]\displaystyle{ \scriptstyle m=\frac15 \sqrt n }[/math]), we have [math]\displaystyle{ \alpha_r^{-4} \leq m^{-4} = \frac{5^4}{n^2} }[/math]. Thus,

[math]\displaystyle{ n\,\cdot\, \left( \sum_{r} n^{-2}\alpha_r^{-4} n^2\right)^{1/2} \leq 25 |R(\Delta)|^{1/2}\leq 25\sqrt{2} n^{1/4}, }[/math]

and so

[math]\displaystyle{ \sum_{\Delta=1}^{\sqrt{n}} \sum_{x,y} \left| \sum_{r\in R(\Delta)} n^{-1}\alpha_r^{-2} \omega^{r(x-y)} \right|\leq 25\sqrt{2} n^{3/4}. }[/math]
  • Returning to step 2, we have [math]\displaystyle{ \delta^2 \geq \frac{1}{25\sqrt{2}} n^{1/4} }[/math], whence [math]\displaystyle{ \delta \gt \frac 17 n^{1/8} }[/math].