Furstenberg-Katznelson argument
This argument is being pursued in our reading seminar. Below is an ongoing "combinatorial" translation of the argument.
Theorem DHJ(3). For every [math]\displaystyle{ \delta \gt 0 }[/math] there exists n such that every subset [math]\displaystyle{ A \subset [3]^n }[/math] of density at least [math]\displaystyle{ \delta }[/math] contains a combinatorial line.
Proof. Suppose for contradiction that the claim failed; then there exists [math]\displaystyle{ \delta \gt 0 }[/math], a sequence of n going to infinity, and a sequence [math]\displaystyle{ A^{(n)} \subset [3]^n }[/math] of line-free sets, all with density at least [math]\displaystyle{ \delta }[/math].
Step 1: Random sampling
Let [math]\displaystyle{ m = m^{(n)} }[/math] be a sufficiently slowly growing integer-valued function of n (e.g. one can take [math]\displaystyle{ m := \lfloor \log \log \log n \rfloor }[/math]). We can then construct a random embedding [math]\displaystyle{ \phi^{(n)}: [3]^m \to [3]^n }[/math] as follows:
- Pick a random base point [math]\displaystyle{ x \in [3]^n }[/math]. Note from the law of large numbers that the number of 0s, 1s, and 2s in x will be comparable to n/3 with probability 1-o(1) (here we use o(1) to denote anything that goes to zero in the limit [math]\displaystyle{ n \to \infty }[/math], independently of the choice of m).
- Pick m random indices [math]\displaystyle{ i_1,\ldots,i_m }[/math] in the 0-set of x. These will be distinct with probability 1-o(1), if m is a sufficiently slowly growing function of n.
- Define [math]\displaystyle{ \phi^{(n)}(w_1 \ldots w_m) }[/math] to be the word in [math]\displaystyle{ [3]^n }[/math] formed by starting with x, and replacing the [math]\displaystyle{ i_j }[/math] coefficient of x (which is 0) with [math]\displaystyle{ w_j }[/math], for each [math]\displaystyle{ j=1,\ldots,m }[/math].
For instance, if n=5, m=2, x=01002, and [math]\displaystyle{ i_1 = 1; i_2 = 4 }[/math], then [math]\displaystyle{ \phi(ab) = a10b2 }[/math] for all [math]\displaystyle{ a,b=0,1,2 }[/math].
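The construction can be sketched in a few lines of Python (a sketch with names of our choosing, not from the text; positions are 0-based, so the example's [math]\displaystyle{ i_1 = 1, i_2 = 4 }[/math] become indices 0 and 3):

```python
import random

def make_phi(x, idx):
    """phi: [3]^m -> [3]^n obtained from the base point x (a string over
    {0,1,2}) by overwriting the coordinates in idx, which must lie in the
    0-set of x, with the letters of the input word."""
    def phi(w):
        y = list(x)
        for i, c in zip(idx, w):
            assert y[i] == "0"  # phi only overwrites 0s of x
            y[i] = c
        return "".join(y)
    return phi

def random_phi(n, m, rng):
    """The random choices of Step 1: uniform x, then m indices of its 0-set."""
    x = "".join(rng.choice("012") for _ in range(n))
    zeros = [i for i, c in enumerate(x) if c == "0"]
    return make_phi(x, rng.sample(zeros, m))
```

With the article's example, `make_phi("01002", [0, 3])` sends the word `ab` to `a10b2`.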
The above map is well-defined with probability 1-o(1), and maps combinatorial lines to combinatorial lines. Thus the random set [math]\displaystyle{ B^{(n)}_m := (\phi^{(n)})^{-1}(A^{(n)}) \subset [3]^m }[/math] will be line-free.
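Since m is small, line-freeness of a subset of [math]\displaystyle{ [3]^m }[/math] can even be checked exhaustively; here is a brute-force sketch (function names are ours), encoding a combinatorial line by a template over [math]\displaystyle{ \{0,1,2,*\} }[/math] with at least one wildcard:

```python
from itertools import product

def line_points(template):
    """The three points of the combinatorial line given by a template
    string over {0,1,2,*} containing at least one wildcard '*'."""
    return [template.replace("*", c) for c in "012"]

def has_combinatorial_line(A, m):
    """True iff A, a set of strings in [3]^m, contains all three points
    w^0, w^1, w^2 of some combinatorial line (brute force over templates)."""
    for t in product("012*", repeat=m):
        template = "".join(t)
        if "*" in template and all(p in A for p in line_points(template)):
            return True
    return False
```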
Given any [math]\displaystyle{ 0 \leq j \leq m }[/math] and [math]\displaystyle{ w \in [3]^j }[/math], let [math]\displaystyle{ B^{(n)}_{m-j,w} \subset [3]^{m-j} }[/math] denote the random set
- [math]\displaystyle{ B^{(n)}_{m-j,w} := \{ v \in [3]^{m-j}: vw \in B^{(n)}_m \} }[/math]
where vw is of course the concatenation of v and w. A convenient observation is that the [math]\displaystyle{ B^{(n)}_{m-j,w} }[/math] are asymptotically stationary in the sense that their distributions are asymptotically independent of w:
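In code, passing from [math]\displaystyle{ B^{(n)}_m }[/math] to a restricted set is just a suffix condition (a sketch; the function name is ours):

```python
def restriction(B, w):
    """B_{m-j,w} = {v in [3]^(m-j) : vw in B}, for B a set of strings
    in [3]^m and w a string in [3]^j (vw is concatenation)."""
    j = len(w)
    if j == 0:
        return set(B)
    return {u[:-j] for u in B if u.endswith(w)}
```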
Stationarity lemma. Let [math]\displaystyle{ 0 \leq j\leq m }[/math], let [math]\displaystyle{ B \subset [3]^{m-j} }[/math], and let [math]\displaystyle{ w, w' \in [3]^j }[/math]. Then
- [math]\displaystyle{ {\Bbb P}( B^{(n)}_{m-j,w} = B ) = {\Bbb P}( B^{(n)}_{m-j,w'} = B ) + o(1). }[/math]
Proof. Observe that [math]\displaystyle{ B^{(n)}_{m-j,w} = (\phi^{(n)}_{m-j,w})^{-1}(A^{(n)}) }[/math], where [math]\displaystyle{ \phi^{(n)}_{m-j,w}: [3]^{m-j} \to [3]^n }[/math] is defined like [math]\displaystyle{ \phi^{(n)} }[/math], but replacing m with m-j, and also replacing x with x', defined by replacing x at j random locations [math]\displaystyle{ i_{m-j+1},\ldots,i_m }[/math] in the 0-set by the entries in w. But a direct computation using Stirling's formula (if m is sufficiently slowly growing) shows that the probability distribution of x' is within o(1) in total variation norm from the distribution of x (intuitively the j changes to x are a "drop in the ocean" compared with the noise level [math]\displaystyle{ \sqrt{n} }[/math] of x). In particular, replacing w with w' only affects the distribution of x' by o(1) in total variation, and the claim follows. [math]\displaystyle{ \Box }[/math]
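The "drop in the ocean" heuristic can be illustrated numerically, not as part of the argument: for the special case w = "12" (so j = 2, two 0s of x overwritten by one 1 and one 2), both x and x' are exchangeable, so their total variation distance equals that of their letter-count distributions, which can be computed exactly and is seen to decay as n grows (all names below are ours):

```python
import math

def tv_after_replacement(n):
    """Exact total variation distance between the letter-count law of a
    uniform x in [3]^n and that of x' = x with two random 0s overwritten
    by '1' and '2' (the case j = 2, w = "12").  Terms with fewer than two
    zeros are dropped from the second law; their o(1) mass is simply
    absorbed into the distance."""
    p, q = {}, {}  # laws of the count vectors (n0, n1, n2) of x and x'
    for n0 in range(n + 1):
        for n1 in range(n - n0 + 1):
            n2 = n - n0 - n1
            mass = math.comb(n, n0) * math.comb(n - n0, n1) / 3 ** n
            p[(n0, n1, n2)] = mass
            if n0 >= 2:
                key = (n0 - 2, n1 + 1, n2 + 1)  # counts after the overwrite
                q[key] = q.get(key, 0.0) + mass
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0))
                     for k in set(p) | set(q))
```

The distance shrinks roughly like [math]\displaystyle{ 1/\sqrt{n} }[/math], matching the heuristic.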
Step 2: Taking limits
For fixed n, and [math]\displaystyle{ m = m^{(n)} }[/math], we can construct random sets [math]\displaystyle{ B^{(n)}_i \subset [3]^i }[/math] for [math]\displaystyle{ i=0,\ldots,m }[/math] by the formula [math]\displaystyle{ B^{(n)}_i := B^{(n)}_{i,0^{m-i}} }[/math]; these are all line-free. We then have the asymptotic stationarity property
- [math]\displaystyle{ {\Bbb P}( B^{(n)}_{i,w} = B ) = {\Bbb P}( B^{(n)}_i = B ) + o(1) }[/math]
for all [math]\displaystyle{ w \in [3]^j }[/math] with [math]\displaystyle{ i+j \leq m }[/math], where
- [math]\displaystyle{ B^{(n)}_{i,w} := \{ v \in [3]^i: vw \in B_{i+j}^{(n)} \}. }[/math] (1)
Also, observe that [math]\displaystyle{ {\Bbb P}(B^{(n)}_0 \neq \emptyset) }[/math] is the density of [math]\displaystyle{ A^{(n)} }[/math] and is thus at least [math]\displaystyle{ \delta }[/math].
Using the Helly selection principle we can take a subsequence limit as [math]\displaystyle{ n \to \infty }[/math] in the vague topology. This gives us a new sequence of random sets [math]\displaystyle{ B_i \subset [3]^i }[/math] for all [math]\displaystyle{ i \geq 0 }[/math] obeying the exact stationarity property
- [math]\displaystyle{ {\Bbb P}( B_{i,w} = B ) = {\Bbb P}( B_i = B ) }[/math]
for all [math]\displaystyle{ w \in [3]^j }[/math], where [math]\displaystyle{ B_{i,w} }[/math] is defined similarly to (1), and such that [math]\displaystyle{ B_0 \neq \emptyset }[/math] with positive probability. Also, all the [math]\displaystyle{ B_i }[/math] are almost surely line-free; by removing an event of probability zero we may assume that they are surely line-free.
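The compactness behind the Helly selection can be spelled out; here is a sketch in our notation:

```latex
% Each X_i = 2^{[3]^i} is finite, so for every k the law of the tuple
% (B^{(n)}_0, \dots, B^{(n)}_k) is a point in a compact, finite-dimensional
% simplex of probability vectors.  A diagonal subsequence n_1 < n_2 < \cdots
% therefore makes all of these laws converge simultaneously, and the limits
% are consistent as k varies, so they define the joint law of (B_i)_{i \ge 0}:
\mathbb{P}\bigl( (B_0,\dots,B_k) = (C_0,\dots,C_k) \bigr)
  \;=\; \lim_{r \to \infty}
        \mathbb{P}\bigl( (B^{(n_r)}_0,\dots,B^{(n_r)}_k) = (C_0,\dots,C_k) \bigr)
  \qquad \text{for all } k \ge 0 \text{ and } C_i \subset [3]^i.
% The o(1) errors in the asymptotic stationarity property vanish along the
% subsequence, which is what gives the exact stationarity stated above.
```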
Step 3: Creating transformations
For each i, let [math]\displaystyle{ X_i := 2^{[3]^i} }[/math] denote the space of all subsets [math]\displaystyle{ B_i \subset [3]^i }[/math], with the product sigma-algebra. The probability distribution of the random set [math]\displaystyle{ B_i }[/math] defined in the previous section is thus a probability measure [math]\displaystyle{ \mu_i }[/math] on [math]\displaystyle{ X_i }[/math]. The set [math]\displaystyle{ E_0 := \{ B_0 \neq \emptyset \} }[/math] has positive measure with respect to [math]\displaystyle{ \mu_0 }[/math].
For each j=0,1,2 and [math]\displaystyle{ i \geq 1 }[/math], let [math]\displaystyle{ \tilde U_i^j: X_i \to X_{i-1} }[/math] be the transformation that sends [math]\displaystyle{ B_i }[/math] to [math]\displaystyle{ B_{i,j} := \{ w \in [3]^{i-1}: wj \in B_i \} }[/math]. The stationarity property is nothing more than the statement that [math]\displaystyle{ \tilde U_i^j }[/math] is measure-preserving in the sense that [math]\displaystyle{ \mu_i((\tilde U_i^j)^{-1}(F)) = \mu_{i-1}(F) }[/math] for any measurable set F in [math]\displaystyle{ X_{i-1} }[/math]. Observe that if we define [math]\displaystyle{ \tilde U_w: X_n \to X_0 }[/math] for any word [math]\displaystyle{ w = w_1 \ldots w_n }[/math] by the formula
- [math]\displaystyle{ \tilde U_{w_1 \ldots w_n} = \tilde U_1^{w_1} \ldots \tilde U_n^{w_n} }[/math](2)
then [math]\displaystyle{ \tilde U_w^{-1}(E_0) = \{ w \in B_n \} }[/math]. Since the [math]\displaystyle{ B_i }[/math] are a.s. line-free, we thus see that
- [math]\displaystyle{ \mu_n( \tilde U_{w^0}^{-1}(E_0) \cap \tilde U_{w^1}^{-1}(E_0) \cap \tilde U_{w^2}^{-1}(E_0) ) = 0 }[/math]
for all n and all combinatorial lines [math]\displaystyle{ w^0, w^1, w^2 \in [3]^n }[/math].
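The transformations and the composition (2) can be sketched directly, encoding a point of [math]\displaystyle{ X_i }[/math] as a set of strings (a sketch; names are ours). Note that in (2) the map [math]\displaystyle{ \tilde U_n^{w_n} }[/math] acts first, so the letters of w are stripped from the right:

```python
def U(j, B):
    """The map U_i^j : X_i -> X_{i-1}, sending B to {w : wj in B},
    where j is a single letter '0', '1' or '2'."""
    return {w[:-1] for w in B if w[-1] == j}

def U_word(word, B):
    """Formula (2): U_{w_1...w_n} = U_1^{w_1} ... U_n^{w_n}, applied to
    B in X_n.  The rightmost factor U_n^{w_n} acts first."""
    for j in reversed(word):
        B = U(j, B)
    return B
```

One can check that `U_word(w, B)` is nonempty exactly when w is in B, which is the identity [math]\displaystyle{ \tilde U_w^{-1}(E_0) = \{ w \in B_n \} }[/math] used above.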
It is inconvenient that the [math]\displaystyle{ \tilde U_i^j }[/math] are not invertible. But this can be remedied by lifting the measure spaces up to a sufficiently large space (or, in more probabilistic terms, by introducing a sufficient number of random number generators).
...