Furstenberg-Katznelson argument

From Polymath Wiki
Revision as of 11:41, 14 February 2009 by Teorth (talk | contribs)
Jump to navigationJump to search

This proof of DHJ(3) is being pursued in our reading seminar. Below is an ongoing "combinatorial" translation of the argument.


Step 1: Random sampling

Suppose for contradiction that DHJ(3); then there exists [math]\displaystyle{ \delta \gt 0 }[/math], a sequence of n going to infinity, and a sequence [math]\displaystyle{ A^{(n)} \subset [3]^n }[/math] of line-free sets, all with density at least [math]\displaystyle{ \delta }[/math].

For any fixed integer m, and with any n larger than m, we can then form a random line-free subset [math]\displaystyle{ A^{(n)}_m \subset [3]^m }[/math] by randomly embedding [math]\displaystyle{ [3]^m }[/math] inside [math]\displaystyle{ [3]^n }[/math]. More precisely,

  1. Pick m distinct positions [math]\displaystyle{ a_1,\ldots,a_m \in [n] }[/math] uniformly at random.
  2. Outside of these m positions, pick a random word x in [math]\displaystyle{ [3]^{[n] \backslash \{a_1,\ldots,a_m\}} }[/math] uniformly at random.
  3. Define the random sampling map [math]\displaystyle{ \phi_m^{(n)}: [3]^m \to [3]^n$ by mapping \lt math\gt w_1 \ldots w_m }[/math] to the string that is equal to x outside of the positions [math]\displaystyle{ a_1,\ldots,a_m }[/math], and then setting the position at [math]\displaystyle{ a_i }[/math] equal to [math]\displaystyle{ w_i }[/math].
  4. Define [math]\displaystyle{ A^{(n)}_m := (\phi_m^{(n)})^{-1}(A^{(n)}) }[/math].

One can think of [math]\displaystyle{ A^{(n)}_m }[/math] as a random variable taking values in the power set [math]\displaystyle{ X_m := 2^{[3]^m} }[/math], so its probability distribution is a probability measure [math]\displaystyle{ \mu^{(n)}_m }[/math] on [math]\displaystyle{ X_m }[/math]. Since [math]\displaystyle{ A^{(n)} }[/math] is line-free, the [math]\displaystyle{ A^{(n)}_m }[/math] are also.

These random variables have two notable stationarity properties. The first is permutation symmetry: the obvious action of the permutation group [math]\displaystyle{ S_m }[/math] on [math]\displaystyle{ X_m }[/math] preserves [math]\displaystyle{ \mu^{(n)}_m }[/math], thus [math]\displaystyle{ A^{(n)}_m }[/math] is stationary with respect to this action. The other symmetry comes from the slicing maps [math]\displaystyle{ U_{m+|w|}^w: X_{m+|w|} \to X_m }[/math] defined for any m, and any word [math]\displaystyle{ w \in [3]^{|w|} }[/math] of length |w| by the formula

[math]\displaystyle{ U_{m+|w|}^w(A_{m+|w|}) := \{ v \in [3]^m: vw \in A_{m+|w|} \} }[/math].

Observe the semigroupoid law

[math]\displaystyle{ U_{m+|w|}^w U_{m+|w|+|u|}^u = U_{m+|wu|}^{wu} }[/math] (1)

for all words w, u.

One has the almost stationary property: for fixed m and w, the random variables [math]\displaystyle{ A^{(n)}_m }[/math] and [math]\displaystyle{ U_{m+|w|}^w(A^{(n)}_{m+|w|}) }[/math] have asymptotically the same distribution in the limit [math]\displaystyle{ n \to \infty }[/math]. This is because both of these random subsets of [math]\displaystyle{ [3]^m }[/math] are formed by pulling back [math]\displaystyle{ A^{(n)} }[/math] by a random m-dimensional subspace; the probability distributions of these random subspaces differ slightly, but one can compute that the total variation of the difference goes to zero in the limit [math]\displaystyle{ n \to \infty }[/math] (holding m,w fixed).

Taking a subsequence limit as [math]\displaystyle{ n \to \infty }[/math] (and using the Helly selection principle, a.k.a. the vague sequential compactness of probability measures), we can then find random line-free sets [math]\displaystyle{ A_m \subset [3]^m }[/math] (or equivalently, a probability measure [math]\displaystyle{ \mu_m }[/math] on [math]\displaystyle{ X_m }[/math]) which is stationary with respect to the permutation groups [math]\displaystyle{ S_m }[/math] acting on [math]\displaystyle{ X_m }[/math], and also with respect to the slicing maps [math]\displaystyle{ U_{m+|w|}^w: X_{m+|w|} \to X_m }[/math]. Also, since [math]\displaystyle{ A^{(n)}_0 }[/math] was non-empty with probability at least [math]\displaystyle{ \delta }[/math], then so does [math]\displaystyle{ A_0 }[/math]; by stationarity, this implies that for any [math]\displaystyle{ v \in [3]^m }[/math], that [math]\displaystyle{ v\in A_m }[/math] with probability at least [math]\displaystyle{ \delta }[/math] also.

Step 2. Inverting the translations

This is all very well and good, but there is one minor technical problem which will cause difficulty in later parts of the argument: the slicing maps [math]\displaystyle{ U_{m+|w|}^w: X_{m+r} \to X_m }[/math] are not invertible. We would like to (artificially) upgrade them to be invertible (thus upgrading the semigroupoid of transformations [math]\displaystyle{ U_{m+r}^w }[/math] to a groupoid). This can be done, but requires extending the measure spaces $latex X_m$ a bit (or in probabilistic language, by adding random number generators).

Let's turn to the details. It suffices to understand what is going on with the hyperplane slicing maps [math]\displaystyle{ U_{m+1}^i: X_{m+1} \to X_m }[/math] for [math]\displaystyle{ i=0,1,2 }[/math], since the more general maps can be expressed as compositions of these maps by (1). The key lemma is

Lemma 1 Let [math]\displaystyle{ (X,\mu_X) }[/math] and [math]\displaystyle{ (Y,\mu_Y) }[/math] be finite probability spaces, and let [math]\displaystyle{ U: X \to Y }[/math] be a surjective map which is probability-preserving (thus [math]\displaystyle{ U_* \mu_X = \mu_Y }[/math], or equivalently that [math]\displaystyle{ \mu_X(U^{-1}(E)) = \mu_Y(E) }[/math] for every [math]\displaystyle{ E \subset Y). Then one can lift U to an (almost surely) invertible map \lt math\gt \tilde U: X \times [0,1] \to Y \times [0,1] }[/math] on the product probability spaces which forms a commuting square with the projections [math]\displaystyle{ \pi_X: X \times [0,1] \to X }[/math], [math]\displaystyle{ \pi_Y: Y \times [0,1] \to Y }[/math], such that [math]\displaystyle{ \tilde U, \tilde U^{-1} }[/math] are measure-preserving.

Proof: By working one fibre of U at a time, one can reduce to the case when Y is a point (so that U is trivial). Order X as [math]\displaystyle{ x_1,\ldots,x_k }[/math], with probabilities [math]\displaystyle{ p_1,\ldots,p_k }[/math]. One can then take

[math]\displaystyle{ \tilde U( x_j, t ) := p_1 + \ldots + p_{j-1} + p_j t }[/math]

for [math]\displaystyle{ j=1,\ldots,k }[/math] and [math]\displaystyle{ t \in [0,1] }[/math]; one easily verifies the desired properties. [math]\displaystyle{ \Box }[/math]

Applying this lemma, we may now construct invertible measure-preserving slicing maps [math]\displaystyle{ \tilde U_{m+1}^i: \tilde X_{m+1} \to \tilde X_m }[/math] that commute with the original slicing maps, where [math]\displaystyle{ \tilde X_m := X_m \times [0,1] }[/math] with product measure. We can also extend these maps to negative m, by declaring [math]\displaystyle{ X_m }[/math] to be a point in these cases (and the slicing map to be trivial). We then have the invertible measure-preserving slicing map [math]\displaystyle{ \tilde U_{m+r}^w: \tilde X_{m+r} \to \tilde X_m }[/math] for all words [math]\displaystyle{ w \in [3]^r }[/math] of length r, defined by the semigroupoid law

[math]\displaystyle{ \tilde U_{m+r}^{w_1 \ldots w_r} = \tilde U_{m+1}^{w_1} \ldots \tilde U_{m+r}^{w_r}. }[/math]

In fact, we can now extend this groupoid action not just to words, but to any element w of the free group [math]\displaystyle{ F_3 }[/math] generated by the alphabet 0,1,2 and the formal inverses [math]\displaystyle{ 0^{-1}, 1^{-1}, 2^{-1} }[/math], defining [math]\displaystyle{ \tilde U_{m+|w|}^w: \tilde X_{m+|w|} \to \tilde X_m }[/math] using the semigroupoid law (1) and the inversion law

[math]\displaystyle{ (\tilde U_{m+|w|}^w)^{-1} := \tilde U_m^{w^{-1}} }[/math].

Here we adopt the convention that the inverses [math]\displaystyle{ 0^{-1}, 1^{-1}, 2^{-1} }[/math] have length -1. Thus for instance [math]\displaystyle{ \tilde U_5^{012^{-1}1}: \tilde X_5 \to \tilde X_3 }[/math] is the map

[math]\displaystyle{ \tilde U_5^{012^{-1}1} = \tilde U_4^0 \tilde U_5^1 (\tilde U_4^2)^{-1} \tilde U_5^1 }[/math].