Furstenberg-Katznelson argument

From Polymath1Wiki
Jump to: navigation, search

This proof of DHJ(3) is being pursued in our reading seminar. Below is an ongoing "combinatorial" translation of the argument. [Very informal!]

We write [math][3] = \{0,1,2\}[/math] for our alphabet. We let [math]Lines(n)[/math] be the space of all combinatorial lines [math]\ell: [3] \to [3]^n[/math] with "few" wildcards, where we will be vague about what "few" means. We also have [math]Planes(n)[/math], the space of combinatorial planes [math]\pi: [3]^2 \to [3]^n[/math], and [math]Spaces(n)[/math], the space of combinatorial 3-spaces [math]\sigma: [3]^3 \to [3]^n[/math]. Note that one can think of a combinatorial plane in [math][3]^n[/math] as a combinatorial line in Lines(n), etc.

DHJ(3) is then the claim that if [math]f: [3]^n \to [0,1][/math] has large mean, then

[math]{\Bbb E} f(\ell(0)) f(\ell(1)) f(\ell(2))[/math] is large,

where [math]\ell[/math] ranges over Lines(n).

12-low influence

We say that a function [math]f: [3]^n \to [-1,1][/math] is 12-low influence if

[math]{\Bbb E} |f(x)-f(y)|^2[/math] is small,

where x is drawn uniformly from [math][3]^n[/math] and y is formed from x by randomly flipping a 1 to a 2 or vice versa. Equivalently, f is 12-low influence if

[math]{\Bbb E} |f(\ell(1)) - f(\ell(2))|^2[/math] is small,

where [math]\ell[/math] is drawn randomly from Lines(n).

The following claim is, technically, not true, but is a useful first approximation to the truth:

Lemma 1 (A lie) Let [math]f: [3]^n \to [-1,1][/math]. Then the function [math]F(x) := {\Bbb E}_{\ell: \ell(1)=x} f(\ell(2))[/math] has 12-low influence, as does the function [math]G(x) := {\Bbb E}_{\ell: \ell(2) = x} f(\ell(1))[/math].

Let us say that a function [math]f: [3]^n \to [-1,1][/math] is 12-uniform if [math]{\Bbb E} g(\ell(1)) f(\ell(2))[/math] is small for all [math]g: [3]^n \to [-1,1][/math], where [math]\ell[/math] is again ranging over Lines(n). Assuming Lemma 1, we see that 12-uniform functions are basically orthogonal to 12-low influence functions. In fact any function can essentially be orthogonally decomposed into 12-uniform and 12-low influence components (see the Fourier-analytic proof of Sperner for some more discussion).

We can also define 12-uniformity and 12-low influence on functions on Lines(n) rather than [math][3]^n[/math], where now [math]\ell[/math] is ranging over lines in Lines(n) (i.e. in elements of Planes(n)). Similarly for functions on Planes(n), Spaces(n), etc.

01-uniformity relative to 12-low influence

We can define 01-uniformity in exact analogy to 12-uniformity; a function f is 01-uniform if [math]{\Bbb E} g(\ell(0)) f(\ell(1))[/math] is small for all [math]g: [3]^n \to [-1,1][/math]. We need a more sophisticated notion (analogous to [math]U^2({\Bbb Z}/N{\Bbb Z})[/math] Gowers uniformity): a function f is 01-uniform relative to 12-low influence if [math]{\Bbb E} g(\ell(0)) f(\ell(1)) h(\ell)[/math] is small for all [math]g: [3]^n \to [-1,1][/math] and all 12-low influence [math]h: Lines(n) \to [-1,1][/math].

Example Random functions are 01-uniform relative to 12-low influence.

Lemma 2 (von Neumann theorem) If [math]f: [3]^n \to [-1,1][/math] is 01-uniform relative to 12-low influence and [math]g, h: [3]^n \to [-1,1][/math], then [math]{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))[/math] is small.

Proof: An "IP van der Corput lemma" allows one to ensure that [math]{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))[/math] is small if we can show that

[math]{\Bbb E} f(\pi(01)) g(\pi(02)) f(\pi(11)) g(\pi(22))[/math] (1)

is small, where [math]\pi[/math] ranges over Planes(n). (Rough sketch of proof: we can rearrange [math]{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))[/math] as

[math]{\Bbb E}_{\phi,i} h(\phi(0^r)) f(\phi(1^i 0^{r-i})) g(2^i 0^{r-i})[/math]

where r is a medium size number, [math]\phi: [3]^r \to [3]^n[/math] ranges over r-dimensional subspaces with few wildcards, and i ranges from 1 to r. Using Cauchy-Schwarz in i, we reduce to bounding

[math]{\Bbb E}_{\phi,i,j} f(\phi(1^i 0^{r-i})) g(2^i 0^{r-i}) f(\phi(1^j 0^{r-j})) g(2^j 0^{r-j})[/math]

which can be rearranged to give (1)).

We can rearrange (1) further as

[math]{\Bbb E} F(\pi(1)) G(\pi(2))[/math]

where [math]\pi(1), \pi(2)[/math] are thought of as lines and [math]F(\ell) := f(\ell(0)) f(\ell(1))[/math] and [math]G(\ell) := g(\ell(0)) g(\ell(2))[/math]. But by Lemma 1, we can rewrite this as

[math]{\Bbb E} F(\ell) \overline{G}(\ell)[/math]

where [math]\overline{G}[/math] is 12-low influence. But as f is 01-uniform relative to 12-low influence, this expression is small. [math]\Box[/math]

Thus, when dealing with expressions of the form

[math]{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))[/math],

functions which are 01-uniform relative to 12-low influence can be discarded from the second factor. For similar reasons, functions which are 20-uniform relative to 12-low influence (defined in the obvious manner) can be discarded from the third factor).

Obstructions to uniformity

To continue, we need to understand what the obstructions are to 01-uniformity relative to 12-low influence. The answer turns out to be 01-almost periodicity relative to 12-low influence. Roughly speaking, a function f is 01-periodic relative to 12-low influence if any shift from 0s to 1s distorts f in a manner which can be described by 12-low influence functions. More precisely, there exists a bounded number of functions [math]g_h: [3]^n \to [-1,1][/math] and 12-low influence functions [math]c_h: Lines(n) \to [-1,1][/math] for [math]h = 1,\ldots,H[/math] such that the equation

[math]f(\ell(1)) = {\Bbb E}_h c_h(\ell) g_h(\ell(0))[/math]

is approximately true for most lines [math]\ell[/math] in Lines(n).

Example 1 If f is 01-low influence, then [math]f(\ell(1)) \approx f(\ell(0))[/math], so f is certainly 01-almost periodic relative to 12-low influence.

Example 2 Let [math]f(x) \in \{-1,+1\}[/math] be the parity of the number of 1s in x, then [math]f(\ell(1)) = c(\ell) f(\ell(0))[/math] where [math]c(\ell)[/math] depends only on the number of wildcards in [math]\ell[/math], and is in particular 12-low influence. Thus this function is also 01-almost periodic relative to 12-low influence.

Example 3 If [math]f: [3]^n \to [-1,1][/math] is 12-low influence, then trivially [math]f(\ell(1)) = c(\ell)[/math] where [math]c(\ell) := f(\ell(1))[/math]. Since c is 12-low influence, we conclude that f is 01-almost periodic relative to 12-low influence.

Example 4 Any bounded linear combination of 01-almost periodic functions relative to 12-low influence is also 01-almost periodic relative to 12-low influence; also, any product of 01-almost periodic functions relative to 12-low influence is also 01-almost periodic relative to 12-low influence.

These almost periodic functions obstruct uniformity:

Lemma 2: If [math]f: [3]^n \to [-1,1][/math] is 01-uniform relative to 12-low influence, and [math]g: [3]^n \to [-1,1][/math] is 01-almost periodic relative to 12-low influence, then f and g have small inner product.

Proof: The inner product of f and g can be expressed as

[math] {\Bbb E} f(\ell(1)) g(\ell(1))[/math]

where [math]\ell[/math] ranges over Lines(n). On the other hand, from the almost periodicity of g we can write

[math]g(\ell(1)) = {\Bbb E}_h c_h(\ell) G_h(\ell(0))[/math]

for some 12-low influence [math]c_h[/math] and some bounded [math]G_h[/math]. So we can reexpress the inner product as

[math] {\Bbb E}_h {\Bbb E} f(\ell(1)) G_h(\ell(0)) c_h(\ell).[/math]

But as f is 01-uniform relative to 12-low influence, the inner expectation is small, and the claim follows. [math]\Box[/math]

Conversely, these are the only obstructions to uniformity:

Lemma 3: If [math]f: [3]^n \to [-1,1][/math] is not 01-uniform relative to 12-low influence, then there exists a [math]g: [3]^n \to [-1,1][/math] is 01-almost periodic relative to 12-low influence, such that f and g have large inner product.

Proof: By hypothesis, we can find a 12-low influence c and a bounded G such that

[math]{\Bbb E} f(\ell(1)) G(\ell(0)) c(\ell)[/math]

is large. We then take

[math] g(x) := {\Bbb E}_{\ell: \ell(1) = x} G(\ell(0)) c(\ell)[/math].

Clearly f correlates with g.

If the expression inside the expectation was constant, then we would have

[math] g(\ell(1)) = G(\ell(0)) c(\ell)[/math]

and g would be 01-almost periodic relative to 12-low influence. Of course, the expression here is not necessarily constant, but if it oscillates too much, then g would be very small and the claim would be easy. I think we can use some sort of statistical sampling argument to handle the general case (I did this in my finitisation of Furstenberg's proof of Szemeredi's theorem); I'll come back to this point later. [math]\Box[/math]

In view of this lemma, I believe we have

Corollary 4: Any function [math]f: [3]^n \to [0,1][/math] can be decomposed into a function [math]f_{01}: [3]^n \to [0,1][/math] with the same mean as f which is 01-almost periodic relative to 12-low influence, plus an error [math]f-f_{01}[/math] which is 01-uniform relative to 12-low influence.

We have a similar decomposition [math]f = f_{20} + (f-f_{20})[/math] into a function [math]f_{20}: [3]^n \to [0,1][/math] with the same mean as f which is 20-almost periodic relative to 12-low influence, plus an error [math]f-f_{20}[/math] which is 01-uniform relative to 12-low influence. Using the von Neumann theorem, we can thus replace

[math]{\Bbb E} f(\ell(0)) f(\ell(1)) f(\ell(2))[/math]

by

[math]{\Bbb E} f(\ell(0)) f_{01}(\ell(1)) f_{20}(\ell(2))[/math]. (2)

Conclusion of proof

Let [math]f_{12}[/math] be the 12-low influence component of f, then [math]f_{12}[/math] is non-negative, has large density, and correlates with f. If we then let E be the set where [math]f_{12}[/math] is large, then E is also 12-low influence, and f is large on E.

By DHJ(2.5), the 12-low influence set E contains large combinatorial subspaces. Passing to such a subspace, we may assume that E is everything, i.e. [math]f_{12}[/math] is large everywhere. To put it another way, this implies that f has large density inside any 12-low influence set.

Having done this, we now perform the reductions in the previous section to get to the expression (2). From the almost periodicity of [math]f_{01}, f_{20}[/math], we have

[math]f_{01}(\ell(1)) = {\Bbb E}_h c_h(\ell) g_{01,h}(\ell(0))[/math] (3)

where [math]c_h[/math] is 12-low influence, and similarly

[math]f_{20}(\ell(2)) = {\Bbb E}_h c'_h(\ell) g_{20,h}(\ell(0))[/math] (4)

where [math]c'_h[/math] is 12-low influence.

Direct substitution of (3) and (4) into (2) does not quite work; we have to do something a bit sneakier. Let r be a medium size number and consider an r-dimensional subspace [math]\phi: [3]^r \to [3]^n[/math]. By the Graham-Rothschild theorem, if r is large enough, one can find a three-dimensional subspace [math]\sigma: [3]^3 \to [3]^n[/math] of [math]\phi[/math] on which [math]c_h, c'_h[/math] are essentially constant. Call such a subspace monochromatic; thus the collection of monochromatic 3-dimensional subspaces has positive density in Spaces(n). Furthermore, since this collection is defined using the 12-low influence functions [math]c_h, c'_h[/math], the collection of 3-dimensional monochromatic spaces is itself 12-low influence.

We now rewrite (2) as

[math]{\Bbb E} f(\sigma(012)) f_{01}(\sigma(112)) f_{20}(\sigma(212))[/math]

where [math]\sigma[/math] ranges over Spaces(n).

Suppose that [math]\sigma[/math] is monochromatic. Then from (3) we have

[math]f_{01}(\sigma(112)) = {\Bbb E}_h c_h(\sigma(**2)) g_{01,h}(\sigma(002))[/math]

and

[math]f_{01}(\sigma(012)) = {\Bbb E}_h c_h(\sigma(0*2)) g_{01,h}(\sigma(002))[/math]

and hence by monochromaticity

[math]f_{01}(\sigma(112)) = f_{01}(\sigma(012))[/math]

and similarly

[math]f_{20}(\sigma(212)) = f_{20}(\sigma(012))[/math].

Thus we can bound (2) from below by

[math]{\Bbb E} 1_{mono}(\sigma) [f f_{01} f_{20}](\sigma(012))[/math]

where [math]1_{mono}[/math] is the indicator function on the class of monochromatic spaces.

Now, we claim that [math]f_{01}[/math] is large on almost all of the support of f. Indeed, if there was a large set in the support of f on which [math]f_{01}[/math] was small, then denoting f' by the restriction of f to that set, [math]{\Bbb E} f'[/math] would be large but [math] {\Bbb E} f' f_{01}[/math] would be small. But if [math]f'_{01}[/math] denotes the component of f' which is 01-almost periodic relative to 12-low influence, then [math]f'_{01}[/math] is dominated by [math]f_{01}[/math] (since f' is dominated by f), and so [math] {\Bbb E} f' f'_{01}[/math] would be small. But by orthogonality of f' and [math]f'-f'_{01}[/math], this is equal to [math] {\Bbb E} (f'_{01})^2 [/math]. Since [math]f'_{01}[/math] has the same mean as f', which is large, this is a contradiction. Thus [math]f_{01}[/math] is large on almost all the support of f, and similarly for [math]f_{20}[/math]. Thus we can bound (2) from below by some multiple of

[math]{\Bbb E} 1_{mono}(\sigma) 1_{supp(f)}(\sigma(012))[/math].

But because f has positive density inside any 12-low influence set, this expression is large, and we are done.