Furstenberg-Katznelson argument


From Polymath Wiki
Jump to navigationJump to search
No edit summary
 
(9 intermediate revisions by 3 users not shown)
Line 1: Line 1:
This proof of [[DHJ(3)]] is being pursued in our [http://terrytao.wordpress.com/2009/02/11/a-reading-seminar-on-density-hales-jewett/ reading seminar].  Below is an ongoing "combinatorial" translation of the argument.
This proof of [[DHJ(3)]] is being pursued in our [http://terrytao.wordpress.com/2009/02/11/a-reading-seminar-on-density-hales-jewett/ reading seminar].  Below is an ongoing "combinatorial" translation of the argument. [Very informal!]


We write <math>[3] = \{0,1,2\}</math> for our alphabet.  We let <math>Lines(n)</math> be the space of all combinatorial lines <math>\ell: [3] \to [3]^n</math> with "few" wildcards, where we will be vague about what "few" means.  We also have <math>Planes(n)</math>, the space of combinatorial planes <math>\pi: [3]^2 \to [3]^n</math>, and <math>Spaces(n)</math>, the space of combinatorial 3-spaces <math>\sigma: [3]^3 \to [3]^n</math>.  Note that one can think of a combinatorial plane in <math>[3]^n</math> as a combinatorial line in Lines(n), etc.
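
To fix ideas, here is a minimal Python sketch (purely illustrative, not part of the argument) of one way to encode these objects: a combinatorial line is stored as a wildcard template, and the helper names and the cutoff <code>max_wildcards</code> are ad hoc stand-ins for the deliberately vague "few wildcards".

<pre>
import random

# A point of [3]^n is a tuple with entries in {0,1,2}.  A combinatorial line is a
# template of length n over {0,1,2,'*'} with at least one wildcard '*'; evaluating
# the template at j in {0,1,2} replaces every '*' by j, giving the point ell(j).
# (A combinatorial plane would use two independent wildcard symbols, and so on.)

def line_point(template, j):
    """Return the point ell(j) of the combinatorial line described by template."""
    return tuple(j if c == '*' else c for c in template)

def random_line(n, max_wildcards=3):
    """Sample a combinatorial line in [3]^n with 'few' wildcards (ad hoc cutoff)."""
    k = random.randint(1, max_wildcards)
    wild = set(random.sample(range(n), k))
    return tuple('*' if i in wild else random.randint(0, 2) for i in range(n))

if __name__ == "__main__":
    ell = random_line(8)
    print(ell, [line_point(ell, j) for j in range(3)])
</pre>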


[[DHJ(3)]] is then the claim that if <math>f: [3]^n \to [0,1]</math> has large mean, then


:<math>{\Bbb E} f(\ell(0)) f(\ell(1)) f(\ell(2))</math> is large,


where <math>\ell</math> ranges over Lines(n).
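
In the same illustrative spirit, the average above can be estimated by Monte Carlo for an explicit f; the self-contained sketch below again uses an ad hoc interpretation of "few wildcards" and of the distribution on Lines(n), so it is only a rough illustration of the quantity in question.

<pre>
import random

def random_line(n, max_wildcards=3):
    k = random.randint(1, max_wildcards)
    wild = set(random.sample(range(n), k))
    return tuple('*' if i in wild else random.randint(0, 2) for i in range(n))

def line_point(template, j):
    return tuple(j if c == '*' else c for c in template)

def dhj3_average(f, n, trials=20000):
    """Monte Carlo estimate of E f(ell(0)) f(ell(1)) f(ell(2)) over random lines."""
    total = 0.0
    for _ in range(trials):
        ell = random_line(n)
        total += f(line_point(ell, 0)) * f(line_point(ell, 1)) * f(line_point(ell, 2))
    return total / trials

if __name__ == "__main__":
    # f = indicator of the set of strings with an even number of 1s (mean about 1/2).
    f = lambda x: 1.0 if x.count(1) % 2 == 0 else 0.0
    print(dhj3_average(f, n=10))
</pre>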


== 12-low influence ==


We say that a function <math>f: [3]^n \to [-1,1]</math> is ''12-low influence'' if


:<math>{\Bbb E} |f(x)-f(y)|^2</math> is small,


where x is drawn uniformly from <math>[3]^n</math> and y is formed from x by randomly flipping a 1 to a 2 or vice versa. Equivalently, f is 12-low influence if


:<math>{\Bbb E} |f(\ell(1)) - f(\ell(2))|^2</math> is small,


where <math>\ell</math> is drawn randomly from Lines(n).
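
Numerically, the first formulation can be probed as follows (again just a sketch; the flip below leaves x unchanged when x contains no 1s or 2s, and how small "small" should be is left as vague as in the text):

<pre>
import random

def influence_12(f, n, trials=20000):
    """Estimate E |f(x)-f(y)|^2, where y is x with one random 1 <-> 2 flip."""
    total = 0.0
    for _ in range(trials):
        x = [random.randint(0, 2) for _ in range(n)]
        flippable = [i for i, c in enumerate(x) if c in (1, 2)]
        if not flippable:
            continue                     # y = x in this case, contributing 0
        y = list(x)
        i = random.choice(flippable)
        y[i] = 3 - y[i]                  # flip 1 <-> 2 at position i
        total += (f(tuple(x)) - f(tuple(y))) ** 2
    return total / trials

if __name__ == "__main__":
    # Depends only on where the 0s are, hence 12-low influence:
    g = lambda x: 1.0 if x.count(0) % 2 == 0 else -1.0
    # Parity of the number of 1s: a 1 <-> 2 flip changes its sign, so large influence:
    h = lambda x: 1.0 if x.count(1) % 2 == 0 else -1.0
    print(influence_12(g, 12), influence_12(h, 12))
</pre>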


The following claim is, technically, not true, but is a useful first approximation to the truth:


:'''Lemma 1''' (A lie)  Let <math>f: [3]^n \to [-1,1]</math>.  Then the function <math>F(x) := {\Bbb E}_{\ell: \ell(1)=x} f(\ell(2))</math> has 12-low influence, as does the function <math>G(x) := {\Bbb E}_{\ell: \ell(2) = x} f(\ell(1))</math>.
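
To unpack what the function F in Lemma 1 is (not why the lemma fails), note that a line <math>\ell</math> with <math>\ell(1) = x</math> is obtained from x by declaring some nonempty set of the 1-coordinates of x to be wildcards, and then <math>\ell(2)</math> is x with those coordinates changed to 2s.  Here is a Monte Carlo sketch of F along these lines (illustrative only, with an ad hoc distribution over such lines):

<pre>
import random

def project_12(f, x, max_wildcards=3, samples=2000):
    """Sketch of F(x) = E_{ell : ell(1) = x} f(ell(2)): average f over points obtained
    from x by changing a few randomly chosen 1s of x into 2s."""
    ones = [i for i, c in enumerate(x) if c == 1]
    if not ones:
        return None                      # no combinatorial line has ell(1) = x
    total = 0.0
    for _ in range(samples):
        k = random.randint(1, min(max_wildcards, len(ones)))
        y = list(x)
        for i in random.sample(ones, k): # these positions form the wildcard set
            y[i] = 2                     # the resulting point is ell(2)
        total += f(tuple(y))
    return total / samples

if __name__ == "__main__":
    f = lambda y: 1.0 if y.count(0) % 2 == 0 else -1.0   # a 12-low influence f
    print(project_12(f, (1, 0, 1, 2, 1, 0, 1, 2, 1, 0)))
</pre>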


Let us say that a function <math>f: [3]^n \to [-1,1]</math> is ''12-uniform'' if <math>{\Bbb E} g(\ell(1)) f(\ell(2))</math> is small for all <math>g: [3]^n \to [-1,1]</math>, where <math>\ell</math> is again ranging over Lines(n).  Assuming Lemma 1, we see that 12-uniform functions are basically orthogonal to 12-low influence functions.  In fact any function can essentially be orthogonally decomposed into 12-uniform and 12-low influence components (see the [[Fourier-analytic proof of Sperner]] for some more discussion).


We can also define 12-uniformity and 12-low influence for functions on Lines(n) rather than <math>[3]^n</math>, where now <math>\ell</math> ranges over lines in Lines(n) (i.e. over elements of Planes(n)).  Similarly for functions on Planes(n), Spaces(n), etc.


== 01-uniformity relative to 12-low influence ==


We can define 01-uniformity in exact analogy to 12-uniformity; a function f is 01-uniform if <math>{\Bbb E} g(\ell(0)) f(\ell(1))</math> is small for all <math>g: [3]^n \to [-1,1]</math>.  We need a more sophisticated notion (analogous to <math>U^2({\Bbb Z}/N{\Bbb Z})</math> Gowers uniformity): a function f is 01-uniform relative to 12-low influence if <math>{\Bbb E} g(\ell(0)) f(\ell(1)) h(\ell)</math> is small for all <math>g: [3]^n \to [-1,1]</math> and all 12-low influence <math>h: Lines(n) \to [-1,1]</math>.


'''Example''' Random functions are 01-uniform relative to 12-low influence.
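
In the same naive numerical spirit, the sketch below estimates <math>{\Bbb E} g(\ell(0)) f(\ell(1)) h(\ell)</math> with f and g pseudorandom <math>\pm 1</math> functions and with h an ad hoc 12-low influence function of a line (a function of its wildcard count only); for moderately large n one expects an answer close to zero.  All names and parameters here are illustrative choices, not part of the argument.

<pre>
import hashlib
import random

def pseudorandom_sign(x, salt):
    """A deterministic pseudorandom +-1 'function' on [3]^n, keyed by salt."""
    return 1.0 if hashlib.md5((salt + repr(x)).encode()).digest()[0] % 2 == 0 else -1.0

def random_line(n, max_wildcards=3):
    k = random.randint(1, max_wildcards)
    wild = set(random.sample(range(n), k))
    return tuple('*' if i in wild else random.randint(0, 2) for i in range(n))

def line_point(template, j):
    return tuple(j if c == '*' else c for c in template)

def relative_uniformity_average(n, trials=20000):
    """Estimate E g(ell(0)) f(ell(1)) h(ell) for pseudorandom f, g and a weight h
    depending only on the number of wildcards of ell (hence 12-low influence)."""
    total = 0.0
    for _ in range(trials):
        ell = random_line(n)
        h = 1.0 if ell.count('*') % 2 == 0 else -1.0
        total += (pseudorandom_sign(line_point(ell, 0), "g")
                  * pseudorandom_sign(line_point(ell, 1), "f") * h)
    return total / trials

if __name__ == "__main__":
    print(relative_uniformity_average(n=12))
</pre>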


:'''Lemma 2''' (von Neumann theorem)  If <math>f: [3]^n \to [-1,1]</math> is 01-uniform relative to 12-low influence and <math>g, h: [3]^n \to [-1,1]</math>, then <math>{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))</math> is small.


'''Proof''': An "IP van der Corput lemma" allows one to ensure that <math>{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))</math> is small if we can show that


:<math>{\Bbb E} f(\pi(01)) g(\pi(02)) f(\pi(11)) g(\pi(22))</math> (1)


is small, where <math>\pi</math> ranges over Planes(n).  (Rough sketch of proof: we can rearrange <math>{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))</math> as


:<math>{\Bbb E}_{\phi,i} h(\phi(0^r)) f(\phi(1^i 0^{r-i})) g(\phi(2^i 0^{r-i}))</math>


where r is a medium size number, <math>\phi: [3]^r \to [3]^n</math> ranges over r-dimensional subspaces with few wildcards, and i ranges from 1 to r.  Using Cauchy-Schwarz in i, we reduce to bounding


:<math>{\Bbb E}_{\phi,i,j} f(\phi(1^i 0^{r-i})) g(\phi(2^i 0^{r-i})) f(\phi(1^j 0^{r-j})) g(\phi(2^j 0^{r-j}))</math>


which can be rearranged to give (1)).   
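
(For the record, the Cauchy-Schwarz step in the sketch above is the standard one: abbreviating <math>F_i(\phi) := f(\phi(1^i 0^{r-i})) g(\phi(2^i 0^{r-i}))</math> and using <math>|h| \leq 1</math>, one has

:<math>|{\Bbb E}_{\phi,i} h(\phi(0^r)) F_i(\phi)| \leq {\Bbb E}_{\phi} |{\Bbb E}_i F_i(\phi)| \leq ({\Bbb E}_{\phi} |{\Bbb E}_i F_i(\phi)|^2)^{1/2} = ({\Bbb E}_{\phi,i,j} F_i(\phi) F_j(\phi))^{1/2},</math>

so the original average is small whenever the final expression is.)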


We can rearrange (1) further as
 
:<math>{\Bbb E} F(\pi(1)) G(\pi(2))</math>
 
where <math>\pi(1), \pi(2)</math> are thought of as lines and <math>F(\ell) := f(\ell(0)) f(\ell(1))</math> and <math>G(\ell) := g(\ell(0)) g(\ell(2))</math>.  But by Lemma 1, we can rewrite this as
 
:<math>{\Bbb E} F(\ell) \overline{G}(\ell)</math>
 
where <math>\overline{G}</math> is 12-low influence.  But as f is 01-uniform relative to 12-low influence, this expression is small.  <math>\Box</math>
 
Thus, when dealing with expressions of the form
 
:<math>{\Bbb E} h(\ell(0)) f(\ell(1)) g(\ell(2))</math>,
 
functions which are 01-uniform relative to 12-low influence can be discarded from the second factor.  For similar reasons, functions which are 20-uniform relative to 12-low influence (defined in the obvious manner) can be discarded from the third factor.
 
== Obstructions to uniformity ==
 
To continue, we need to understand what the obstructions are to 01-uniformity relative to 12-low influence.  The answer turns out to be 01-almost periodicity relative to 12-low influence.  Roughly speaking, a function f is 01-almost periodic relative to 12-low influence if any shift from 0s to 1s distorts f in a manner which can be described by 12-low influence functions.  More precisely, there exists a bounded number of functions <math>g_h: [3]^n \to [-1,1]</math> and 12-low influence functions <math>c_h: Lines(n) \to [-1,1]</math> for <math>h = 1,\ldots,H</math> such that the equation
 
:<math>f(\ell(1)) = {\Bbb E}_h c_h(\ell) g_h(\ell(0))</math>
 
is approximately true for most lines <math>\ell</math> in Lines(n).
 
'''Example 1''' If f is 01-low influence, then <math>f(\ell(1)) \approx f(\ell(0))</math>, so f is certainly 01-almost periodic relative to 12-low influence.
 
'''Example 2''' If <math>f(x) \in \{-1,+1\}</math> is the parity of the number of 1s in x, then <math>f(\ell(1)) = c(\ell) f(\ell(0))</math>, where <math>c(\ell)</math> depends only on the number of wildcards in <math>\ell</math> and is in particular 12-low influence.  Thus this function is also 01-almost periodic relative to 12-low influence.
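
This identity is easy to check by machine; here is a minimal sketch (illustrative only), using the explicit choice <math>c(\ell) = (-1)^{w(\ell)}</math>, where <math>w(\ell)</math> is the number of wildcards of <math>\ell</math>.

<pre>
import random

def line_point(template, j):
    return tuple(j if c == '*' else c for c in template)

def random_line(n, max_wildcards=4):
    k = random.randint(1, max_wildcards)
    wild = set(random.sample(range(n), k))
    return tuple('*' if i in wild else random.randint(0, 2) for i in range(n))

f = lambda x: (-1) ** x.count(1)          # parity of the number of 1s, as a +-1 value
c = lambda ell: (-1) ** ell.count('*')    # depends only on the number of wildcards

# Check f(ell(1)) == c(ell) * f(ell(0)) on a batch of random lines.
for _ in range(1000):
    ell = random_line(10)
    assert f(line_point(ell, 1)) == c(ell) * f(line_point(ell, 0))
print("identity verified on 1000 random lines")
</pre>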
 
'''Example 3''' If <math>f: [3]^n \to [-1,1]</math> is 12-low influence, then trivially <math>f(\ell(1)) = c(\ell)</math> where <math>c(\ell) := f(\ell(1))</math>.  Since c is 12-low influence, we conclude that f is 01-almost periodic relative to 12-low influence.
 
'''Example 4''' Any bounded linear combination of 01-almost periodic functions relative to 12-low influence is also 01-almost periodic relative to 12-low influence; also, any product of 01-almost periodic functions relative to 12-low influence is also 01-almost periodic relative to 12-low influence.
 
These almost periodic functions obstruct uniformity:
 
'''Lemma 3''': If <math>f: [3]^n \to [-1,1]</math> is 01-uniform relative to 12-low influence, and <math>g: [3]^n \to [-1,1]</math> is 01-almost periodic relative to 12-low influence, then f and g have small inner product.
 
'''Proof''': The inner product of f and g can be expressed as
 
:<math> {\Bbb E} f(\ell(1)) g(\ell(1))</math>
 
where <math>\ell</math> ranges over Lines(n).  On the other hand, from the almost periodicity of g we can write
 
:<math>g(\ell(1)) = {\Bbb E}_h c_h(\ell) G_h(\ell(0))</math>
 
for some 12-low influence <math>c_h</math> and some bounded <math>G_h</math>.  So we can reexpress the inner product as
 
:<math> {\Bbb E}_h {\Bbb E} f(\ell(1)) G_h(\ell(0)) c_h(\ell).</math>
 
But as f is 01-uniform relative to 12-low influence, the inner expectation is small, and the claim follows.  <math>\Box</math>
 
Conversely, these are the only obstructions to uniformity:
 
'''Lemma 4''': If <math>f: [3]^n \to [-1,1]</math> is not 01-uniform relative to 12-low influence, then there exists a <math>g: [3]^n \to [-1,1]</math> which is 01-almost periodic relative to 12-low influence, such that f and g have large inner product.
 
'''Proof''': By hypothesis, we can find a 12-low influence c and a bounded G such that
 
:<math>{\Bbb E} f(\ell(1)) G(\ell(0)) c(\ell)</math>
 
is large.  We then take
 
:<math> g(x) := {\Bbb E}_{\ell: \ell(1) = x} G(\ell(0)) c(\ell)</math>.
 
Clearly f correlates with g.
 
If the expression inside the expectation were constant, then we would have
:<math> g(\ell(1)) = G(\ell(0)) c(\ell)</math>
and g would be 01-almost periodic relative to 12-low influence.  Of course, the expression here is not necessarily constant, but if it oscillates too much, then g would be very small and the claim would be easy.  I think we can use some sort of statistical sampling argument to handle the general case (I did this in [http://front.math.ucdavis.edu/math.CO/0405251 my finitisation of Furstenberg's proof of Szemeredi's theorem]); I'll come back to this point later.  <math>\Box</math>
 
In view of this lemma, I believe we have
 
'''Corollary 5''': Any function <math>f: [3]^n \to [0,1]</math> can be decomposed into a function <math>f_{01}: [3]^n \to [0,1]</math> with the same mean as f which is 01-almost periodic relative to 12-low influence, plus an error <math>f-f_{01}</math> which is 01-uniform relative to 12-low influence.
 
We have a similar decomposition <math>f = f_{20} + (f-f_{20})</math> into a function <math>f_{20}: [3]^n \to [0,1]</math> with the same mean as f which is 20-almost periodic relative to 12-low influence, plus an error <math>f-f_{20}</math> which is 20-uniform relative to 12-low influence.  Using the von Neumann theorem, we can thus replace
 
:<math>{\Bbb E} f(\ell(0)) f(\ell(1)) f(\ell(2))</math>
 
by
 
:<math>{\Bbb E} f(\ell(0)) f_{01}(\ell(1)) f_{20}(\ell(2))</math>.  (2)
 
== Conclusion of proof ==
 
Let <math>f_{12}</math> be the 12-low influence component of f; then <math>f_{12}</math> is non-negative, has large density, and correlates with f.  If we then let E be the set where <math>f_{12}</math> is large, then E is also 12-low influence, and f is large on E.
 
By [[DHJ(2.5)]], the 12-low influence set E contains large combinatorial subspaces.  Passing to such a subspace, we may assume that E is everything, i.e. <math>f_{12}</math> is large everywhere.  To put it another way, this implies that f has large density inside any 12-low influence set.
 
Having done this, we now perform the reductions in the previous section to get to the expression (2).  From the almost periodicity of <math>f_{01}, f_{20}</math>, we have
 
:<math>f_{01}(\ell(1)) = {\Bbb E}_h c_h(\ell) g_{01,h}(\ell(0))</math> (3)
 
where <math>c_h</math> is 12-low influence, and similarly
 
:<math>f_{20}(\ell(2)) = {\Bbb E}_h c'_h(\ell) g_{20,h}(\ell(0))</math> (4)
where <math>c'_h</math> is 12-low influence.
 
Direct substitution of (3) and (4) into (2) does not quite work; we have to do something a bit sneakier.  Let r be a medium size number and consider an r-dimensional subspace <math>\phi: [3]^r \to [3]^n</math>.  By the [[Graham-Rothschild theorem]], if r is large enough, one can find a three-dimensional subspace <math>\sigma: [3]^3 \to [3]^n</math> of <math>\phi</math> on which <math>c_h, c'_h</math> are essentially constant.  Call such a subspace ''monochromatic''; thus the collection of monochromatic 3-dimensional subspaces has positive density in Spaces(n).  Furthermore, since this collection is defined using the 12-low influence functions <math>c_h, c'_h</math>, the collection of 3-dimensional monochromatic spaces is itself 12-low influence.
 
We now rewrite (2) as
 
:<math>{\Bbb E} f(\sigma(012)) f_{01}(\sigma(112)) f_{20}(\sigma(212))</math>
 
where <math>\sigma</math> ranges over Spaces(n).
 
Suppose that <math>\sigma</math> is monochromatic.  Then from (3) we have
:<math>f_{01}(\sigma(112)) = {\Bbb E}_h c_h(\sigma(**2)) g_{01,h}(\sigma(002))</math>
and
:<math>f_{01}(\sigma(012)) = {\Bbb E}_h c_h(\sigma(0*2)) g_{01,h}(\sigma(002))</math>
and hence by monochromaticity
:<math>f_{01}(\sigma(112)) = f_{01}(\sigma(012))</math>
and similarly
:<math>f_{20}(\sigma(212)) = f_{20}(\sigma(012))</math>.
Thus we can bound (2) from below by
:<math>{\Bbb E} 1_{mono}(\sigma) [f f_{01} f_{20}](\sigma(012))</math>
where <math>1_{mono}</math> is the indicator function on the class of monochromatic spaces.
 
Now, we claim that <math>f_{01}</math> is large on almost all of the support of f.  Indeed, if there were a large set in the support of f on which <math>f_{01}</math> was small, then, denoting by f' the restriction of f to that set, <math>{\Bbb E} f'</math> would be large but <math> {\Bbb E} f' f_{01}</math> would be small.  But if <math>f'_{01}</math> denotes the component of f' which is 01-almost periodic relative to 12-low influence, then <math>f'_{01}</math> is dominated by <math>f_{01}</math> (since f' is dominated by f), and so <math> {\Bbb E} f' f'_{01}</math> would be small.  But by orthogonality of <math>f'_{01}</math> and <math>f'-f'_{01}</math>, this is equal to <math> {\Bbb E} (f'_{01})^2 </math>.  Since <math>f'_{01}</math> has the same mean as f', which is large, this is a contradiction.  Thus <math>f_{01}</math> is large on almost all the support of f, and similarly for <math>f_{20}</math>.  Thus we can bound (2) from below by some multiple of
 
:<math>{\Bbb E} 1_{mono}(\sigma) 1_{supp(f)}(\sigma(012))</math>.
 
But because f has positive density inside any 12-low influence set, this expression is large, and we are done.
