Fourier-analytic proof of Sperner

From Polymath1Wiki

Jump to: navigation, search


Fourier analysis

We identify [2]n with the power set 2[n] of [n]. Given any function f: 2^{[n]} \to {\Bbb R}, we define the Fourier transform \hat f: 2^{[n]} \to {\Bbb R} by the formula

\hat f(A) := {\Bbb E}_{X \in 2^{[n]}} (-1)^{|A \cap X|} f(X);

we have the Plancherel formula

\sum_{A \in 2^{[n]}} |\hat f(A)|^2 = {\Bbb E}_{X \in 2^{[n]}} |f(X)|^2 (1)

and the inversion formula

f(X) = \sum_{A \in 2^{[n]}} (-1)^{|A \cap X|} \hat f(A). (2)

For any scale 1 < r < n, we consider the bilinear form

 B_r( f, g ) := {\Bbb E} f(X) g(Y) (3)

where X is drawn randomly from 2[n], and Y is formed from X by setting j \in Y whenever j \in X, and when j \not \in X, setting j \in Y with probability r / n. To avoid significant distortion in the measure, we will be working in the regime 1 \leq r \ll \sqrt{n}. Note that if Br(1A,1A) is large, and r is not too small, then A will contain many combinatorial lines (since Y will be strictly larger than X with high probability); thus it is of interest to get lower bounds on Br.

We also define the influence Inf(f) as

 Inf(f) := {\Bbb E} |f(X) - f(X')|^2 (4)

where X' is formed from X by randomly flipping one bit (from 0 to 1, or vice versa). In Fourier space, we have

 Inf(f) = 4\sum_{A \in 2^{[n]}} \frac{|A|}{n} |\hat f(A)|^2. (5)

Dichotomy between structure and randomness

For any f: 2^{[n]} \to {\Bbb R}, define the uniformity norm \|f\|_{U(r)} at scale r by the formula

\|f\|_{U(r)} := \sup_g \max( |B_r(f,g)|, |B_r(g,f)| ) (6)

where g ranges over all functions g: 2^{[n]} \to [-1,1]. Functions which are small in the uniform norm are thus "negligible" for the purposes of computing Br and can be discarded. (This is analogous to the theory of counting arithmetic progressions of length three, in which functions with small Fourier coefficients (or small Gowers U2 norm) can be discarded.)

Lemma 1 Let f: 2^{[n]} \to [-1,1] be such that \|f\|_{U(r)} \geq \eta, where 1 \leq r \ll \sqrt{n} and η > 0. Then there exists a function F: 2^{[n]} \to [-1,1] of influence Inf(F) \ll_\eta \frac{1}{r} such that
 \langle f, F \rangle := {\Bbb E}_{X \in 2^{[n]}} f(X) F(X) \gg_\eta 1.

It seems that this is more or less proven in Ryan's notes. (May be worth redoing it here on the wiki. Note that the scale r here basically corresponds to εn in Ryan's notes.)

Structure theorem

Now we use the general "energy increment" argument (see e.g. Terry's paper on this) to obtain a structural theorem.

Corollary 2 (Koopman-von Neumann type theorem) Let f: 2^{[n]} \to [0,1] and 0 < η < 1. Then there exists k = k(η) such that whenever \sqrt{n} \gg r_k > \ldots > r_1 > 1 be a sequence of scales. Then there exists 1 \leq i \leq k and a decomposition
f = f_{U^\perp} + f_U (7)
where f_{U^\perp}, f_U take values in [-1,1],
Inf( f_{U^\perp} ) \ll_{\eta} 1/r_i (8)
\| f_U \|_{U(r_{i+1})} \leq \eta. (9)
Furthermore, f_{U^\perp} is non-negative and
 {\Bbb E} f = {\Bbb E} f_{U^\perp}. (10)

Proof (sketch) It's likely that there is now a slick proof of this type of thing using the duality arguments of Gowers or of Reingold-Trevisan-Tulsiani-Vadhan. But here is the "old school" approach from my long AP paper with Ben, running the energy increment algorithm:

  1. Initialise i=1, and let {\mathcal B} be the trivial partition of [2]n.
  2. Let f_{U^\perp} := {\Bbb E}(f|{\mathcal B}) be the conditional expectation of f with respect to the partition; initially, this is just the constant function {\Bbb E} f, but becomes more complicated as the partition gets finer. Set f_U := f - f_{U^\perp}
  3. If \| f_U \|_{U(r_{i+1})} \leq \eta then STOP.
  4. Otherwise, we apply Lemma 1 to find a function F of influence Oη(1 / ri + 1) that fU correlates with. Partition F into level sets (discretising in multiples of η or so, randomly shifting the cut-points to avoid edge effects) and use this to augment the partition {\mathcal B}. In doing so, the energy {\Bbb E} |f_{U^\perp}|^2 increases by some non-negligible amount c(η) > 0, thanks to Pythagoras' theorem: this is why this process will stop before k steps.
  5. Now return to Step 1.

A certain tedious amount of verification is needed to show that all the hypotheses are satisfied. One technical step is that one needs to use the Weierstrass approximation theorem to approximate f_{U^\perp} by a polynomial combination of low-influence functions, and one needs to show that polynomial combinations of low-influence functions are low-influence. The complexity of the polynomial is controlled by some function of η and this explains the losses in that parameter in (8). \Box


Now we prove DHJ(2). Let f = 1A have density δ. We let η be a small number depending on δ. Now we pick some scales 1 \leq r_1 < \ldots < r_k between 1 and \sqrt{n} (the exact choices are not so important, so long as the scales are widely separated), and apply the above corollary, to decompose f = f_{U^\perp} + f_U, where fU is uniform at some scale ri and f_{U^\perp} is low-influence at a coarser scale ri + 1.

We then look at B_{r_i}(f, f). Decomposing into four pieces and using the uniformity of fU, this expression is equal to B_{r_i}(f_{U^\perp}, f_{U^\perp}) + O(\eta). But the low influence of f_{U^\perp} means that when one randomly flips a bit from a 0 to a 1, f_{U^\perp} changes by Oη(1 / ri + 1) on the average. Iterating this ri times, we see that B_{r_i}(f_{U^\perp}, f_{U^\perp}) = {\Bbb E}( f_{U^\perp}^2 ) + O_\eta( r_i / r_{i+1} ). But f_{U^\perp} has density δ, and so the main term is at least δ2. Choosing η small enough and the ri widely separated enough we obtain a nontrivial lower bound on B_{r_i}(f_{U^\perp}, f_{U^\perp}), which gives DHJ(2).

Personal tools