A second Fourier decomposition related to Sperner's theorem


Introduction

It seems as though it should be possible to give a clean "purely Fourier" proof of Sperner's theorem that exploits positivity, if one uses equal-slices measure. Here we present a Fourier decomposition that should achieve this, but we have not yet managed to prove the positivity (which could turn out to be a hard problem; at this stage it is not clear).

We use Ryan's formulation of equal-slices measure on [math]\displaystyle{ [2]^n }[/math]: the density of a set [math]\displaystyle{ \mathcal{A} }[/math] is [math]\displaystyle{ \mathbb{E}_p\mu_p(\mathcal{A}), }[/math] where [math]\displaystyle{ p }[/math] is chosen uniformly from [math]\displaystyle{ [0,1] }[/math] and [math]\displaystyle{ \mu_p }[/math] is the standard p-weighted measure on the cube: if [math]\displaystyle{ A }[/math] is a subset of [math]\displaystyle{ [n], }[/math] then [math]\displaystyle{ \mu_p(A)=p^{|A|}(1-p)^{n-|A|}. }[/math] A useful way of thinking about [math]\displaystyle{ \mu_p }[/math] is that [math]\displaystyle{ \mu_p(\mathcal{A}) }[/math] is the probability that [math]\displaystyle{ (X_1,\dots,X_n)\in\mathcal{A} }[/math] when the [math]\displaystyle{ X_i }[/math] are independent Bernoulli random variables with probability [math]\displaystyle{ p }[/math] of equalling 1.
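As a concrete illustration (our addition, not part of the original argument), here is a minimal Monte Carlo sketch in Python of the equal-slices density; the helper name equal_slices_density is hypothetical. It checks the standard fact that equal-slices gives each of the [math]\displaystyle{ n+1 }[/math] weight-levels of the cube the same probability.

 import random
 from itertools import combinations
 
 def equal_slices_density(A, n, trials=100000):
     """Estimate the equal-slices density E_p mu_p(A): draw p uniformly
     from [0,1], then draw x in {0,1}^n with i.i.d. Bernoulli(p)
     coordinates, and count how often x lands in A."""
     hits = 0
     for _ in range(trials):
         p = random.random()
         x = tuple(int(random.random() < p) for _ in range(n))
         hits += x in A
     return hits / trials
 
 # Example: A = the weight-2 slice of {0,1}^4.  Equal-slices gives each
 # of the n+1 = 5 weights probability 1/5, so the answer should be ~0.2.
 n = 4
 A = {tuple(int(i in S) for i in range(n)) for S in combinations(range(n), 2)}
 print(equal_slices_density(A, n))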

We also need to define a measure on the space of all combinatorial lines. The natural measure seems to be the following. Fix [math]\displaystyle{ 0\leq p\leq q\leq 1 }[/math] and define two sequences [math]\displaystyle{ (X_1,\dots,X_n) }[/math] and [math]\displaystyle{ (Y_1,\dots,Y_n) }[/math] of Bernoulli random variables as follows. Each pair [math]\displaystyle{ (X_i,Y_i) }[/math] equals [math]\displaystyle{ (0,0), }[/math] [math]\displaystyle{ (0,1) }[/math] and [math]\displaystyle{ (1,1) }[/math] with probabilities [math]\displaystyle{ 1-q, }[/math] [math]\displaystyle{ q-p }[/math] and [math]\displaystyle{ p, }[/math] respectively, and different pairs [math]\displaystyle{ (X_i,Y_i) }[/math] are independent. Then the [math]\displaystyle{ X_i }[/math] are independent Bernoulli variables with parameter [math]\displaystyle{ p, }[/math] as the [math]\displaystyle{ Y_i }[/math] are with parameter [math]\displaystyle{ q, }[/math] but the coupling guarantees that [math]\displaystyle{ X_i\leq Y_i }[/math] for every [math]\displaystyle{ i, }[/math] so that [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are the two endpoints of a combinatorial line (a degenerate one if [math]\displaystyle{ X=Y }[/math]).
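In code, this line measure can be sampled as follows (again a sketch we add for illustration; the name sample_line_pair is hypothetical).

 import random
 
 def sample_line_pair(n, p, q):
     """Sample (X, Y) as in the text, assuming 0 <= p <= q <= 1: each pair
     (X_i, Y_i) independently equals (1,1), (0,1), (0,0) with probabilities
     p, q - p, 1 - q.  Then X_i ~ Bernoulli(p), Y_i ~ Bernoulli(q), and
     X <= Y coordinatewise, so X and Y span a combinatorial line
     (a degenerate one when X = Y)."""
     X, Y = [], []
     for _ in range(n):
         u = random.random()
         if u < p:                      # probability p: (1, 1)
             X.append(1); Y.append(1)
         elif u < q:                    # probability q - p: (0, 1)
             X.append(0); Y.append(1)
         else:                          # probability 1 - q: (0, 0)
             X.append(0); Y.append(0)
     return X, Y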

We shall now be interested in the quantity [math]\displaystyle{ \mathbb{E}f(X)f(Y), }[/math] where [math]\displaystyle{ f }[/math] is some real-valued function on [math]\displaystyle{ [2]^n }[/math] and [math]\displaystyle{ X=(X_1,\dots,X_n), Y=(Y_1,\dots,Y_n). }[/math] But ultimately what will interest us is the average of this quantity over all pairs [math]\displaystyle{ (p,q) }[/math] with [math]\displaystyle{ 0\leq p\leq q\leq 1, }[/math] which we can write as [math]\displaystyle{ \mathbb{E}_{p\leq q}\mathbb{E}f(X_{p,q})f(Y_{p,q}). }[/math]

A proof in physical space

Let us first prove positivity in physical space. To do this, we shall define the following equivalent model for the joint distribution of [math]\displaystyle{ X_{p,q} }[/math] and [math]\displaystyle{ Y_{p,q}. }[/math] We first pick a point [math]\displaystyle{ t=(t_1,\dots,t_n) }[/math] uniformly at random from [math]\displaystyle{ [0,1]^n, }[/math] and then we let [math]\displaystyle{ (X_{p,q})_i }[/math] be [math]\displaystyle{ 0 }[/math] if [math]\displaystyle{ p\leq t_i }[/math] and [math]\displaystyle{ 1 }[/math] if [math]\displaystyle{ p\gt t_i. }[/math] Similarly, we let [math]\displaystyle{ (Y_{p,q})_i }[/math] be 0 if [math]\displaystyle{ q\leq t_i }[/math] and 1 if [math]\displaystyle{ q\gt t_i. }[/math] (One checks that each pair [math]\displaystyle{ ((X_{p,q})_i,(Y_{p,q})_i) }[/math] has exactly the joint distribution defined above.) This model has the nice property that it makes perfectly good sense even if [math]\displaystyle{ p\gt q; }[/math] moreover, swapping [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] simply swaps [math]\displaystyle{ X_{p,q} }[/math] and [math]\displaystyle{ Y_{p,q}, }[/math] so we have the equation [math]\displaystyle{ \mathbb{E}_{p\leq q}\mathbb{E}f(X_{p,q})f(Y_{p,q})=\mathbb{E}_{t}\mathbb{E}_{p,q}f(X_{p,q})f(Y_{p,q}), }[/math] where on the right-hand side [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] are independent and uniform in [math]\displaystyle{ [0,1]. }[/math]

Now [math]\displaystyle{ X_{p,q} }[/math] is just a function of [math]\displaystyle{ p }[/math] and [math]\displaystyle{ t; }[/math] let us write it as [math]\displaystyle{ X_p(t). }[/math] Then [math]\displaystyle{ Y_{p,q}=X_q(t) }[/math] is the same function applied to [math]\displaystyle{ q }[/math] and [math]\displaystyle{ t. }[/math] Since [math]\displaystyle{ p }[/math] and [math]\displaystyle{ q }[/math] are independent, the right-hand side is equal to [math]\displaystyle{ \mathbb{E}_{t}(\mathbb{E}_pf(X_p(t)))^2, }[/math] which is nonnegative.
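This identity is easy to test numerically. The following Monte Carlo sketch (our own addition; the names X_of and line_density_two_ways are hypothetical) estimates both sides; they should agree up to sampling error, and the right-hand side is manifestly an average of squares.

 import random
 
 def X_of(p, t):
     """The coupled model: coordinate i of X_p(t) is 1 exactly when t_i < p."""
     return tuple(int(ti < p) for ti in t)
 
 def line_density_two_ways(f, n, trials=20000, p_samples=200):
     """Estimate E_{p<=q} E f(X_{p,q}) f(Y_{p,q}) and E_t (E_p f(X_p(t)))^2."""
     # Left-hand side: sample t, then a uniform pair p <= q.
     lhs = 0.0
     for _ in range(trials):
         t = [random.random() for _ in range(n)]
         p, q = sorted(random.random() for _ in range(2))
         lhs += f(X_of(p, t)) * f(X_of(q, t))
     lhs /= trials
 
     # Right-hand side: sample t, estimate the inner expectation over p
     # by a nested average, and square it (slightly biased, but close).
     outer = trials // 20
     rhs = 0.0
     for _ in range(outer):
         t = [random.random() for _ in range(n)]
         inner = sum(f(X_of(random.random(), t)) for _ in range(p_samples)) / p_samples
         rhs += inner ** 2
     rhs /= outer
     return lhs, rhs
 
 # Example: f the indicator of the weight-1 slice of {0,1}^3.
 f = lambda x: float(sum(x) == 1)
 print(line_density_two_ways(f, 3))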

We can say more. By Cauchy-Schwarz, the quantity [math]\displaystyle{ \mathbb{E}_{t}(\mathbb{E}_pf(X_p(t)))^2 }[/math] is at least [math]\displaystyle{ (\mathbb{E}_{t}\mathbb{E}_pf(X_p(t)))^2, }[/math] which is the square of the equal-slices integral of [math]\displaystyle{ f }[/math] (since, for a uniform random [math]\displaystyle{ t, }[/math] the point [math]\displaystyle{ X_p(t) }[/math] is distributed according to [math]\displaystyle{ \mu_p }[/math]). In particular, if [math]\displaystyle{ f }[/math] is the characteristic function of a set [math]\displaystyle{ \mathcal{A}, }[/math] then the combinatorial-line density is at least the square of the equal-slices density of [math]\displaystyle{ \mathcal{A}. }[/math]
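Continuing the sketch above (and reusing line_density_two_ways from it), one can watch this lower bound in action: the weight-2 slice of [math]\displaystyle{ \{0,1\}^4 }[/math] is an antichain with equal-slices density [math]\displaystyle{ 1/5, }[/math] so the estimated line density should come out at least [math]\displaystyle{ (1/5)^2=0.04 }[/math] (only degenerate lines contribute here).

 # Reusing line_density_two_ways from the sketch above.
 f = lambda x: float(sum(x) == 2)
 print(line_density_two_ways(f, 4), (1 / 5) ** 2)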

A Fourier translation of the problem

Recall that the Walsh functions on the set [math]\displaystyle{ [2]^n }[/math] are defined as follows. For each subset [math]\displaystyle{ A\subset[n] }[/math] we define [math]\displaystyle{ w_A(B) }[/math] to be [math]\displaystyle{ (-1)^{|A\cap B|}. }[/math] They can also be conveniently defined in sequence terms: for each A and each [math]\displaystyle{ x\in\{0,1\}^n }[/math] we define [math]\displaystyle{ w_A(x)=\prod_{i\in A}(-1)^{x_i}. }[/math]

It can easily be checked that the Walsh functions form an orthonormal basis of the [math]\displaystyle{ 2^n }[/math]-dimensional Euclidean space of all functions from [math]\displaystyle{ \{0,1\}^n }[/math] to [math]\displaystyle{ \mathbb{R}, }[/math] if we take as our inner product the expression [math]\displaystyle{ \langle f,g\rangle =\mathbb{E}_xf(x)g(x). }[/math] However, we shall not always be taking the uniform measure on [math]\displaystyle{ \{0,1\}^n. }[/math] If we take the measure [math]\displaystyle{ \mu_p, }[/math] then the natural inner product is [math]\displaystyle{ \langle f,g\rangle_p=\sum_x\mu_p(x)f(x)g(x), }[/math] which we shall write as [math]\displaystyle{ \mathbb{E}_pf(x)g(x). }[/math] (This can be read as the expectation of [math]\displaystyle{ f(x)g(x) }[/math] if the coordinates of x are independent Bernoulli variables with mean p.)
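To see the failure of orthonormality concretely, here is a short exact computation in Python (our illustration; the names mu_p_point, inner_p and w are hypothetical): for [math]\displaystyle{ p\neq 1/2 }[/math] we already have [math]\displaystyle{ \langle w_{\{i\}},w_{\emptyset}\rangle_p=1-2p\neq 0. }[/math]

 from itertools import product
 
 def mu_p_point(p, x):
     """mu_p of the single point x in {0,1}^n."""
     k = sum(x)
     return p ** k * (1 - p) ** (len(x) - k)
 
 def inner_p(f, g, n, p):
     """<f, g>_p = sum_x mu_p(x) f(x) g(x), summed exactly over {0,1}^n."""
     return sum(mu_p_point(p, x) * f(x) * g(x) for x in product((0, 1), repeat=n))
 
 def w(A, x):
     """The ordinary Walsh function w_A(x) = prod_{i in A} (-1)^{x_i}."""
     s = 1
     for i in A:
         s *= (-1) ** x[i]
     return s
 
 # For p = 0.3 the inner product of w_{0} with the constant function 1
 # is 1 - 2p = 0.4 rather than 0, so orthogonality fails.
 n, p = 3, 0.3
 print(inner_p(lambda x: w((0,), x), lambda x: w((), x), n, p))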

With respect to this inner product, the Walsh functions are no longer orthonormal, but we can define very similar functions that are orthonormal, as follows. First, choose numbers [math]\displaystyle{ a_p }[/math] and [math]\displaystyle{ b_p }[/math] such that [math]\displaystyle{ (1-p)a_p+pb_p=0, }[/math] [math]\displaystyle{ (1-p)a_p^2+pb_p^2=1, }[/math] and [math]\displaystyle{ a_p\geq 0. }[/math] This turns out to imply that [math]\displaystyle{ a_p=(p/(1-p))^{1/2} }[/math] and [math]\displaystyle{ b_p=-((1-p)/p)^{1/2}, }[/math] though for now it is the equations they satisfy that we really care about. Next, define [math]\displaystyle{ w_{i,p}(x) }[/math] to be [math]\displaystyle{ a_p }[/math] if [math]\displaystyle{ x_i=0 }[/math] and [math]\displaystyle{ b_p }[/math] if [math]\displaystyle{ x_i=1. }[/math] Finally, define [math]\displaystyle{ w_{A,p}(x) }[/math] to be [math]\displaystyle{ \prod_{i\in A}w_{i,p}(x). }[/math]
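Reusing inner_p from the previous sketch, one can verify the orthonormality of these biased Walsh functions exactly (again our own illustration; w_biased is a hypothetical name).

 import math
 
 def w_biased(A, p, x):
     """The p-biased Walsh function w_{A,p}(x) = prod_{i in A} w_{i,p}(x),
     where w_{i,p} equals a_p = sqrt(p/(1-p)) on x_i = 0 and
     b_p = -sqrt((1-p)/p) on x_i = 1."""
     a_p = math.sqrt(p / (1 - p))
     b_p = -math.sqrt((1 - p) / p)
     val = 1.0
     for i in A:
         val *= a_p if x[i] == 0 else b_p
     return val
 
 # <w_{A,p}, w_{B,p}>_p should be 1 when A = B and 0 otherwise.
 n, p = 3, 0.3
 A, B = (0, 2), (1,)
 print(inner_p(lambda x: w_biased(A, p, x), lambda x: w_biased(A, p, x), n, p))  # 1.0
 print(inner_p(lambda x: w_biased(A, p, x), lambda x: w_biased(B, p, x), n, p))  # 0.0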

To be continued.