A second Fourier decomposition related to Sperner's theorem

Introduction

It seems as though it should be possible to give a clean "purely Fourier" proof of Sperner's theorem that exploits positivity, if one uses equal-slices measure. Here we present a Fourier decomposition that should achieve this, but we have not yet managed to prove the positivity (which could turn out to be a hard problem---at this stage it is not clear).

We use Ryan's formulation of equal-slices measure on [math]\displaystyle{ [2]^n }[/math]: the density of a set [math]\displaystyle{ \mathcal{A} }[/math] is [math]\displaystyle{ \mathbb{E}_p\mu_p(\mathcal{A}), }[/math] where [math]\displaystyle{ \mu_p }[/math] is the standard p-weighted measure on the cube: if A is a subset of [n] then [math]\displaystyle{ \mu_p(A)=p^{|A|}(1-p)^{n-|A|}. }[/math] A useful way of thinking about [math]\displaystyle{ \mu_p }[/math] is that [math]\displaystyle{ \mu_p(\mathcal{A}) }[/math] is the probability that [math]\displaystyle{ (X_1,\dots,X_n)\in\mathcal{A} }[/math] if the [math]\displaystyle{ X_i }[/math] are independent Bernoulli random variables with probability p of equalling 1.
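To make the definition concrete, here is a minimal numerical sketch (the helper names mu_p and equal_slices_density are illustrative, not from the discussion above) that computes the equal-slices density of a small set by averaging [math]\displaystyle{ \mu_p }[/math] over p.

<syntaxhighlight lang="python">
import numpy as np
from itertools import combinations

def mu_p(points, p):
    # mu_p of a collection of 0/1 vectors: sum over x of p^{#ones}(1-p)^{#zeros}
    pts = np.asarray(points)
    ones = pts.sum(axis=1)
    n = pts.shape[1]
    return float(np.sum(p ** ones * (1 - p) ** (n - ones)))

def equal_slices_density(points, grid=1000):
    # average mu_p over p in [0,1] by a midpoint Riemann sum
    ps = (np.arange(grid) + 0.5) / grid
    return float(np.mean([mu_p(points, p) for p in ps]))

# Example: the middle layer of {0,1}^4 (all strings with exactly two ones).
n = 4
middle = [[1 if i in c else 0 for i in range(n)] for c in combinations(range(n), 2)]
print(equal_slices_density(middle))  # ≈ 0.2, i.e. 1/(n+1)
</syntaxhighlight>

For the middle layer of [math]\displaystyle{ \{0,1\}^4 }[/math] this returns approximately 1/5, reflecting the fact that equal-slices measure gives each of the n+1 slices the same weight.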

We also need to define a measure on the space of all combinatorial lines. The natural measure seems to be this. Fix [math]\displaystyle{ 0\leq p\leq q\leq 1 }[/math] and define two sequences [math]\displaystyle{ (X_1,\dots,X_n) }[/math] and [math]\displaystyle{ (Y_1,\dots,Y_n) }[/math] of Bernoulli random variables as follows. Each pair [math]\displaystyle{ (X_i,Y_i) }[/math] equals (0,0), (0,1) or (1,1) with probabilities 1-q, q-p and p, respectively, and different pairs [math]\displaystyle{ (X_i,Y_i) }[/math] are independent. Then the [math]\displaystyle{ X_i }[/math] are independent Bernoulli with parameter p, as are the [math]\displaystyle{ Y_i }[/math] with parameter q, but they are coupled in a way that guarantees that [math]\displaystyle{ X_i\leq Y_i }[/math] for every i, so that X and Y are two points of a combinatorial line.
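A small sampling sketch of this distribution may be helpful (the function name sample_line_pair is hypothetical): it draws the coordinate pairs independently from the three-outcome distribution described above.

<syntaxhighlight lang="python">
import numpy as np

def sample_line_pair(n, p, q, rng=None):
    # each coordinate pair (X_i, Y_i) is (0,0), (0,1) or (1,1) w.p. 1-q, q-p, p
    rng = rng or np.random.default_rng()
    outcomes = np.array([[0, 0], [0, 1], [1, 1]])
    idx = rng.choice(3, size=n, p=[1 - q, q - p, p])
    xy = outcomes[idx]
    return xy[:, 0], xy[:, 1]  # X is Bernoulli(p), Y is Bernoulli(q), and X_i <= Y_i

x, y = sample_line_pair(8, 0.3, 0.6)
print(x)
print(y)  # y dominates x coordinatewise, so x and y are two points of a combinatorial line
</syntaxhighlight>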

We shall now be interested in the quantity [math]\displaystyle{ \mathbb{E}f(X)f(Y), }[/math] where [math]\displaystyle{ f }[/math] is some function and [math]\displaystyle{ X=(X_1,\dots,X_n), Y=(Y_1,\dots,Y_n). }[/math] But ultimately what will interest us is the average of this quantity over all pairs [math]\displaystyle{ 0\leq p\leq q\leq 1, }[/math] which we can write as [math]\displaystyle{ \mathbb{E}_{p\leq q}\mathbb{E}f(X_{p,q})f(Y_{p,q}). }[/math]

A proof in physical space

Let us first prove positivity in physical space. To do this, we shall define the following equivalent model for the joint distribution of [math]\displaystyle{ X_{p,q} }[/math] and [math]\displaystyle{ Y_{p,q}. }[/math] We first pick a random variable [math]\displaystyle{ T=(t_1,\dots,t_n) }[/math] uniformly from [math]\displaystyle{ [0,1]^n, }[/math] and then we let [math]\displaystyle{ (X_{p,q})_i }[/math] be [math]\displaystyle{ 0 }[/math] if [math]\displaystyle{ p\leq t_i }[/math] and [math]\displaystyle{ 1 }[/math] if [math]\displaystyle{ p\gt t_i. }[/math] Similarly, we let [math]\displaystyle{ (Y_{p,q})_i }[/math] be 0 if [math]\displaystyle{ q\leq t_i }[/math] and 1 if [math]\displaystyle{ q\gt t_i. }[/math] This model has the nice property that it makes perfectly good sense even if [math]\displaystyle{ p\gt q, }[/math] and we therefore have the equation [math]\displaystyle{ \mathbb{E}_{p\leq q}\mathbb{E}f(X_{p,q})f(Y_{p,q})=\mathbb{E}_{t\in T}\mathbb{E}_{p,q}f(X_{p,q})f(Y_{p,q}). }[/math]

Now [math]\displaystyle{ X_{p,q} }[/math] is just a function of p and t, and [math]\displaystyle{ Y_{p,q} }[/math] is the same function applied to q and t. Therefore, it makes sense to change our notation. Let [math]\displaystyle{ (X_{p,t})_i=0 }[/math] if [math]\displaystyle{ p\leq t_i }[/math] and 1 if [math]\displaystyle{ p\gt t_i. }[/math] Then the right-hand side can be rewritten as [math]\displaystyle{ \mathbb{E}_{t\in T}\mathbb{E}_{p,q}f(X_{p,t})f(X_{q,t}). }[/math] Since p and q are independent, this is equal to [math]\displaystyle{ \mathbb{E}_{t\in T}(\mathbb{E}_pf(X_{p,t}))^2, }[/math] which is nonnegative.

We can say more. By Cauchy-Schwarz, the expectation is at least [math]\displaystyle{ (\mathbb{E}_{t\in T}\mathbb{E}_pf(X_{p,t}))^2, }[/math] which is the square of the equal-slices integral of f. In particular, if f is the characteristic function of a set [math]\displaystyle{ \mathcal{A}, }[/math] then the combinatorial-line density is at least the square of the density of [math]\displaystyle{ \mathcal{A}. }[/math]
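The argument above can be checked numerically. The following Monte Carlo sketch (all names are illustrative, and f is just a simple test set) implements the coupling through the threshold vector t and compares [math]\displaystyle{ \mathbb{E}_{t\in T}(\mathbb{E}_pf(X_{p,t}))^2 }[/math] with the square of the equal-slices density of f.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)
n = 6

def f(x):
    # indicator of a simple test set: the points with at least n/2 ones
    return 1.0 if x.sum() >= n / 2 else 0.0

def inner_mean(t, p_grid):
    # E_p f(X_{p,t}) for a fixed threshold vector t, where (X_{p,t})_i = 1 iff p > t_i
    return np.mean([f((p > t).astype(int)) for p in p_grid])

p_grid = (np.arange(200) + 0.5) / 200
samples = [inner_mean(rng.random(n), p_grid) for _ in range(2000)]

line_density = np.mean([s ** 2 for s in samples])  # approximates E_t (E_p f(X_{p,t}))^2
density_sq = np.mean(samples) ** 2                 # square of the equal-slices density of f
print(line_density, density_sq, line_density >= density_sq)  # the last value is True
</syntaxhighlight>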

Equal-slices Fourier analysis

The equal-slices Fourier transform

Recall that the Walsh functions on the set [math]\displaystyle{ [2]^n }[/math] are defined as follows. For each subset [math]\displaystyle{ A\subset[n] }[/math] we define [math]\displaystyle{ w_A(B) }[/math] to be [math]\displaystyle{ (-1)^{|A\cap B|}. }[/math] They can also be conveniently defined in sequence terms: for each A and each [math]\displaystyle{ x\in\{0,1\}^n }[/math] we define [math]\displaystyle{ w_A(x)=\prod_{i\in A}(-1)^{x_i}. }[/math]

It can easily be checked that the Walsh functions form an orthonormal basis of the [math]\displaystyle{ 2^n }[/math]-dimensional Euclidean space of all functions from [math]\displaystyle{ \{0,1\}^n }[/math] to [math]\displaystyle{ \mathbb{R}, }[/math] if we take as our inner product the expression [math]\displaystyle{ \langle f,g\rangle =\mathbb{E}_xf(x)g(x). }[/math] However, we shall not always be taking the uniform measure on [math]\displaystyle{ \{0,1\}^n. }[/math] If we take the measure [math]\displaystyle{ \mu_p, }[/math] then the natural inner product is [math]\displaystyle{ \langle f,g\rangle_p=\sum_x\mu_p(x)f(x)g(x), }[/math] which we shall write as [math]\displaystyle{ \mathbb{E}_pf(x)g(x). }[/math] (This can be read as the expectation of [math]\displaystyle{ f(x)g(x) }[/math] if the coordinates of x are independent Bernoulli variables with mean p.)

With respect to this inner product, the Walsh functions are no longer orthonormal, but we can define very similar functions that are orthonormal, as follows. First, choose numbers [math]\displaystyle{ a_p }[/math] and [math]\displaystyle{ b_p }[/math] such that [math]\displaystyle{ (1-p)a_p+pb_p=0, }[/math] [math]\displaystyle{ (1-p)a_p^2+pb_p^2=1, }[/math] and [math]\displaystyle{ a_p\geq 0. }[/math] This turns out to imply that [math]\displaystyle{ a_p=(p/(1-p))^{1/2} }[/math] and [math]\displaystyle{ b_p=-((1-p)/p)^{1/2}, }[/math] though for now it is the equations they satisfy that we really care about. Next, define [math]\displaystyle{ w_{i,p}(x) }[/math] to be [math]\displaystyle{ a_p }[/math] if [math]\displaystyle{ x_i=0 }[/math] and [math]\displaystyle{ b_p }[/math] if [math]\displaystyle{ x_i=1. }[/math] Finally, define [math]\displaystyle{ w_{A,p}(x) }[/math] to be [math]\displaystyle{ \prod_{i\in A}w_{i,p}(x). }[/math] Note that if [math]\displaystyle{ p=1/2, }[/math] then [math]\displaystyle{ a_p=1, b_p=-1, w_{i,p}(x)=(-1)^{x_i} }[/math] and [math]\displaystyle{ w_{A,p}(x)=w_A(x). }[/math]
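As a sanity check, the following sketch (with illustrative helper names) verifies numerically that the functions [math]\displaystyle{ w_{A,p} }[/math] are orthonormal with respect to [math]\displaystyle{ \mu_p, }[/math] using the values of [math]\displaystyle{ a_p }[/math] and [math]\displaystyle{ b_p }[/math] just computed.

<syntaxhighlight lang="python">
import numpy as np
from itertools import product

def w(A, p, x):
    # the function w_{A,p} evaluated at x, built from a_p and b_p
    a = np.sqrt(p / (1 - p))
    b = -np.sqrt((1 - p) / p)
    return np.prod([a if x[i] == 0 else b for i in A]) if A else 1.0

def inner_p(A, B, p, n):
    # <w_{A,p}, w_{B,p}>_p: sum of mu_p(x) w_{A,p}(x) w_{B,p}(x) over the cube
    total = 0.0
    for x in product([0, 1], repeat=n):
        mu = p ** sum(x) * (1 - p) ** (n - sum(x))
        total += mu * w(A, p, x) * w(B, p, x)
    return total

n, p = 4, 0.3
print(inner_p((0, 2), (0, 2), p, n))  # ≈ 1 (A = B)
print(inner_p((0, 2), (1,), p, n))    # ≈ 0 (A ≠ B)
</syntaxhighlight>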

We now define the p-Fourier transform of a function f to be the function [math]\displaystyle{ \hat{f}_p }[/math] given by the formula [math]\displaystyle{ \hat{f}_p(A)=\langle f,w_{A,p}\rangle. }[/math] We then define the equal-slices Fourier coefficient of f at A to be the function [math]\displaystyle{ p\mapsto\hat{f}_p(A). }[/math] This is unlike a normal Fourier coefficient in that its value is a function rather than a real number, but this turns out to be surprisingly unproblematic for many purposes.

An inversion formula

Obviously one can recover f from its equal-slices Fourier coefficients, since one can recover f from its Walsh coefficients, or indeed its p-Walsh coefficients for any p other than 0 or 1. However, it is more natural to average the results. In other words, since for each p we know that [math]\displaystyle{ f(x)=\sum_A\hat{f}_p(A)w_{A,p}(x), }[/math] we obtain the formula [math]\displaystyle{ f=\mathbb{E}_{p\in[0,1]}\sum_A\hat{f}_p(A)w_{A,p}. }[/math] Reversing the order of summation/integration, we can also split up according to the contribution from each set A, obtaining the formula [math]\displaystyle{ f=\sum_A\int_0^1\hat{f}_p(A)w_{A,p}dp. }[/math]
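Here is a short sketch (illustrative names only) that verifies the fixed-p inversion formula for a random test function; the equal-slices version then follows by averaging over p.

<syntaxhighlight lang="python">
import numpy as np
from itertools import product, combinations

n, p = 3, 0.4
a, b = np.sqrt(p / (1 - p)), -np.sqrt((1 - p) / p)
cube = list(product([0, 1], repeat=n))
mu = {x: p ** sum(x) * (1 - p) ** (n - sum(x)) for x in cube}

def w(A, x):
    return np.prod([a if x[i] == 0 else b for i in A]) if A else 1.0

rng = np.random.default_rng(1)
f = {x: rng.standard_normal() for x in cube}  # an arbitrary test function

subsets = [tuple(c) for r in range(n + 1) for c in combinations(range(n), r)]
fhat = {A: sum(mu[x] * f[x] * w(A, x) for x in cube) for A in subsets}  # \hat f_p(A)

# Reconstruct f pointwise from f(x) = sum_A \hat f_p(A) w_{A,p}(x) and compare.
for x in cube:
    recon = sum(fhat[A] * w(A, x) for A in subsets)
    assert abs(recon - f[x]) < 1e-9
print("inversion verified for p =", p)
</syntaxhighlight>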

A Parseval identity

Next, let us prove a Parseval identity. First, we define the equal-slices inner product of f and g to be [math]\displaystyle{ \mathbb{E}_{p\in[0,1]}\langle f,g\rangle_p. }[/math] It is easy to see that this is indeed an inner product. More generally, if [math]\displaystyle{ f_p }[/math] and [math]\displaystyle{ g_p }[/math] are functions that depend on p, then we define the equal-slices inner product to be [math]\displaystyle{ \mathbb{E}_{p\in[0,1]}\langle f_p,g_p\rangle_p. }[/math] Let us now calculate the equal-slices inner product of [math]\displaystyle{ w_{A,p} }[/math] and [math]\displaystyle{ w_{B,p}. }[/math]

Like the inversion formula, this follows easily from the corresponding fact for fixed p, which is an easy exercise, but we will do it anyway, just to get used to the kinds of arguments used later. That is, we shall prove that for fixed p the functions [math]\displaystyle{ w_{A,p} }[/math] form an orthonormal basis with respect to the measure [math]\displaystyle{ \mu_p. }[/math] To do this, let [math]\displaystyle{ x }[/math] be chosen randomly according to this measure and recall that [math]\displaystyle{ w_{A,p}(x)=\prod_{i\in A}w_{i,p}(x), }[/math] and similarly for [math]\displaystyle{ w_{B,p}(x). }[/math] Therefore, [math]\displaystyle{ w_{A,p}(x)w_{B,p}(x) }[/math] is a product of independent random variables, each of which is either 1, [math]\displaystyle{ w_{i,p}(x) }[/math] or [math]\displaystyle{ w_{i,p}(x)^2, }[/math] according as i belongs to zero, one or two of the sets A and B. Since the expectation of a product of independent random variables equals the product of the expectations, and the expectation of [math]\displaystyle{ w_{i,p}(x) }[/math] is zero (this is the true meaning of the condition [math]\displaystyle{ (1-p)a_p+pb_p=0 }[/math]), the entire expectation is zero unless A=B. And since the expectation of [math]\displaystyle{ w_{i,p}(x)^2 }[/math] is 1 (this is the meaning of the condition [math]\displaystyle{ (1-p)a_p^2+pb_p^2=1 }[/math]), the entire expectation is 1 when A=B.

This proves that [math]\displaystyle{ \langle w_{A,p},w_{B,p}\rangle_p=\delta_{AB}, }[/math] from which it follows that [math]\displaystyle{ \mathbb{E}_{p\in[0,1]}\langle w_{A,p},w_{B,p}\rangle_p=\delta_{AB}. }[/math]

If we now use the inversion formula to write [math]\displaystyle{ f=\sum_A\hat{f}_p(A)w_{A,p} }[/math] and [math]\displaystyle{ g=\sum_B\hat{g}_p(B)w_{B,p}, }[/math] then we find that [math]\displaystyle{ \langle f,g\rangle_p=\sum_{A,B}\delta_{AB}\hat{f}_p(A)\hat{g}_p(B)=\sum_A\hat{f}_p(A)\hat{g}_p(A). }[/math] Therefore, the equal-slices inner product of f and g is equal to [math]\displaystyle{ \sum_A\int_0^1\hat{f}_p(A)\hat{g}_p(A)dp. }[/math] In particular, if f=g then we find that the square of the equal-slices [math]\displaystyle{ L_2 }[/math]-norm of f is equal to [math]\displaystyle{ \sum_A\int_0^1\hat{f}_p(A)^2dp. }[/math]
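The identity can again be checked numerically. The sketch below (names are illustrative) approximates both sides of the equal-slices Parseval identity on a grid of values of p; since the two sides agree for every fixed p, they agree after averaging as well.

<syntaxhighlight lang="python">
import numpy as np
from itertools import product, combinations

n = 3
cube = list(product([0, 1], repeat=n))
subsets = [tuple(c) for r in range(n + 1) for c in combinations(range(n), r)]
rng = np.random.default_rng(2)
f = {x: rng.standard_normal() for x in cube}
g = {x: rng.standard_normal() for x in cube}

def mu(x, p):
    return p ** sum(x) * (1 - p) ** (n - sum(x))

def transform(h, p):
    # p-Fourier transform: \hat h_p(A) = <h, w_{A,p}>_p
    a, b = np.sqrt(p / (1 - p)), -np.sqrt((1 - p) / p)
    w = lambda A, x: np.prod([a if x[i] == 0 else b for i in A]) if A else 1.0
    return {A: sum(mu(x, p) * h[x] * w(A, x) for x in cube) for A in subsets}

def lhs(p):   # <f,g>_p
    return sum(mu(x, p) * f[x] * g[x] for x in cube)

def rhs(p):   # sum_A \hat f_p(A) \hat g_p(A)
    fp, gp = transform(f, p), transform(g, p)
    return sum(fp[A] * gp[A] for A in subsets)

ps = (np.arange(400) + 0.5) / 400
print(np.mean([lhs(p) for p in ps]), np.mean([rhs(p) for p in ps]))  # the two agree
</syntaxhighlight>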

A Fourier expression for the density of subset pairs

We would now like to examine the quantity [math]\displaystyle{ \mathbb{E}_{t\in T}\mathbb{E}_{p,q}f(X_{p,t})f(X_{q,t}) }[/math] on the Fourier side. To do this, we write [math]\displaystyle{ f(X_{p,t}) }[/math] as [math]\displaystyle{ \sum_A\hat{f}_p(A)w_{A,p}(X_{p,t}) }[/math] and we write [math]\displaystyle{ f(X_{q,t}) }[/math] as [math]\displaystyle{ \sum_B\hat{f}_q(B)w_{B,q}(X_{q,t}). }[/math] So for each fixed p and q our problem reduces to calculating [math]\displaystyle{ \mathbb{E}_{t\in T}w_{A,p}(X_{p,t})w_{B,q}(X_{q,t}). }[/math]

This calculation is very similar to the calculations in the proof of Parseval's identity. The random variable [math]\displaystyle{ w_{A,p}(X_{p,t})w_{B,q}(X_{q,t}) }[/math] splits up as a product of functions of the various [math]\displaystyle{ t_i, }[/math] which are independent. If i belongs to A but not to B, then the factor coming from coordinate i is just [math]\displaystyle{ w_{i,p}(X_{p,t}), }[/math] whose expectation is zero (and similarly if i belongs to B but not to A). So the contribution from pairs (A,B) with [math]\displaystyle{ A\ne B }[/math] is zero. If A=B, then each coordinate [math]\displaystyle{ i\in A }[/math] contributes a factor [math]\displaystyle{ w_{i,p}(X_{p,t})w_{i,q}(X_{q,t}), }[/math] so we need to calculate its expectation [math]\displaystyle{ \lambda_{p,q}. }[/math]

This depends on which of p and q is larger. If [math]\displaystyle{ p\lt q }[/math], then we get [math]\displaystyle{ a_pa_q }[/math] with probability 1-q, [math]\displaystyle{ a_pb_q }[/math] with probability q-p, and [math]\displaystyle{ b_pb_q }[/math] with probability p. This gives an expectation of [math]\displaystyle{ (1-q)(\frac{p}{1-p})^{1/2}(\frac{q}{1-q})^{1/2}-(q-p)(\frac{p}{1-p})^{1/2}(\frac{1-q}{q})^{1/2}+p(\frac{1-p}{p})^{1/2}(\frac{1-q}{q})^{1/2}, }[/math] which works out to be [math]\displaystyle{ (\frac{p}{1-p})^{1/2}(\frac{1-q}{q})^{1/2}. }[/math] By symmetry, if [math]\displaystyle{ p\geq q }[/math] then [math]\displaystyle{ \lambda_{p,q}=(\frac{1-p}{p})^{1/2}(\frac{q}{1-q})^{1/2}. }[/math] Therefore, if A=B, then the expectation of [math]\displaystyle{ w_{A,p}(X_{p,t})w_{B,q}(X_{q,t}) }[/math] is [math]\displaystyle{ \lambda_{p,q}^{|A|}, }[/math] by the independence of the [math]\displaystyle{ t_i. }[/math]
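The closed form for [math]\displaystyle{ \lambda_{p,q} }[/math] is easy to check numerically; the following sketch (illustrative names) compares the three-term expectation with the claimed product for a few values of [math]\displaystyle{ p\lt q. }[/math]

<syntaxhighlight lang="python">
import numpy as np

def lam_direct(p, q):
    # E w_{i,p}(X_{p,t}) w_{i,q}(X_{q,t}) via the cases t_i >= q, p <= t_i < q, t_i < p
    a = lambda r: np.sqrt(r / (1 - r))
    b = lambda r: -np.sqrt((1 - r) / r)
    return (1 - q) * a(p) * a(q) + (q - p) * a(p) * b(q) + p * b(p) * b(q)

def lam_closed(p, q):
    # the claimed closed form for p < q
    return np.sqrt(p / (1 - p)) * np.sqrt((1 - q) / q)

for p, q in [(0.1, 0.7), (0.3, 0.4), (0.25, 0.9)]:
    print(lam_direct(p, q), lam_closed(p, q))  # the two values agree
</syntaxhighlight>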

Putting all this together, we obtain the formula [math]\displaystyle{ \mathbb{E}_{t\in T}\mathbb{E}_{p,q}f(X_{p,t})f(X_{q,t})=\mathbb{E}_{p,q}\sum_A\lambda_{p,q}^{|A|}\hat{f}_p(A)\hat{f}_q(A). }[/math] This is the Fourier expression we were seeking.
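Finally, here is an exact small-n check (illustrative names; it fixes a single pair [math]\displaystyle{ p\leq q }[/math] rather than averaging) of the identity [math]\displaystyle{ \mathbb{E}_{t\in T}f(X_{p,t})f(X_{q,t})=\sum_A\lambda_{p,q}^{|A|}\hat{f}_p(A)\hat{f}_q(A) }[/math] that underlies the formula above.

<syntaxhighlight lang="python">
import numpy as np
from itertools import product, combinations

n = 3
cube = list(product([0, 1], repeat=n))
subsets = [tuple(c) for r in range(n + 1) for c in combinations(range(n), r)]
rng = np.random.default_rng(3)
f = {x: rng.standard_normal() for x in cube}  # an arbitrary test function

def fhat(p):
    # p-Fourier coefficients of f
    a, b = np.sqrt(p / (1 - p)), -np.sqrt((1 - p) / p)
    w = lambda A, x: np.prod([a if x[i] == 0 else b for i in A]) if A else 1.0
    mu = lambda x: p ** sum(x) * (1 - p) ** (n - sum(x))
    return {A: sum(mu(x) * f[x] * w(A, x) for x in cube) for A in subsets}

p, q = 0.3, 0.7  # a fixed pair with p <= q
# Left-hand side: coordinatewise, (X_i, Y_i) equals (0,0), (0,1), (1,1) w.p. 1-q, q-p, p.
pair_prob = {(0, 0): 1 - q, (0, 1): q - p, (1, 1): p}
lhs = sum(np.prod([pair_prob.get((x[i], y[i]), 0.0) for i in range(n)]) * f[x] * f[y]
          for x in cube for y in cube)

lam = np.sqrt(p / (1 - p)) * np.sqrt((1 - q) / q)
fp, fq = fhat(p), fhat(q)
rhs = sum(lam ** len(A) * fp[A] * fq[A] for A in subsets)
print(lhs, rhs)  # the two sides agree
</syntaxhighlight>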