Sperner's theorem

Statement of the theorem

Sperner's theorem as originally stated is a result about set systems. Suppose that you want to find the largest collection [math]\displaystyle{ \mathcal{A} }[/math] of subsets of [math]\displaystyle{ [n]=\{1,2,\dots,n\} }[/math] such that no set in [math]\displaystyle{ \mathcal{A} }[/math] is a proper subset of any other. Then the best you can do is to choose all the sets of some fixed size, and of course the best size to pick is [math]\displaystyle{ \lfloor n/2\rfloor }[/math], since the binomial coefficient [math]\displaystyle{ \binom nm }[/math] is maximized when [math]\displaystyle{ m=\lfloor n/2\rfloor. }[/math]

Sperner's theorem is closely related to the density Hales-Jewett theorem: in fact, it is nothing other than DHJ(2) with the best possible bound. To see this, we associate each set [math]\displaystyle{ A\subset[n] }[/math] with its characteristic function (that is, the sequence that is 0 outside A and 1 in A). If we have a pair of sets [math]\displaystyle{ A\subset B, }[/math] then the two sequences form a combinatorial line in [math]\displaystyle{ [2]^n. }[/math] For example, if n=6 and A and B are the sets [math]\displaystyle{ \{2,3\} }[/math] and [math]\displaystyle{ \{2,3,4,6\} }[/math], then we get the combinatorial line that consists of the two points 011000 and 011101, which we can denote by 011*0* (so the wildcard set is [math]\displaystyle{ \{4,6\} }[/math]).
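
To make the correspondence concrete, here is a small Python sketch (not part of the original argument; the sets and the string 011*0* are just the example above) that converts A and B into characteristic strings and checks that they are the two points of that line.

<pre>
# Hypothetical illustration of the set/string correspondence in the example above.
n = 6
A = {2, 3}
B = {2, 3, 4, 6}

def characteristic(S, n):
    """Return the 0/1 string whose i-th character is 1 exactly when i is in S."""
    return "".join("1" if i in S else "0" for i in range(1, n + 1))

print(characteristic(A, n))  # 011000
print(characteristic(B, n))  # 011101

# The combinatorial line 011*0*: replacing every * by 0 gives the string of A,
# and replacing every * by 1 gives the string of B.
pattern = "011*0*"
print(pattern.replace("*", "0") == characteristic(A, n))  # True
print(pattern.replace("*", "1") == characteristic(B, n))  # True
</pre>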

Proof of the theorem

There are several proofs, but perhaps the most enlightening is a very simple averaging argument that proves a stronger result. Let [math]\displaystyle{ \mathcal{A} }[/math] be a collection of subsets of [n]. For each k, let [math]\displaystyle{ \delta_k }[/math] denote the density of [math]\displaystyle{ \mathcal{A} }[/math] in the kth layer of the cube: that is, it is the number of sets in [math]\displaystyle{ \mathcal{A} }[/math] of size k, divided by [math]\displaystyle{ \binom nk. }[/math] The equal-slices measure of [math]\displaystyle{ \mathcal{A} }[/math] is defined to be [math]\displaystyle{ \delta_0+\dots+\delta_n. }[/math]

Now the equal-slices measure of [math]\displaystyle{ \mathcal{A} }[/math] is easily seen to be equal to the following quantity. Let [math]\displaystyle{ \pi }[/math] be a random permutation of [n], let [math]\displaystyle{ U_0,U_1,U_2,\dots,U_n }[/math] be the sets [math]\displaystyle{ \emptyset, \{\pi(1)\},\{\pi(1),\pi(2)\},\dots,[n], }[/math] and let [math]\displaystyle{ \mu(\mathcal{A}) }[/math] be the expected number of the sets [math]\displaystyle{ U_i }[/math] that belong to [math]\displaystyle{ \mathcal{A}. }[/math] Then [math]\displaystyle{ \mu(\mathcal{A}) }[/math] equals the equal-slices measure, by linearity of expectation and the fact that the probability that [math]\displaystyle{ U_k }[/math] belongs to [math]\displaystyle{ \mathcal{A} }[/math] is [math]\displaystyle{ \delta_k. }[/math]
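
As a sanity check, the following Python sketch (an illustration only; the family [math]\displaystyle{ \mathcal{A} }[/math] is an arbitrary choice) computes the equal-slices measure directly and compares it with a Monte Carlo estimate of [math]\displaystyle{ \mu(\mathcal{A}) }[/math] obtained by sampling random permutation chains.

<pre>
import random
from itertools import combinations
from math import comb

n = 6
# An arbitrary example family: all subsets of size 2, plus a couple of others.
family = {frozenset(c) for c in combinations(range(1, n + 1), 2)}
family |= {frozenset(), frozenset({1, 2, 3})}

# Equal-slices measure: sum over k of (number of k-sets in the family) / C(n, k).
delta = [sum(1 for S in family if len(S) == k) / comb(n, k) for k in range(n + 1)]
equal_slices = sum(delta)

# Monte Carlo estimate of the expected number of the chain sets U_0, U_1, ..., U_n
# (built from a random permutation) that land in the family.
trials = 20000
total = 0
for _ in range(trials):
    pi = random.sample(range(1, n + 1), n)
    U = set()
    total += (frozenset(U) in family)          # U_0 = empty set
    for x in pi:
        U.add(x)
        total += (frozenset(U) in family)      # U_1, ..., U_n

print(equal_slices, total / trials)  # the two numbers should be close
</pre>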

Therefore, if the equal-slices measure of [math]\displaystyle{ \mathcal{A} }[/math] is greater than 1, then the expected number of sets [math]\displaystyle{ U_k }[/math] in [math]\displaystyle{ \mathcal{A} }[/math] is greater than 1, so there must exist a permutation for which it is at least 2, and that gives us a pair of sets with one contained in the other.

To see that this implies Sperner's theorem, one just has to make the simple observation that a collection of sets with equal-slices measure at most 1 must have cardinality at most [math]\displaystyle{ \binom n{\lfloor n/2\rfloor}, }[/math] since each [math]\displaystyle{ \delta_k }[/math] is at least the number of sets of size k divided by [math]\displaystyle{ \binom n{\lfloor n/2\rfloor}. }[/math] (If n is odd, so that there are two middle layers, then it is not quite so obvious that to have an extremal set system you must pick one or other of the layers, but this is the case.) This stronger version of the statement is called the LYM inequality.
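
For very small n one can verify the LYM inequality exhaustively. The following Python sketch (illustration only; it brute-forces all families of subsets of [math]\displaystyle{ [4], }[/math] so it takes a few seconds) checks that every antichain has LYM sum at most 1 and size at most [math]\displaystyle{ \binom 42 = 6. }[/math]

<pre>
from itertools import combinations
from math import comb

n = 4
subsets = [frozenset(c) for k in range(n + 1) for c in combinations(range(1, n + 1), k)]

# Exhaustively check, for n = 4, that every antichain has LYM sum at most 1
# (and hence size at most C(n, n//2) = 6).
max_size = 0
for mask in range(1 << len(subsets)):
    family = [subsets[i] for i in range(len(subsets)) if mask >> i & 1]
    antichain = all(not (A < B) for A in family for B in family)
    if antichain:
        lym = sum(1 / comb(n, len(A)) for A in family)
        assert lym <= 1 + 1e-9
        max_size = max(max_size, len(family))

print(max_size, comb(n, n // 2))  # both should be 6
</pre>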

Multidimensional version

The following proof is a variant of the Gunderson-Rödl-Sidorenko result. Its parameters are a little worse, but the proof is a little simpler.

Proposition 1: Let [math]\displaystyle{ A \subseteq \{0,1\}^n }[/math] have density [math]\displaystyle{ \delta }[/math]. Let [math]\displaystyle{ Y_1, \dots, Y_d }[/math] be a partition of [math]\displaystyle{ [n] }[/math] with [math]\displaystyle{ |Y_i| \geq r }[/math] for each [math]\displaystyle{ i }[/math]. If

[math]\displaystyle{ \delta^{2^d} - \frac{d}{\sqrt{\pi r}} \gt 0, }[/math] (1)

then [math]\displaystyle{ A }[/math] contains a nondegenerate combinatorial subspace of dimension [math]\displaystyle{ d }[/math], with its [math]\displaystyle{ i }[/math]th wildcard set a subset of [math]\displaystyle{ Y_i }[/math].

Proof: Let [math]\displaystyle{ C_i }[/math] denote a random chain from [math]\displaystyle{ 0^{|Y_i|} }[/math] up to [math]\displaystyle{ 1^{|Y_i|} }[/math], thought of as residing in the coordinates [math]\displaystyle{ Y_i }[/math], with the [math]\displaystyle{ d }[/math] chains chosen independently. Also, let [math]\displaystyle{ s_i, t_i }[/math] denote independent Binomial[math]\displaystyle{ (|Y_i|, 1/2) }[/math] random variables, [math]\displaystyle{ i \in [d] }[/math]. Note that [math]\displaystyle{ C_i(s_i) }[/math] and [math]\displaystyle{ C_i(t_i) }[/math] are (dependent) uniform random strings in [math]\displaystyle{ \{0,1\}^{Y_i} }[/math]. We write, say,

[math]\displaystyle{ (C_1(s_1), C_2(t_2), C_3(t_3), \dots, C_d(s_d)) }[/math] (2)

for the string in [math]\displaystyle{ \{0,1\}^n }[/math] formed by putting [math]\displaystyle{ C_1(s_1) }[/math] into the [math]\displaystyle{ Y_1 }[/math] coordinates, [math]\displaystyle{ C_2(t_2) }[/math] into the [math]\displaystyle{ Y_2 }[/math] coordinates, etc. Note that each string of this form is also uniformly random, since the chains are independent.

If all [math]\displaystyle{ 2^d }[/math] strings of the form in (2) are simultaneously in [math]\displaystyle{ A }[/math] then we have a [math]\displaystyle{ d }[/math]-dimensional subspace inside [math]\displaystyle{ A }[/math] with wildcard sets that are ''subsets'' of [math]\displaystyle{ Y_1, \dots, Y_d }[/math]. All [math]\displaystyle{ d }[/math] dimensions are nondegenerate iff [math]\displaystyle{ s_i \neq t_i }[/math] for all [math]\displaystyle{ i }[/math]. Since [math]\displaystyle{ s_i }[/math] and [math]\displaystyle{ t_i }[/math] are independent Binomial[math]\displaystyle{ (|Y_i|, 1/2) }[/math]'s with [math]\displaystyle{ |Y_i| \geq r }[/math], we have

[math]\displaystyle{ \Pr[s_i = t_i] \leq \frac{1}{\sqrt{\pi r}}. }[/math]
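
This bound is easy to check numerically: for [math]\displaystyle{ s,t }[/math] independent Binomial[math]\displaystyle{ (m,1/2) }[/math] variables, [math]\displaystyle{ \Pr[s=t]=\sum_k \Pr[s=k]^2 = \binom{2m}{m}4^{-m}, }[/math] which is at most [math]\displaystyle{ 1/\sqrt{\pi m}. }[/math] A short Python check (illustration only):

<pre>
from math import comb, pi, sqrt

# Pr[s = t] for s, t independent Binomial(m, 1/2), computed two ways,
# compared against the bound 1/sqrt(pi*m).
for m in [1, 2, 5, 10, 50, 200]:
    exact = sum((comb(m, k) / 2**m) ** 2 for k in range(m + 1))
    central = comb(2 * m, m) / 4**m          # the same number, via Vandermonde's identity
    bound = 1 / sqrt(pi * m)
    print(m, round(exact, 6), round(central, 6), round(bound, 6), exact <= bound)
</pre>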

Thus to complete the proof, it suffices to show that with probability at least [math]\displaystyle{ \delta^{2^d} }[/math], all [math]\displaystyle{ 2^d }[/math] strings of the form in (2) are in [math]\displaystyle{ A }[/math].

This is easy: writing [math]\displaystyle{ f }[/math] for the indicator of [math]\displaystyle{ A }[/math], the probability is

[math]\displaystyle{ \mathbf{E}_{C_1, \dots, C_d} \left[\mathbf{E}_{s_1, \dots, t_d}[f(C_1(s_1), \dots, C_d(s_d)) \cdots f(C_1(t_1), \dots, C_d(t_d))]\right]. }[/math]

The factors inside the expectation are not independent (they share the variables [math]\displaystyle{ s_i, t_i }[/math]), so one cannot simply replace the expectation of the product by the product of the expectations. However, for fixed [math]\displaystyle{ C_1, \dots, C_d }[/math], writing [math]\displaystyle{ g(x_1,\dots,x_d) = f(C_1(x_1), \dots, C_d(x_d)), }[/math] the inside expectation is the box average [math]\displaystyle{ \mathbf{E}_{s_1,\dots,t_d}\big[\prod_{\epsilon\in\{0,1\}^d} g(u_1^{\epsilon_1},\dots,u_d^{\epsilon_d})\big] }[/math] with [math]\displaystyle{ u_i^0=s_i, u_i^1=t_i, }[/math] and repeated applications of the Cauchy-Schwarz inequality show that this is at least [math]\displaystyle{ \mathbf{E}_{s_1,\dots,s_d}[g(s_1,\dots,s_d)]^{2^d}. }[/math] Hence the above is at least

[math]\displaystyle{ \mathbf{E}_{C_1, \dots, C_d} \left[\mathbf{E}_{s_1, \dots, s_d}[f(C_1(s_1), \dots, C_d(s_d))]^{2^d}\right]. }[/math]

By Jensen (or repeated Cauchy-Schwarz), this is at least

[math]\displaystyle{ \left(\mathbf{E}_{C_1, \dots, C_d} \mathbf{E}_{s_1, \dots, s_d}[f(C_1(s_1), \dots, C_d(s_d))]\right)^{2^d}. }[/math]

But this is just [math]\displaystyle{ \delta^{2^d} }[/math], since [math]\displaystyle{ (C_1(s_1), \dots, C_d(s_d)) }[/math] is uniformly distributed. []
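
The random construction in the proof is easy to simulate. The following Python sketch (an illustration only; the parameters are tiny and the set A is an arbitrary choice, so the quantitative bound (1) plays no role here) samples random chains and heights [math]\displaystyle{ s_i, t_i }[/math] and tests whether all [math]\displaystyle{ 2^d }[/math] strings of the form (2) land in A.

<pre>
import random
from itertools import product

n, d = 12, 2
Y = [list(range(0, 6)), list(range(6, 12))]   # a partition of the coordinates into d blocks

# An arbitrary example set A: all strings with an even number of 1s among the first six coordinates.
def in_A(x):
    return sum(x[i] for i in Y[0]) % 2 == 0

def random_chain(block):
    """A random maximal chain in {0,1}^block: add the coordinates one by one in a random order."""
    order = random.sample(block, len(block))
    levels = [set()]
    for c in order:
        levels.append(levels[-1] | {c})
    return levels  # levels[k] is the set of coordinates equal to 1 at height k

def binom(m):
    """A Binomial(m, 1/2) random variable."""
    return sum(random.randint(0, 1) for _ in range(m))

for attempt in range(1000):
    chains = [random_chain(Y[i]) for i in range(d)]
    s = [binom(len(Y[i])) for i in range(d)]
    t = [binom(len(Y[i])) for i in range(d)]
    if any(s[i] == t[i] for i in range(d)):
        continue   # degenerate in some direction, try again
    # Build the 2^d strings of the form (2) and test whether they all lie in A.
    strings = []
    for eps in product(range(2), repeat=d):
        x = [0] * n
        for i in range(d):
            height = s[i] if eps[i] == 0 else t[i]
            for c in chains[i][height]:
                x[c] = 1
        strings.append(x)
    if all(in_A(x) for x in strings):
        print("found a nondegenerate 2-dimensional subspace inside A:")
        for x in strings:
            print("".join(map(str, x)))
        break
</pre>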


As an aside: Corollary 2: If [math]\displaystyle{ A \subseteq \{0,1\}^n }[/math] has density [math]\displaystyle{ \Omega(1) }[/math], then [math]\displaystyle{ A }[/math] contains a nondegenerate combinatorial subspace of dimension at least [math]\displaystyle{ \log_2 \log n - O(1) }[/math]. (To see this, partition [math]\displaystyle{ [n] }[/math] into [math]\displaystyle{ d }[/math] blocks of size roughly [math]\displaystyle{ n/d }[/math] and apply Proposition 1 with the largest [math]\displaystyle{ d }[/math] for which (1) holds.)


If we are willing to sacrifice significantly more probability, we can find a [math]\displaystyle{ d }[/math]-dimensional subspace randomly.

Corollary 3: In the setting of Proposition 1, assume [math]\displaystyle{ \delta \lt 2/3 }[/math] and

[math]\displaystyle{ r \geq \exp(4 \ln(1/\delta) 2^d). }[/math] (3)

Suppose we choose a random nondegenerate [math]\displaystyle{ d }[/math]-dimensional subspace of [math]\displaystyle{ \{0,1\}^n }[/math] with wildcard sets [math]\displaystyle{ Z_i \subseteq Y_i }[/math]. By this we mean choosing, independently for each [math]\displaystyle{ i }[/math], a random combinatorial line within [math]\displaystyle{ \{0,1\}^{Y_i} }[/math], uniformly from the [math]\displaystyle{ 3^r - 1 }[/math] possibilities. Then this subspace is entirely contained within [math]\displaystyle{ A }[/math] with probability at least [math]\displaystyle{ 3^{-dr} }[/math].
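
For illustration, here is one way (a rejection-sampling sketch; any uniform choice among the nondegenerate lines would do) to sample a random nondegenerate combinatorial line within a block, encoding the line as a pattern over {0, 1, *}.

<pre>
import random

def random_line(block):
    """A uniformly random nondegenerate combinatorial line in {0,1}^block,
    encoded as a pattern assigning 0, 1 or '*' to each coordinate,
    with at least one '*' (rejection sampling over the wildcard-free patterns)."""
    while True:
        pattern = {c: random.choice([0, 1, "*"]) for c in block}
        if any(v == "*" for v in pattern.values()):
            return pattern

line = random_line(range(1, 6))
print(line)
# The two points of the line: substitute 0, then 1, for every wildcard.
print({c: (0 if v == "*" else v) for c, v in line.items()})
print({c: (1 if v == "*" else v) for c, v in line.items()})
</pre>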


This follows immediately from Proposition 1: having [math]\displaystyle{ r }[/math] as in (3) achieves (1), hence the desired nondegenerate combinatorial subspace exists and we pick it with probability [math]\displaystyle{ 1/(3^r-1)^d }[/math].


We can further conclude: Corollary 4: Let [math]\displaystyle{ A \subseteq \{0,1\}^n }[/math] have density [math]\displaystyle{ \delta \lt 2/3 }[/math] and let [math]\displaystyle{ Y_1, \dots, Y_d }[/math] be disjoint subsets of [math]\displaystyle{ [n] }[/math] with each [math]\displaystyle{ |Y_i| \geq r }[/math], where

[math]\displaystyle{ r \geq \exp(4 \ln(1/\delta) 2^d). }[/math]

Choose a nondegenerate combinatorial subspace at random by picking a uniformly random nondegenerate combinatorial line in each of [math]\displaystyle{ Y_1, \dots, Y_d }[/math], and filling in the remaining coordinates outside of the [math]\displaystyle{ Y_i }[/math]'s uniformly at random. Then with probability at least [math]\displaystyle{ \exp(-r^{O(1)}) }[/math], this combinatorial subspace is entirely contained within [math]\displaystyle{ A }[/math].


This follows because for a random choice of the coordinates outside the [math]\displaystyle{ Y_i }[/math]'s, there is a [math]\displaystyle{ \delta/2 }[/math] chance that [math]\displaystyle{ A }[/math] has density at least [math]\displaystyle{ \delta/2 }[/math] over the [math]\displaystyle{ Y }[/math] coordinates. We then apply the previous corollary, noting that [math]\displaystyle{ \exp(-r^{O(1)}) \ll (\delta/2)3^{-dr} }[/math], even with [math]\displaystyle{ \delta }[/math] replaced by [math]\displaystyle{ \delta/2 }[/math] in the lower bound demanded of [math]\displaystyle{ r }[/math].


Strong version

An alternative argument deduces the multidimensional Sperner theorem from the density Hales-Jewett theorem. We can think of [math]\displaystyle{ [2]^n }[/math] as [math]\displaystyle{ [2^k]^{n/k}. }[/math] If we do so, apply DHJ(2^k), and translate back to [math]\displaystyle{ [2]^n, }[/math] then we find that we have produced a k-dimensional combinatorial subspace. This is obviously a much more sophisticated proof, since DHJ(2^k) is a very hard result, but it gives more information, since the wildcard sets turn out to have the same size. A sign that this strong version is genuinely strong is that it implies Szemerédi's theorem. For instance, suppose you take as your set [math]\displaystyle{ \mathcal{A} }[/math] the set of all sequences such that the number of 0s plus twice the number of 1s in even places plus three times the number of 1s in odd places belongs to some dense set in [math]\displaystyle{ [3n]. }[/math] Flipping a coordinate from 0 to 1 increases this quantity by 1 in an even place and by 2 in an odd place. So if you have a 2D subspace with both wildcard sets of size d, one wildcard set consisting of even places and the other of odd places (which this proof gives), then in your dense set of integers you can find four integers a, a+d, a+2d and a+3d, which form an arithmetic progression of length 4.
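
Here is a quick Python check of the arithmetic in the example above (illustration only; the base point and the two wildcard sets are arbitrary choices): the four points of such a 2-dimensional subspace are mapped to a four-term arithmetic progression.

<pre>
def phi(x):
    """The map described above: (#0s) + 2*(#1s in even positions) + 3*(#1s in odd positions).
    Positions are 1-based: x[i-1] is the letter in position i."""
    zeros = sum(1 for v in x if v == 0)
    ones_even = sum(1 for i, v in enumerate(x, start=1) if v == 1 and i % 2 == 0)
    ones_odd = sum(1 for i, v in enumerate(x, start=1) if v == 1 and i % 2 == 1)
    return zeros + 2 * ones_even + 3 * ones_odd

base = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]   # an arbitrary point with both wildcard sets set to 0
W_even = [4, 8]                          # wildcard set inside the even positions (size d = 2)
W_odd = [3, 7]                           # wildcard set inside the odd positions (size d = 2)

def point(b_even, b_odd):
    """The point of the 2D subspace with the even wildcard set equal to b_even
    and the odd wildcard set equal to b_odd."""
    x = list(base)
    for i in W_even:
        x[i - 1] = b_even
    for i in W_odd:
        x[i - 1] = b_odd
    return x

values = [phi(point(be, bo)) for (be, bo) in [(0, 0), (1, 0), (0, 1), (1, 1)]]
print(values)   # [15, 17, 19, 21]: an arithmetic progression a, a+d, a+2d, a+3d with d = 2
</pre>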

One can also prove the above strong form of Sperner's theorem by using the multidimensional Szemerédi theorem, which has combinatorial proofs (reference needed). It states, in particular, that dense subsets of large high-dimensional grids contain corners, and more generally copies of any fixed finite configuration. Let [math]\displaystyle{ \mathcal{A} }[/math] be a dense subset of [math]\displaystyle{ [2]^n. }[/math] We may suppose that the elements of [math]\displaystyle{ \mathcal{A} }[/math] have size about [math]\displaystyle{ \frac{n}{2}\pm C\sqrt{n}. }[/math] Take a random permutation of [n], and suppose for simplicity that [math]\displaystyle{ d }[/math] divides [math]\displaystyle{ n. }[/math] An element of [math]\displaystyle{ \mathcal{A} }[/math] is “[math]\displaystyle{ d }[/math]-nice” after the permutation if it consists of [math]\displaystyle{ d }[/math] intervals, each of length between [math]\displaystyle{ \frac{n}{2d}\pm C\sqrt{n}/2, }[/math] with the [math]\displaystyle{ i }[/math]th interval beginning at position [math]\displaystyle{ (i-1)\frac{n}{d}+1 }[/math] for [math]\displaystyle{ 1\leq i\leq d. }[/math] Any [math]\displaystyle{ d }[/math]-nice set can then be represented as a point of the [math]\displaystyle{ d }[/math]-dimensional grid [math]\displaystyle{ [C\sqrt{n}]^d }[/math] by recording the lengths of its [math]\displaystyle{ d }[/math] intervals. The sets represented by the vertices of an axis-parallel [math]\displaystyle{ d }[/math]-dimensional cube in [math]\displaystyle{ [C\sqrt{n}]^d }[/math] form a combinatorial subspace with equal-sized wildcard sets. Finding a cube is clearly more difficult than finding a corner, but its existence in dense sets also follows from the multidimensional Szemerédi theorem. All we need, then, is to show that the expected number of [math]\displaystyle{ d }[/math]-nice elements is [math]\displaystyle{ c\sqrt{n}^d, }[/math] where c depends only on the density of [math]\displaystyle{ \mathcal{A}. }[/math] For a typical [math]\displaystyle{ m }[/math]-element member of [math]\displaystyle{ \mathcal{A}, }[/math] the probability that it is [math]\displaystyle{ d }[/math]-nice after the permutation is about [math]\displaystyle{ \binom{n}{m}^{-1}\sqrt{n}^{d-1}. }[/math] Summing over the elements of [math]\displaystyle{ \mathcal{A} }[/math] of size between [math]\displaystyle{ \frac{n}{2}\pm C\sqrt{n} }[/math] shows that the expected number of [math]\displaystyle{ d }[/math]-nice elements is indeed [math]\displaystyle{ c\sqrt{n}^d, }[/math] so there is a cube, and hence the desired subspace, if n is large enough.

Further remarks

The k=3 generalisation of the LYM inequality is the hyper-optimistic conjecture.

Sperner's theorem is also related to the Kruskal-Katona theorem.