Sperner's theorem

From Polymath1Wiki

Revision as of 13:36, 8 March 2009

Statement of the theorem

Sperner's theorem as originally stated is a result about set systems. Suppose that you want to find the largest collection [math]\mathcal{A}[/math] of subsets of [math][n][/math] such that no set in [math]\mathcal{A}[/math] is a proper subset of any other. Then the best you can do is choose all the sets of some fixed size, and of course the best size to pick is [math]\lfloor n/2\rfloor[/math], since the binomial coefficient [math]\binom nm[/math] is maximized when [math]m=\lfloor n/2\rfloor.[/math]
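As a sanity check, the extremal bound can be confirmed exhaustively for a small case. The sketch below (an illustration, not part of the original argument) tests every family of subsets of [4] and confirms that the largest antichain has size [math]\binom 42 = 6[/math]:

```python
from itertools import combinations
from math import comb

n = 4
subsets = [frozenset(c) for k in range(n + 1)
           for c in combinations(range(1, n + 1), k)]
m = len(subsets)  # 2^n = 16 subsets of [4]

# comp[i]: bitmask of the subsets properly comparable with subsets[i]
comp = [sum(1 << j for j in range(m)
            if subsets[i] < subsets[j] or subsets[j] < subsets[i])
        for i in range(m)]

# Exhaustively test all 2^16 families; a family is an antichain iff no
# member is properly contained in another member.
best = 0
for mask in range(1 << m):
    members = [i for i in range(m) if mask >> i & 1]
    if all(mask & comp[i] == 0 for i in members):
        best = max(best, len(members))
```

The bitmask precomputation keeps the check of each family down to a few AND operations, so the full search over 65536 families runs in well under a second.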

Sperner's theorem is closely related to the density Hales-Jewett theorem: in fact, it is nothing other than DHJ(2) with the best possible bound. To see this, we associate each set [math]A\subseteq[n][/math] with its characteristic function (that is, the sequence that is 0 outside A and 1 in A). If we have a pair of sets [math]A\subset B,[/math] then the two sequences form a combinatorial line in [math][2]^n.[/math] For example, if n=6 and A and B are the sets [math]\{2,3\}[/math] and [math]\{2,3,4,6\}[/math], then we get the combinatorial line that consists of the two points 011000 and 011101, which we can denote by 011*0* (so the wildcard set is [math]\{4,6\}[/math]).
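The correspondence in the example above is easy to make concrete. The helper names below are hypothetical, chosen just to mirror the example:

```python
def char_string(A, n):
    # characteristic 0/1 string of a set A contained in [n]
    return "".join("1" if i in A else "0" for i in range(1, n + 1))

def line_template(A, B, n):
    # For A a proper subset of B, the combinatorial line through the two
    # points has wildcard set B \ A and fixed coordinates elsewhere.
    assert A < B
    return "".join("*" if i in B - A else ("1" if i in A else "0")
                   for i in range(1, n + 1))

A, B = {2, 3}, {2, 3, 4, 6}
# char_string(A, 6) == "011000", char_string(B, 6) == "011101"
# line_template(A, B, 6) == "011*0*"  (wildcard set {4, 6})
```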

Proof of the theorem

There are several proofs, but perhaps the most enlightening is a very simple averaging argument that proves a stronger result. Let [math]\mathcal{A}[/math] be a collection of subsets of [n]. For each k, let [math]\delta_k[/math] denote the density of [math]\mathcal{A}[/math] in the kth layer of the cube: that is, it is the number of sets in [math]\mathcal{A}[/math] of size k, divided by [math]\binom nk.[/math] The equal-slices measure of [math]\mathcal{A}[/math] is defined to be [math]\delta_0+\dots+\delta_n.[/math]

Now the equal-slices measure of [math]\mathcal{A}[/math] is easily seen to be equal to the following quantity. Let [math]\pi[/math] be a random permutation of [n], let [math]U_0,U_1,U_2,\dots,U_n[/math] be the sets [math]\emptyset, \{\pi(1)\},\{\pi(1),\pi(2)\},\dots,[n],[/math] and let [math]\mu(\mathcal{A})[/math] be the expected number of the sets [math]U_i[/math] that belong to [math]\mathcal{A}.[/math] The two quantities are equal by linearity of expectation and the fact that the probability that [math]U_k[/math] belongs to [math]\mathcal{A}[/math] is [math]\delta_k.[/math]
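The equality of the two quantities can be checked exactly on a small example (the family below is an arbitrary illustration):

```python
from itertools import permutations
from math import comb, factorial

n = 4
A = {frozenset(S) for S in [(1,), (2,), (1, 3), (2, 4), (1, 2, 3)]}

# Equal-slices measure: sum over k of delta_k = (# size-k sets in A) / C(n,k)
measure = sum(sum(1 for S in A if len(S) == k) / comb(n, k)
              for k in range(n + 1))

# Expected number of initial segments U_0, ..., U_n of a random
# permutation that land in A, averaged exactly over all permutations.
total = sum(sum(1 for k in range(n + 1) if frozenset(pi[:k]) in A)
            for pi in permutations(range(1, n + 1)))
expected_hits = total / factorial(n)
# measure == expected_hits, by linearity of expectation
```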

Therefore, if the equal-slices measure of [math]\mathcal{A}[/math] is greater than 1, then the expected number of sets [math]U_k[/math] in [math]\mathcal{A}[/math] is greater than 1, so there must exist a permutation for which it is at least 2, and that gives us a pair of sets with one contained in the other.
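The argument is effective: scanning permutation chains locates such a pair whenever the measure exceeds 1. A small illustration, with a hypothetical helper name:

```python
from itertools import permutations

def comparable_pair(A, n):
    # Scan chains U_0, U_1, ..., U_n; any chain meeting A at least twice
    # yields a pair of sets, one properly contained in the other.
    for pi in permutations(range(1, n + 1)):
        hits = [frozenset(pi[:k]) for k in range(n + 1)
                if frozenset(pi[:k]) in A]
        if len(hits) >= 2:
            return hits[0], hits[1]
    return None

n = 3
A = {frozenset({1}), frozenset({2}), frozenset({3}), frozenset({1, 2})}
# equal-slices measure = 3/C(3,1) + 1/C(3,2) = 1 + 1/3 > 1
S, T = comparable_pair(A, n)
# S is a proper subset of T
```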

To see that this implies Sperner's theorem, one just has to make the simple observation that a family of sets with equal-slices measure at most 1 must have cardinality at most [math]\binom n{\lfloor n/2\rfloor}.[/math] (If n is odd, so that there are two middle layers, then it is not quite so obvious that to have an extremal set you must pick one or the other of the layers, but this is the case.) This stronger version of the statement is called the LYM inequality.

Multidimensional version

The following proof is a variant of the Gunderson-Rodl-Sidorenko result (http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WHS-45GMWMS-9&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=d2d74495dece42adf2ad10297bcb49ee). Its parameters are a little worse, but the proof is a little simpler.

Proposition 1: Let [math]A \subseteq \{0,1\}^n[/math] have density [math]\delta[/math]. Let [math]Y_1, \dots, Y_d[/math] be a partition of [math][n][/math] with [math]|Y_i| \geq r[/math] for each [math]i[/math]. If

[math]\delta^{2^d} - \frac{d}{\sqrt{\pi r}} > 0, [/math] (1)

then [math]A[/math] contains a nondegenerate combinatorial subspace of dimension [math]d[/math], with its [math]i[/math]th wildcard set a subset of [math]Y_i[/math].

Proof: Let [math]C_i[/math] denote a random chain from [math]0^{|Y_i|}[/math] up to [math]1^{|Y_i|}[/math], thought of as residing in the coordinates [math]Y_i[/math], with the [math]d[/math] chains chosen independently. Also, let [math]s_i, t_i[/math] denote independent Binomial[math](|Y_i|, 1/2)[/math] random variables, [math]i \in [d][/math]. Note that [math]C_i(s_i)[/math] and [math]C_i(t_i)[/math] are (dependent) uniform random strings in [math]\{0,1\}^{Y_i}[/math]. We write, say,
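The claim that [math]C_i(s_i)[/math] is uniform can be verified exactly for a small chain length. In the illustrative check below, a chain is encoded as the order in which coordinates flip from 0 to 1:

```python
from itertools import permutations
from math import comb, factorial
from collections import Counter

m = 3
dist = Counter()
for pi in permutations(range(m)):        # a chain 0^m -> 1^m: a flip order
    for s in range(m + 1):               # s ~ Binomial(m, 1/2)
        x = tuple(1 if c in pi[:s] else 0 for c in range(m))
        dist[x] += (comb(m, s) / 2 ** m) / factorial(m)

# Every string in {0,1}^3 receives probability exactly 2^{-3}: the chance
# of hitting a given weight-w string is (w!(m-w)!/m!) * C(m,w)/2^m = 2^{-m}.
```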

[math](C_1(s_1), C_2(t_2), C_3(t_3), \dots, C_d(s_d)) [/math] (2)

for the string in [math]\{0,1\}^n[/math] formed by putting [math]C_1(s_1)[/math] into the [math]Y_1[/math] coordinates, [math]C_2(t_2)[/math] into the [math]Y_2[/math] coordinates, etc. Note that each string of this form is also uniformly random, since the chains are independent.

If all [math]2^d[/math] strings of the form in (2) are simultaneously in [math]A[/math] then we have a [math]d[/math]-dimensional subspace inside [math]A[/math] with wildcard sets that are subsets of [math]Y_1, \dots, Y_d[/math]. All [math]d[/math] dimensions are nondegenerate iff [math]s_i \neq t_i[/math] for all [math]i[/math]. Since [math]s_i[/math] and [math]t_i[/math] are independent Binomial[math](|Y_i|, 1/2)[/math]'s with [math]|Y_i| \geq r[/math], we have

[math]\Pr[s_i = t_i] \leq \frac{1}{\sqrt{\pi r}}.[/math]
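This standard bound is easy to confirm numerically: by Vandermonde's identity the collision probability equals [math]\binom{2m}{m}/4^m[/math], which is at most [math]1/\sqrt{\pi m}[/math]:

```python
from math import comb, pi, sqrt

def collision_prob(m):
    # P[s = t] for independent s, t ~ Binomial(m, 1/2)
    return sum((comb(m, k) / 2 ** m) ** 2 for k in range(m + 1))

for m in range(1, 60):
    # the sum collapses to the central binomial coefficient over 4^m ...
    assert abs(collision_prob(m) - comb(2 * m, m) / 4 ** m) < 1e-9
    # ... which is bounded by 1/sqrt(pi*m)
    assert collision_prob(m) <= 1 / sqrt(pi * m)
```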

Thus, by a union bound over the [math]d[/math] events [math]\{s_i = t_i\}[/math], to complete the proof it suffices to show that with probability at least [math]\delta^{2^d}[/math], all [math]2^d[/math] strings of the form in (2) are in [math]A[/math].

This is easy: writing [math]f[/math] for the indicator of [math]A[/math], the probability is

[math]\mathbf{E}_{C_1, \dots, C_d} \left[\mathbf{E}_{s_1, \dots, t_d}[f(C_1(s_1), \dots, C_d(s_d)) \cdots f(C_1(t_1), \dots, C_d(t_d))]\right].[/math]

Since [math]s_1, \dots, t_d[/math] are independent, the inside expectation-of-a-product can be changed to a product of expectations. But for fixed [math]C_1, \dots, C_d[/math], each string of the form in (2) has the same distribution. Hence the above equals

[math]\mathbf{E}_{C_1, \dots, C_d} \left[\mathbf{E}_{s_1, \dots, s_d}[f(C_1(s_1), \dots, C_d(s_d))]^{2^d}\right].[/math]

By Jensen (or repeated Cauchy-Schwarz), this is at least

[math]\left(\mathbf{E}_{C_1, \dots, C_d} \mathbf{E}_{s_1, \dots, s_d}[f(C_1(s_1), \dots, C_d(s_d))]\right)^{2^d}.[/math]

But this is just [math]\delta^{2^d}[/math], since [math](C_1(s_1), \dots, C_d(s_d))[/math] is uniformly distributed. ∎
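The chain of (in)equalities can be verified exactly on a toy instance. In the sketch below the partition and the set [math]A[/math] are arbitrary illustrative choices; the final inequality is the Jensen step of the proof:

```python
from itertools import permutations, product
from math import comb, factorial

n, d, m = 4, 2, 2
parts = [(0, 1), (2, 3)]                         # Y_1, Y_2: partition of the coordinates
A = {x for x in product((0, 1), repeat=n) if sum(x) % 2 == 0}  # density 1/2
delta = len(A) / 2 ** n

w = [comb(m, s) / 2 ** m for s in range(m + 1)]  # Binomial(m, 1/2) weights
prob = 0.0
for orders in product(permutations(parts[0]), permutations(parts[1])):
    chain_w = (1 / factorial(m)) ** d            # uniform over the two chains
    for s in product(range(m + 1), repeat=d):
        for t in product(range(m + 1), repeat=d):
            weight = chain_w
            for i in range(d):
                weight *= w[s[i]] * w[t[i]]
            # the 2^d "corner" strings: pick level s_i or t_i in each part
            ok = True
            for choice in product((0, 1), repeat=d):
                x = [0] * n
                for i in range(d):
                    level = s[i] if choice[i] == 0 else t[i]
                    for c in orders[i][:level]:
                        x[c] = 1
                if tuple(x) not in A:
                    ok = False
                    break
            if ok:
                prob += weight

# Jensen's step in the proof guarantees prob >= delta ** (2 ** d)
```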


As an aside: Corollary 2: If [math]A \subseteq \{0,1\}^n[/math] has density [math]\Omega(1)[/math], then [math]A[/math] contains a nondegenerate combinatorial subspace of dimension at least [math]\log_2 \log n - O(1)[/math]. (Take the [math]Y_i[/math] to be disjoint sets of size [math]r = \lfloor n/d\rfloor[/math]; condition (1) is then satisfied for some [math]d = \log_2 \log n - O(1)[/math].)


If we are willing to sacrifice significantly more probability, we can find a [math]d[/math]-dimensional subspace randomly.

Corollary 3: In the setting of Proposition 1, assume [math]\delta < 2/3[/math] and

[math] r \geq \exp(4 \ln(1/\delta) 2^d). [/math] (3)

Suppose we choose a random nondegenerate [math]d[/math]-dimensional subspace of [math][n][/math] with wildcard sets [math]Z_i \subseteq Y_i[/math]. By this we mean choosing, independently for each [math]i[/math], a random combinatorial line within [math]\{0,1\}^{Y_i}[/math], uniformly from the [math]3^r - 1[/math] possibilities. Then this subspace is entirely contained within [math]A[/math] with probability at least [math]3^{-dr}[/math].


This follows immediately from Proposition 1: having [math]r[/math] as in (3) achieves (1), hence the desired nondegenerate combinatorial subspace exists and we pick it with probability [math]1/(3^r-1)^d[/math].


We can further conclude: Corollary 4: Let [math]A \subseteq \{0,1\}^n[/math] have density [math]\delta < 2/3[/math] and let [math]Y_1, \dots, Y_d[/math] be disjoint subsets of [math][n][/math] with each [math]|Y_i| \geq r[/math], where

[math] r \geq \exp(4 \ln(1/\delta) 2^d). [/math]

Choose a nondegenerate combinatorial subspace at random by picking uniformly nondegenerate combinatorial lines in each of [math]Y_1, \dots, Y_d[/math], and filling in the remaining coordinates outside of the [math]Y_i[/math]'s uniformly at random. Then with probability at least [math]\exp(-r^{O(1)})[/math], this combinatorial subspace is entirely contained within [math]A[/math].


This follows because for a random choice of the coordinates outside the [math]Y_i[/math]'s, there is a [math]\delta/2[/math] chance that [math]A[/math] has density at least [math]\delta/2[/math] over the [math]Y[/math] coordinates. We then apply the previous corollary, noting that [math]\exp(-r^{O(1)}) \ll (\delta/2)3^{-dr}[/math], even with [math]\delta[/math] replaced by [math]\delta/2[/math] in the lower bound demanded of [math]r[/math].


Strong version

An alternative argument deduces the multidimensional Sperner theorem from the density Hales-Jewett theorem. We can think of [math][2]^n[/math] as [math][2^k]^{n/k}.[/math] If we do so, apply DHJ(2^k), and translate back to [math][2]^n,[/math] then we find that we have produced a k-dimensional combinatorial subspace. This is obviously a much more sophisticated proof, since DHJ(2^k) is a very hard result, but it gives more information, since the wildcard sets turn out to have the same size. A sign that this strong version is genuinely strong is that it implies Szemerédi's theorem. For instance, suppose you take as your set [math]\mathcal{A}[/math] the set of all sequences such that the number of 0s plus the number of 1s in even places plus twice the number of 1s in odd places belongs to some dense set in [math][3n].[/math] If you have a 2D subspace with both wildcard sets of size d, one wildcard set consisting of odd numbers and the other of even numbers (which this proof gives), then in your dense set of integers you can find four integers of the form a, a+d, a+2d, a+d+2d = a+3d, which form an arithmetic progression of length 4.

Further remarks

The k=3 generalisation of the LYM inequality is the hyper-optimistic conjecture.

Sperner's theorem is also related to the Kruskal-Katona theorem.