Sperner.tex: Difference between revisions

From Polymath Wiki
Jump to navigationJump to search
Created section
 
No edit summary
 
Line 1: Line 1:
\section{Remarks about Sperner's theorem}
\section{Sperner's theorem} \label{sec:sperner}


The first non-trivial case of Szemer\'edi's theorem is the case of arithmetic progressions of length 3. However, for the density Hales-Jewett theorem even the case $k=2$ is interesting. For the purposes of this discussion, let us define $[2]$ to be the set $\{0,1\}$. Then a combinatorial line in $[2]^n$ is a pair of $01$ sequences $x$ and $y$, such that to obtain $y$ from $x$ one changes a few $0$s to $1$s. If we think of the sequences $x$ and $y$ as the characteristic functions of two subsets $A$ and $B$ of $[n]$, then this is saying that $A$ is a proper subset of $B$. Therefore, when $k=2$ we can formulate the density Hales-Jewett theorem as follows: for every $\delta>0$ there exists $n$ such that if $\mathcal{A}$ is a collection of at least $\delta 2^n$ subsets of $[n]$, then there must exist two distinct sets $A,B\in\mathcal{A}$ with $A\subset B$.  
The first nontrivial case of Szemer\'edi's theorem is the case of arithmetic progressions of length $3$. However, for the density Hales--Jewett theorem even the case $k=2$ is interesting. DHJ($2$) follows from a basic result in extremal combinatorics: Sperner's theorem. \ignore{This result has been known for a long time. \noteryan{amusingly, published after vdW's theorem}} In this section we review a standard probabilistic proof of Sperner's theorem. Besides suggesting the equal-slices distribution, this proof easily gives the $k=2$ case of the probabilistic Density Hales--Jewett theorem, a key component in our proof of DHJ($3$).


This result has been known for a long time: it is an immediate consequence of the following basic result in extremal combinatorics due to Sperner. Recall that an \textit{antichain} is a collection $\mathcal{A}$ of sets such that no set in $\mathcal{A}$ is a proper subset of any other.
To investigate DHJ($2$) it slightly more convenient to take the alphabet to be $\Omega = \{0,1\}$.  Then a combinatorial line in $\{0,1\}^n$ is a pair of distinct binary strings $x$ and $y$ such that to obtain $y$ from $x$ one changes some $0$'s to $1$'s. If we think of the strings $x$ and $y$ as the indicators of two subsets $X$ and $Y$ of $[n]$, then this is saying that $X$ is a proper subset of $Y$. Therefore, when $k=2$ we can formulate the density Hales-Jewett theorem as follows: there exists $\dhj{2}{\delta}$ such that for $n \geq \dhj{2}{\delta}$, if $\calA$ is a collection of at least $\delta 2^n$ subsets of $[n]$, then there must exist two distinct sets $X,Y \in \calA$ with $X \subset Y$. In the language of combinatorics, this is saying that $\calA$ is \emph{not} an antichain. (Recall that an \textit{antichain} is a collection $\mathcal{A}$ of sets such that no set in $\mathcal{A}$ is a proper subset of any other.)  Sperner's theorem gives something slightly stronger: a precise lower bound on the cardinality of any antichain.


\begin{theorem} \label{Sperner}
\begin{named}{Sperner's theorem} \label{thm:sperner}
For every positive integer $n$, the largest cardinality of any antichain of subsets of $[n]$ is $\binom n{\lfloor n/2\rfloor}$.
For every positive integer $n$, the largest cardinality of any antichain of subsets of $[n]$ is $\binom{n}{\lfloor n/2\rfloor}$.
\end{theorem}
\end{named}
As the bound suggests, the best possible example is the collection of all subsets of $[n]$ of size $\lfloor n/2\rfloor$. (It can be shown that this example is essentially unique: the only other example is to take all sets of size $\lceil n/2\rceil$, and even this is different only when $n$ is odd.) It is well known that $\binom{n}{\lfloor n/2\rfloor}2^{-n} \geq 1/2\sqrt{n}$ for all $n$; hence Sperner's theorem implies that one may take $\dhj{2}{\delta} = 4/\delta^2$.\noteryan{The constant can be sharpened to $\pi/2$ for small $\delta$, of course. Also, $4/\delta^2$ is technically not an integer.}  Let us present a standard probabilistic proof of Sperner's theorem (see, e.g.,~\cite{Spe90}):


\begin{proof}
\begin{proof} (\emph{Sperner's theorem.})  Consider the following way of choosing a random subset of $[n]$. First, we choose, uniformly at random, a permutation $\tau$ of $[n]$. Next, we choose, uniformly at random and independently of $\tau$, an integer $s$ from the set $\{0,1,\dots,n\}$. Finally, we set $X=\{\tau(1),\dots,\tau(s)\}$ (where this is interpreted as the empty set if $s=0$). 
As the bound suggests, the best possible example is the collection of all subsets of $[n]$ of size $\lfloor n/2\rfloor$. (It can be shown that this example is essentially unique: the only other example is to take all sets of size $\lceil n/2\rceil$, and even this is different only when $n$ is odd.)  
 
Let $\mathcal{A}$ be an antichain. Then the probability that a set $X$ that is chosen randomly in the above manner belongs to $\mathcal{A}$ is at most $1/(n+1)$, since whatever $\tau$ is at most one of the $n+1$ sets $\{\tau(1),\dots,\tau(s)\}$ can belong to $\mathcal{A}$.
 
However, what we are really interested in is the probability that $X\in\mathcal{A}$ if $X$ is chosen \textit{uniformly} from all subsets of $[n]$. Let us write $\eqs{2}[X]$ for the probability that we choose $X$ according to the distribution defined above, and $\unif_2[X]$ for the probability that we choose it uniformly. Then $\unif_2[X]=2^{-n}$ for every $X$, whereas $\eqs{2}[X]=\frac 1{n+1}\binom n{\abs{X}}^{-1}$, since there is a probability $1/(n+1)$ that $s = \abs{X}$, and all sets of size $\abs{X}$ are equally likely to be chosen. Therefore, the largest ratio of $\unif_2[X]$ to $\eqs{2}[X]$ occurs when $\abs{X}=\lfloor n/2\rfloor$ or $\lceil n/2\rceil$.\noteryan{by unimodality of binomial coefficients} In this case, the ratio is $(n+1)\binom n{\lfloor n/2\rfloor}2^{-n}$.
 
Since $\eqs{2}(\mathcal{A})\leq 1/(n+1)$, it follows that $2^{-n}\abs{\mathcal{A}}=\unif_2(\mathcal{A})\leq\binom n{\lfloor n/2\rfloor}2^{-n}$, which proves the theorem.
\end{proof}
 
As one sees from the proof, it is very natural to consider different probability distributions on $\{0,1\}^n$, or equivalently on the set of all subsets of $[n]$. The first is the uniform distribution $\unif_2$, which is forced on us by the way the question is phrased. The second is what we called $\eqs{2}$; the reader may check that this is precisely the ``equal-slices'' distribution $\eqs{2}^n$ described in Section~\ref{sec:pdhj}.  After seeing the above proof, one might take the attitude that the ``correct'' statement of Sperner's theorem is that if $\calA$ is an antichain, then $\eqs{2}(\calA) \leq 1/(n+1)$, and that the statement given above is a slightly artificial and strictly weaker consequence.
 
\subsection{Probabilistic DHJ(\texorpdfstring{$2$}{2})} \label{sec:prob-sperner}


In the other direction, consider the following way of choosing a random subset of $[n]$. First, we choose, uniformly at random, a permutation $\pi$ of $[n]$. Next, we choose, uniformly at random and independently of $\pi$, an integer $j$ from the set $\{0,1,\dots,n\}$. Finally, we set $A=\{\pi(1),\dots,\pi(j)\}$ (where this is interpreted as the empty set if $j=0$).
Indeed, what the proof (essentially) establishes is the ``equal-slices DHJ($2$) theorem''; i.e., that in Theorem~\ref{thm:edhj} one may take $\edhj{2}{\delta} = 1/\delta$.\noteryan{minus $1$, even}  We say ``essentially'' because of the small distinction between the distribution $\eqs{2}^n$ used in the proof and the distribution $\ens{2}^n$ in the statement. It will be convenient in this introductory discussion of Sperner's theorem to casually ignore this.  We will introduce $\ens{k}^n$ and be more careful about its distinction with $\eqs{k}^n$ in Section~\ref{sec:eqs}.


Let $\mathcal{A}$ be an antichain. Then the probability that a set $A$ that is chosen randomly in the above manner belongs to $\mathcal{A}$ is at most $1/(n+1)$, since whatever $\pi$ is at most one of the $n+1$ sets $\{\pi(1),\dots,\pi(j)\}$ can belong to $\mathcal{A}$.
To further bolster the claim that $\eqs{2}^n$ is natural in this context we will show an easy proof of the \emph{probabilistic} DHJ($2$) theorem. Looking at the statement of Theorem~\ref{thm:pdhj}, the reader will see it requires defining $\eqs{3}^n$; we will make this definition in the course of the proof.
\begin{lemma} \label{lem:p-sperner} For every real $\delta>0$, every $A \subseteq \{0,1\}^n$ with $\eqs{2}^n$-density at least $\delta$ satisfies
\[
\Pr_{\lambda \sim \eqs{3}^n}[\lambda \subseteq A] \geq \delta^2.
\]
\end{lemma}


However, what we are really interested in is the probability that $A\in\mathcal{A}$ if $A$ is chosen \textit{uniformly} from all subsets of $[n]$. Let us write $\mu(A)$ for the probability that we choose $A$ according to the distribution defined above, and $\nu(A)$ for the probability that we choose it uniformly. Then $\nu(A)=2^{-n}$ for every $A$, whereas $\mu(A)=\frac 1{n+1}\binom n{|A|}^{-1}$, since there is a probability $1/(n+1)$ that $j=|A|$, and all sets of size $|A|$ are equally likely to be chosen. Therefore, the largest ratio of $\nu(A)$ to $\mu(A)$ occurs when $|A|=\lfloor n/2\rfloor$ or $\lceil n/2\rceil$. In this case, the ratio is $(n+1)\binom n{\lfloor n/2\rfloor}2^{-n}$.
\noindent Note that there is no lower bound necessary on $n$; this is because a template $\lambda \sim \eqs{3}^n$ may be degenerate.


Since $\mu(\mathcal{A})\leq 1/(n+1)$, it follows that $2^{-n}|\mathcal{A}|=\nu(\mathcal{A})\leq\binom n{\lfloor n/2\rfloor}2^{-n}$, which proves the theorem.
\begin{proof}
\end{proof}
As in our proof of Sperner's theorem, let us choose a permutation $\tau$ of $[n]$ uniformly at random.  Suppose we now choose $s \in \{0, 1, \dotsc, n\}$ and also $t \in \{0, 1, \dotsc, n\}$ \emph{independently}.  Let $x(\tau,s) \in \{0,1\}^n$ denote the string which has $1$'s in coordinates $\tau(1), \dotsc, \tau(s)$ and $0$'s in coordinates $\tau(s+1), \dotsc, \tau(n)$, and similarly define $x(\tau,t)$.  These two strings both have the distribution $\eqs{2}^n$, but are not independent.


We have given this proof in order to draw attention to the fact that it is very natural to consider two different measures on $\{0,1\}^n$, or equivalently on the set of all subsets of $[n]$. The first is the uniform measure, which is forced on us by the way the question is phrased. The second is what one might call the ``equal-slices measure'', where the cardinality of a set is chosen uniformly at random and then the set is chosen uniformly at random given that cardinality. This name comes from the fact that it is common to refer to sets of the form $\{A\subset[n]:|A|=k\}$ as \textit{slices} of the discrete cube, and the measure $\nu$ gives equal density to each slice.  
A key observation is that $\{x(\tau,s), x(\tau,t)\}$ is a combinatorial line in $\{0,1\}^n$, unless $s = t$ in which case the two strings are equal. The associated line template is $\lambda \in \{0,1,\wild\}^n$, with
\[
\lambda_i = \begin{cases}
                1      & \text{if $i \leq \min\{s,t\}$,}\\
                \wild  & \text{if $\min\{s,t\} < i \leq \max\{s,t\}$,} \\
                0      & \text{if $i > \max\{s,t\}$.}
            \end{cases}
\]
This gives the definition of how to draw $\lambda \sim \eqs{3}^n$ (with alphabet $\{0,1,\wild\}$).  Note that $\lambda$ is a degenerate template with probability $1/(n+1)$.


After seeing the above proof, one might take the attitude that the ``correct'' statement of Sperner's theorem is that if $\mathcal{A}$ is an antichain, then $\mu(\mathcal)\leq 1/(n+1)$, and that the statement given above is a slightly artificial and strictly weaker consequence.
Assuming $\Pr_{x \sim \eqs{2}^n}[x \in A] \geq \delta$, our goal is to show that $\Pr[x(\tau,s), x(\tau,t) \in A ] \geq \delta^2$.  But
\begin{align*}
\Pr[x(\tau,s), x(\tau,t) \in A] &= \Ex_{\tau} \Bigl[\Pr_{s,t} [x(\tau,s), x(\tau,t) \in A]\Bigr] & \\
&= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A] \Pr_{t}[x(\tau,s) \in A]\Bigr] &\text{(independence of $s$, $t$)} \\
&= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]^2\Bigr] & \\
&\geq \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]\Bigr]^2 & \text{(Cauchy-Schwarz)}\\
&= \Pr_{\tau,s} [x(\tau,s) \in A]^2, &
\end{align*}
and $x(\tau,s)$ has the distribution $\eqs{2}^n$, completing the proof.
\end{proof}


Since Sperner's theorem is the first non-trivial case of the theorem we are trying to prove, an obvious question is whether we can generalize the above proof. The answer seems to be no, essentially because there is no obvious analogue of the system of nested intervals that plays such an important role. However, the idea of looking at equal-slices measure does turn out to be very useful technically, so in the next section we shall define it for subsets of $[k]^n$ and prove a few simple facts about it for use later in the proof.
Having proved the probabilistic DHJ($2$) theorem rather easily, an obvious question is whether we can generalize the proof to $k = 3$. The answer seems to be no; there is no obvious way to generate random length-$3$ lines in which the points are independent, or even partially independent as in the previous proof. Nevertheless, the equal-slices distribution remains important for our proof of the general case of DHJ($k$); in Section~\ref{sec:eqs} we shall introduce both $\eqs{k}^n$ and $\ens{k}^n$ and prove some basic facts about them.

Latest revision as of 20:21, 24 June 2009

\section{Sperner's theorem} \label{sec:sperner}

The first nontrivial case of Szemer\'edi's theorem is the case of arithmetic progressions of length $3$. However, for the density Hales--Jewett theorem even the case $k=2$ is interesting. DHJ($2$) follows from a basic result in extremal combinatorics: Sperner's theorem. \ignore{This result has been known for a long time. \noteryan{amusingly, published after vdW's theorem}} In this section we review a standard probabilistic proof of Sperner's theorem. Besides suggesting the equal-slices distribution, this proof easily gives the $k=2$ case of the probabilistic Density Hales--Jewett theorem, a key component in our proof of DHJ($3$).

To investigate DHJ($2$) it slightly more convenient to take the alphabet to be $\Omega = \{0,1\}$. Then a combinatorial line in $\{0,1\}^n$ is a pair of distinct binary strings $x$ and $y$ such that to obtain $y$ from $x$ one changes some $0$'s to $1$'s. If we think of the strings $x$ and $y$ as the indicators of two subsets $X$ and $Y$ of $[n]$, then this is saying that $X$ is a proper subset of $Y$. Therefore, when $k=2$ we can formulate the density Hales-Jewett theorem as follows: there exists $\dhj{2}{\delta}$ such that for $n \geq \dhj{2}{\delta}$, if $\calA$ is a collection of at least $\delta 2^n$ subsets of $[n]$, then there must exist two distinct sets $X,Y \in \calA$ with $X \subset Y$. In the language of combinatorics, this is saying that $\calA$ is \emph{not} an antichain. (Recall that an \textit{antichain} is a collection $\mathcal{A}$ of sets such that no set in $\mathcal{A}$ is a proper subset of any other.) Sperner's theorem gives something slightly stronger: a precise lower bound on the cardinality of any antichain.

\begin{named}{Sperner's theorem} \label{thm:sperner} For every positive integer $n$, the largest cardinality of any antichain of subsets of $[n]$ is $\binom{n}{\lfloor n/2\rfloor}$. \end{named} As the bound suggests, the best possible example is the collection of all subsets of $[n]$ of size $\lfloor n/2\rfloor$. (It can be shown that this example is essentially unique: the only other example is to take all sets of size $\lceil n/2\rceil$, and even this is different only when $n$ is odd.) It is well known that $\binom{n}{\lfloor n/2\rfloor}2^{-n} \geq 1/2\sqrt{n}$ for all $n$; hence Sperner's theorem implies that one may take $\dhj{2}{\delta} = 4/\delta^2$.\noteryan{The constant can be sharpened to $\pi/2$ for small $\delta$, of course. Also, $4/\delta^2$ is technically not an integer.} Let us present a standard probabilistic proof of Sperner's theorem (see, e.g.,~\cite{Spe90}):

\begin{proof} (\emph{Sperner's theorem.}) Consider the following way of choosing a random subset of $[n]$. First, we choose, uniformly at random, a permutation $\tau$ of $[n]$. Next, we choose, uniformly at random and independently of $\tau$, an integer $s$ from the set $\{0,1,\dots,n\}$. Finally, we set $X=\{\tau(1),\dots,\tau(s)\}$ (where this is interpreted as the empty set if $s=0$).

Let $\mathcal{A}$ be an antichain. Then the probability that a set $X$ that is chosen randomly in the above manner belongs to $\mathcal{A}$ is at most $1/(n+1)$, since whatever $\tau$ is at most one of the $n+1$ sets $\{\tau(1),\dots,\tau(s)\}$ can belong to $\mathcal{A}$.

However, what we are really interested in is the probability that $X\in\mathcal{A}$ if $X$ is chosen \textit{uniformly} from all subsets of $[n]$. Let us write $\eqs{2}[X]$ for the probability that we choose $X$ according to the distribution defined above, and $\unif_2[X]$ for the probability that we choose it uniformly. Then $\unif_2[X]=2^{-n}$ for every $X$, whereas $\eqs{2}[X]=\frac 1{n+1}\binom n{\abs{X}}^{-1}$, since there is a probability $1/(n+1)$ that $s = \abs{X}$, and all sets of size $\abs{X}$ are equally likely to be chosen. Therefore, the largest ratio of $\unif_2[X]$ to $\eqs{2}[X]$ occurs when $\abs{X}=\lfloor n/2\rfloor$ or $\lceil n/2\rceil$.\noteryan{by unimodality of binomial coefficients} In this case, the ratio is $(n+1)\binom n{\lfloor n/2\rfloor}2^{-n}$.

Since $\eqs{2}(\mathcal{A})\leq 1/(n+1)$, it follows that $2^{-n}\abs{\mathcal{A}}=\unif_2(\mathcal{A})\leq\binom n{\lfloor n/2\rfloor}2^{-n}$, which proves the theorem. \end{proof}

As one sees from the proof, it is very natural to consider different probability distributions on $\{0,1\}^n$, or equivalently on the set of all subsets of $[n]$. The first is the uniform distribution $\unif_2$, which is forced on us by the way the question is phrased. The second is what we called $\eqs{2}$; the reader may check that this is precisely the ``equal-slices distribution $\eqs{2}^n$ described in Section~\ref{sec:pdhj}. After seeing the above proof, one might take the attitude that the ``correct statement of Sperner's theorem is that if $\calA$ is an antichain, then $\eqs{2}(\calA) \leq 1/(n+1)$, and that the statement given above is a slightly artificial and strictly weaker consequence.

\subsection{Probabilistic DHJ(\texorpdfstring{$2$}{2})} \label{sec:prob-sperner}

Indeed, what the proof (essentially) establishes is the ``equal-slices DHJ($2$) theorem; i.e., that in Theorem~\ref{thm:edhj} one may take $\edhj{2}{\delta} = 1/\delta$.\noteryan{minus $1$, even} We say ``essentially because of the small distinction between the distribution $\eqs{2}^n$ used in the proof and the distribution $\ens{2}^n$ in the statement. It will be convenient in this introductory discussion of Sperner's theorem to casually ignore this. We will introduce $\ens{k}^n$ and be more careful about its distinction with $\eqs{k}^n$ in Section~\ref{sec:eqs}.

To further bolster the claim that $\eqs{2}^n$ is natural in this context we will show an easy proof of the \emph{probabilistic} DHJ($2$) theorem. Looking at the statement of Theorem~\ref{thm:pdhj}, the reader will see it requires defining $\eqs{3}^n$; we will make this definition in the course of the proof. \begin{lemma} \label{lem:p-sperner} For every real $\delta>0$, every $A \subseteq \{0,1\}^n$ with $\eqs{2}^n$-density at least $\delta$ satisfies \[ \Pr_{\lambda \sim \eqs{3}^n}[\lambda \subseteq A] \geq \delta^2. \] \end{lemma}

\noindent Note that there is no lower bound necessary on $n$; this is because a template $\lambda \sim \eqs{3}^n$ may be degenerate.

\begin{proof} As in our proof of Sperner's theorem, let us choose a permutation $\tau$ of $[n]$ uniformly at random. Suppose we now choose $s \in \{0, 1, \dotsc, n\}$ and also $t \in \{0, 1, \dotsc, n\}$ \emph{independently}. Let $x(\tau,s) \in \{0,1\}^n$ denote the string which has $1$'s in coordinates $\tau(1), \dotsc, \tau(s)$ and $0$'s in coordinates $\tau(s+1), \dotsc, \tau(n)$, and similarly define $x(\tau,t)$. These two strings both have the distribution $\eqs{2}^n$, but are not independent.

A key observation is that $\{x(\tau,s), x(\tau,t)\}$ is a combinatorial line in $\{0,1\}^n$, unless $s = t$ in which case the two strings are equal. The associated line template is $\lambda \in \{0,1,\wild\}^n$, with \[ \lambda_i = \begin{cases}

                1      & \text{if $i \leq \min\{s,t\}$,}\\
                \wild  & \text{if $\min\{s,t\} < i \leq \max\{s,t\}$,} \\
                0      & \text{if $i > \max\{s,t\}$.}
           \end{cases}

\] This gives the definition of how to draw $\lambda \sim \eqs{3}^n$ (with alphabet $\{0,1,\wild\}$). Note that $\lambda$ is a degenerate template with probability $1/(n+1)$.

Assuming $\Pr_{x \sim \eqs{2}^n}[x \in A] \geq \delta$, our goal is to show that $\Pr[x(\tau,s), x(\tau,t) \in A ] \geq \delta^2$. But \begin{align*} \Pr[x(\tau,s), x(\tau,t) \in A] &= \Ex_{\tau} \Bigl[\Pr_{s,t} [x(\tau,s), x(\tau,t) \in A]\Bigr] & \\ &= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A] \Pr_{t}[x(\tau,s) \in A]\Bigr] &\text{(independence of $s$, $t$)} \\ &= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]^2\Bigr] & \\ &\geq \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]\Bigr]^2 & \text{(Cauchy-Schwarz)}\\ &= \Pr_{\tau,s} [x(\tau,s) \in A]^2, & \end{align*} and $x(\tau,s)$ has the distribution $\eqs{2}^n$, completing the proof. \end{proof}

Having proved the probabilistic DHJ($2$) theorem rather easily, an obvious question is whether we can generalize the proof to $k = 3$. The answer seems to be no; there is no obvious way to generate random length-$3$ lines in which the points are independent, or even partially independent as in the previous proof. Nevertheless, the equal-slices distribution remains important for our proof of the general case of DHJ($k$); in Section~\ref{sec:eqs} we shall introduce both $\eqs{k}^n$ and $\ens{k}^n$ and prove some basic facts about them.