From Polymath1Wiki
Jump to: navigation, search

\section{Sperner's theorem} \label{sec:sperner}

The first nontrivial case of Szemer\'edi's theorem is the case of arithmetic progressions of length $3$. However, for the density Hales--Jewett theorem even the case $k=2$ is interesting. DHJ($2$) follows from a basic result in extremal combinatorics: Sperner's theorem. \ignore{This result has been known for a long time. \noteryan{amusingly, published after vdW's theorem}} In this section we review a standard probabilistic proof of Sperner's theorem. Besides suggesting the equal-slices distribution, this proof easily gives the $k=2$ case of the probabilistic Density Hales--Jewett theorem, a key component in our proof of DHJ($3$).

To investigate DHJ($2$) it slightly more convenient to take the alphabet to be $\Omega = \{0,1\}$. Then a combinatorial line in $\{0,1\}^n$ is a pair of distinct binary strings $x$ and $y$ such that to obtain $y$ from $x$ one changes some $0$'s to $1$'s. If we think of the strings $x$ and $y$ as the indicators of two subsets $X$ and $Y$ of $[n]$, then this is saying that $X$ is a proper subset of $Y$. Therefore, when $k=2$ we can formulate the density Hales-Jewett theorem as follows: there exists $\dhj{2}{\delta}$ such that for $n \geq \dhj{2}{\delta}$, if $\calA$ is a collection of at least $\delta 2^n$ subsets of $[n]$, then there must exist two distinct sets $X,Y \in \calA$ with $X \subset Y$. In the language of combinatorics, this is saying that $\calA$ is \emph{not} an antichain. (Recall that an \textit{antichain} is a collection $\mathcal{A}$ of sets such that no set in $\mathcal{A}$ is a proper subset of any other.) Sperner's theorem gives something slightly stronger: a precise lower bound on the cardinality of any antichain.

\begin{named}{Sperner's theorem} \label{thm:sperner} For every positive integer $n$, the largest cardinality of any antichain of subsets of $[n]$ is $\binom{n}{\lfloor n/2\rfloor}$. \end{named} As the bound suggests, the best possible example is the collection of all subsets of $[n]$ of size $\lfloor n/2\rfloor$. (It can be shown that this example is essentially unique: the only other example is to take all sets of size $\lceil n/2\rceil$, and even this is different only when $n$ is odd.) It is well known that $\binom{n}{\lfloor n/2\rfloor}2^{-n} \geq 1/2\sqrt{n}$ for all $n$; hence Sperner's theorem implies that one may take $\dhj{2}{\delta} = 4/\delta^2$.\noteryan{The constant can be sharpened to $\pi/2$ for small $\delta$, of course. Also, $4/\delta^2$ is technically not an integer.} Let us present a standard probabilistic proof of Sperner's theorem (see, e.g.,~\cite{Spe90}):

\begin{proof} (\emph{Sperner's theorem.}) Consider the following way of choosing a random subset of $[n]$. First, we choose, uniformly at random, a permutation $\tau$ of $[n]$. Next, we choose, uniformly at random and independently of $\tau$, an integer $s$ from the set $\{0,1,\dots,n\}$. Finally, we set $X=\{\tau(1),\dots,\tau(s)\}$ (where this is interpreted as the empty set if $s=0$).

Let $\mathcal{A}$ be an antichain. Then the probability that a set $X$ that is chosen randomly in the above manner belongs to $\mathcal{A}$ is at most $1/(n+1)$, since whatever $\tau$ is at most one of the $n+1$ sets $\{\tau(1),\dots,\tau(s)\}$ can belong to $\mathcal{A}$.

However, what we are really interested in is the probability that $X\in\mathcal{A}$ if $X$ is chosen \textit{uniformly} from all subsets of $[n]$. Let us write $\eqs{2}[X]$ for the probability that we choose $X$ according to the distribution defined above, and $\unif_2[X]$ for the probability that we choose it uniformly. Then $\unif_2[X]=2^{-n}$ for every $X$, whereas $\eqs{2}[X]=\frac 1{n+1}\binom n{\abs{X}}^{-1}$, since there is a probability $1/(n+1)$ that $s = \abs{X}$, and all sets of size $\abs{X}$ are equally likely to be chosen. Therefore, the largest ratio of $\unif_2[X]$ to $\eqs{2}[X]$ occurs when $\abs{X}=\lfloor n/2\rfloor$ or $\lceil n/2\rceil$.\noteryan{by unimodality of binomial coefficients} In this case, the ratio is $(n+1)\binom n{\lfloor n/2\rfloor}2^{-n}$.

Since $\eqs{2}(\mathcal{A})\leq 1/(n+1)$, it follows that $2^{-n}\abs{\mathcal{A}}=\unif_2(\mathcal{A})\leq\binom n{\lfloor n/2\rfloor}2^{-n}$, which proves the theorem. \end{proof}

As one sees from the proof, it is very natural to consider different probability distributions on $\{0,1\}^n$, or equivalently on the set of all subsets of $[n]$. The first is the uniform distribution $\unif_2$, which is forced on us by the way the question is phrased. The second is what we called $\eqs{2}$; the reader may check that this is precisely the ``equal-slices distribution $\eqs{2}^n$ described in Section~\ref{sec:pdhj}. After seeing the above proof, one might take the attitude that the ``correct statement of Sperner's theorem is that if $\calA$ is an antichain, then $\eqs{2}(\calA) \leq 1/(n+1)$, and that the statement given above is a slightly artificial and strictly weaker consequence.

\subsection{Probabilistic DHJ(\texorpdfstring{$2$}{2})} \label{sec:prob-sperner}

Indeed, what the proof (essentially) establishes is the ``equal-slices DHJ($2$) theorem; i.e., that in Theorem~\ref{thm:edhj} one may take $\edhj{2}{\delta} = 1/\delta$.\noteryan{minus $1$, even} We say ``essentially because of the small distinction between the distribution $\eqs{2}^n$ used in the proof and the distribution $\ens{2}^n$ in the statement. It will be convenient in this introductory discussion of Sperner's theorem to casually ignore this. We will introduce $\ens{k}^n$ and be more careful about its distinction with $\eqs{k}^n$ in Section~\ref{sec:eqs}.

To further bolster the claim that $\eqs{2}^n$ is natural in this context we will show an easy proof of the \emph{probabilistic} DHJ($2$) theorem. Looking at the statement of Theorem~\ref{thm:pdhj}, the reader will see it requires defining $\eqs{3}^n$; we will make this definition in the course of the proof. \begin{lemma} \label{lem:p-sperner} For every real $\delta>0$, every $A \subseteq \{0,1\}^n$ with $\eqs{2}^n$-density at least $\delta$ satisfies \[ \Pr_{\lambda \sim \eqs{3}^n}[\lambda \subseteq A] \geq \delta^2. \] \end{lemma}

\noindent Note that there is no lower bound necessary on $n$; this is because a template $\lambda \sim \eqs{3}^n$ may be degenerate.

\begin{proof} As in our proof of Sperner's theorem, let us choose a permutation $\tau$ of $[n]$ uniformly at random. Suppose we now choose $s \in \{0, 1, \dotsc, n\}$ and also $t \in \{0, 1, \dotsc, n\}$ \emph{independently}. Let $x(\tau,s) \in \{0,1\}^n$ denote the string which has $1$'s in coordinates $\tau(1), \dotsc, \tau(s)$ and $0$'s in coordinates $\tau(s+1), \dotsc, \tau(n)$, and similarly define $x(\tau,t)$. These two strings both have the distribution $\eqs{2}^n$, but are not independent.

A key observation is that $\{x(\tau,s), x(\tau,t)\}$ is a combinatorial line in $\{0,1\}^n$, unless $s = t$ in which case the two strings are equal. The associated line template is $\lambda \in \{0,1,\wild\}^n$, with \[ \lambda_i = \begin{cases}

                1      & \text{if $i \leq \min\{s,t\}$,}\\
                \wild  & \text{if $\min\{s,t\} < i \leq \max\{s,t\}$,} \\
                0      & \text{if $i > \max\{s,t\}$.}

\] This gives the definition of how to draw $\lambda \sim \eqs{3}^n$ (with alphabet $\{0,1,\wild\}$). Note that $\lambda$ is a degenerate template with probability $1/(n+1)$.

Assuming $\Pr_{x \sim \eqs{2}^n}[x \in A] \geq \delta$, our goal is to show that $\Pr[x(\tau,s), x(\tau,t) \in A ] \geq \delta^2$. But \begin{align*} \Pr[x(\tau,s), x(\tau,t) \in A] &= \Ex_{\tau} \Bigl[\Pr_{s,t} [x(\tau,s), x(\tau,t) \in A]\Bigr] & \\ &= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A] \Pr_{t}[x(\tau,s) \in A]\Bigr] &\text{(independence of $s$, $t$)} \\ &= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]^2\Bigr] & \\ &\geq \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]\Bigr]^2 & \text{(Cauchy-Schwarz)}\\ &= \Pr_{\tau,s} [x(\tau,s) \in A]^2, & \end{align*} and $x(\tau,s)$ has the distribution $\eqs{2}^n$, completing the proof. \end{proof}

Having proved the probabilistic DHJ($2$) theorem rather easily, an obvious question is whether we can generalize the proof to $k = 3$. The answer seems to be no; there is no obvious way to generate random length-$3$ lines in which the points are independent, or even partially independent as in the previous proof. Nevertheless, the equal-slices distribution remains important for our proof of the general case of DHJ($k$); in Section~\ref{sec:eqs} we shall introduce both $\eqs{k}^n$ and $\ens{k}^n$ and prove some basic facts about them.