Sperner.tex
\section{Remarks about Sperner's theorem}
The first non-trivial case of Szemer\'edi's theorem is the case of arithmetic progressions of length 3. However, for the density Hales-Jewett theorem even the case $k=2$ is interesting. For the purposes of this discussion, let us define $[2]$ to be the set $\{0,1\}$. Then a combinatorial line in $[2]^n$ is a pair of $01$ sequences $x$ and $y$, such that to obtain $y$ from $x$ one changes a few $0$s to $1$s. If we think of the sequences $x$ and $y$ as the characteristic functions of two subsets $A$ and $B$ of $[n]$, then this is saying that $A$ is a proper subset of $B$. Therefore, when $k=2$ we can formulate the density Hales-Jewett theorem as follows: for every $\delta>0$ there exists $n$ such that if $\mathcal{A}$ is a collection of at least $\delta 2^n$ subsets of $[n]$, then there must exist two distinct sets $A,B\in\mathcal{A}$ with $A\subset B$.
This result has been known for a long time: it is an immediate consequence of the following basic result in extremal combinatorics due to Sperner. Recall that an \textit{antichain} is a collection $\mathcal{A}$ of sets such that no set in $\mathcal{A}$ is a proper subset of any other.
\begin{theorem} \label{Sperner} For every positive integer $n$, the largest cardinality of any antichain of subsets of $[n]$ is $\binom n{\lfloor n/2\rfloor}$. \end{theorem}
\begin{proof} As the bound suggests, the best possible example is the collection of all subsets of $[n]$ of size $\lfloor n/2\rfloor$. (It can be shown that this example is essentially unique: the only other example is to take all sets of size $\lceil n/2\rceil$, and even this is different only when $n$ is odd.)
In the other direction, consider the following way of choosing a random subset of $[n]$. First, we choose, uniformly at random, a permutation $\pi$ of $[n]$. Next, we choose, uniformly at random and independently of $\pi$, an integer $j$ from the set $\{0,1,\dots,n\}$. Finally, we set $A=\{\pi(1),\dots,\pi(j)\}$ (where this is interpreted as the empty set if $j=0$).
Let $\mathcal{A}$ be an antichain. Then the probability that a set $A$ that is chosen randomly in the above manner belongs to $\mathcal{A}$ is at most $1/(n+1)$, since whatever $\pi$ is at most one of the $n+1$ sets $\{\pi(1),\dots,\pi(j)\}$ can belong to $\mathcal{A}$.
However, what we are really interested in is the probability that $A\in\mathcal{A}$ if $A$ is chosen \textit{uniformly} from all subsets of $[n]$. Let us write $\mu(A)$ for the probability that we choose $A$ according to the distribution defined above, and $\nu(A)$ for the probability that we choose it uniformly. Then $\nu(A)=2^{-n}$ for every $A$, whereas $\mu(A)=\frac 1{n+1}\binom n{|A|}^{-1}$, since there is a probability $1/(n+1)$ that $j=|A|$, and all sets of size $|A|$ are equally likely to be chosen. Therefore, the largest ratio of $\nu(A)$ to $\mu(A)$ occurs when $|A|=\lfloor n/2\rfloor$ or $\lceil n/2\rceil$. In this case, the ratio is $(n+1)\binom n{\lfloor n/2\rfloor}2^{-n}$.
Since $\mu(\mathcal{A})\leq 1/(n+1)$, it follows that $2^{-n}|\mathcal{A}|=\nu(\mathcal{A})\leq\binom n{\lfloor n/2\rfloor}2^{-n}$, which proves the theorem. \end{proof}
We have given this proof in order to draw attention to the fact that it is very natural to consider two different measures on $\{0,1\}^n$, or equivalently on the set of all subsets of $[n]$. The first is the uniform measure, which is forced on us by the way the question is phrased. The second is what one might call the ``equal-slices measure, where the cardinality of a set is chosen uniformly at random and then the set is chosen uniformly at random given that cardinality. This name comes from the fact that it is common to refer to sets of the form $\{A\subset[n]:|A|=k\}$ as \textit{slices} of the discrete cube, and the measure $\nu$ gives equal density to each slice.
After seeing the above proof, one might take the attitude that the ``correct statement of Sperner's theorem is that if $\mathcal{A}$ is an antichain, then $\mu(\mathcal)\leq 1/(n+1)$, and that the statement given above is a slightly artificial and strictly weaker consequence.
Since Sperner's theorem is the first non-trivial case of the theorem we are trying to prove, an obvious question is whether we can generalize the above proof. The answer seems to be no, essentially because there is no obvious analogue of the system of nested intervals that plays such an important role. However, the idea of looking at equal-slices measure does turn out to be very useful technically, so in the next section we shall define it for subsets of $[k]^n$ and prove a few simple facts about it for use later in the proof.