# Concepts.tex

\section{Basic concepts} \label{sec:concepts}

In this section we introduce some basic concepts around the density Hales--Jewett theorem, and also state formally the extensions of DHJ($k$) that we will prove.

\subsection{Points, lines, and line templates} In the density Hales--Jewett theorem the arithmetic properties of $[k] = \{1, 2, \dotsc, k\}$ play no role. It is only important that this set has $k$ elements; this dictates the length of a combinatorial line. Therefore we can equivalently replace $[k]$ by any other \emph{alphabet} $\Omega$ of $k$ \emph{symbols}, treating the elements of $\Omega^n$ as \emph{strings}. We will typically use letters $a$, $b$,~\ldots for symbols, $x$, $y$,~\ldots for strings/points, and $i$, $j$,~\ldots for coordinates; e.g., $x_j = b$ means the $j$th coordinate of string $x$ is symbol $b$.

We have already seen that it was convenient to use a different alphabet $\Omega$ when we took $\Omega = \{0, 1, \dots, k-1\}$ to deduce van der Waerden's theorem from Hales--Jewett. A more interesting observation is that a combinatorial line template over $[k]^n$, i.e.\ a string in $([k] \cup \{\wild\})^n$, is again just a string over an alphabet of size $k+1$. If we use the symbol `$k + 1$' in place of $\wild$, we see that \emph{a point in $[k+1]^n$ can be interpreted as a line in $[k]^n$}. For example, the point $13332213 \in [3]^8$ corresponds to the length-$2$ line $\{1\bs{1}\bs{1}\bs{1}221\bs{1}, 1\bs{2}\bs{2}\bs{2}221\bs{2}\} \subset [2]^8$. We introduce the notation $ \chg{x}{b}{a} $ for the string formed by changing all occurrences of symbol $b$ to symbol $a$ in string $x$. Thus a line template $\lambda \in ([3] \cup \{\wild\})^n$ corresponds to the line $\{\chg{\lambda}{\wild}{1}, \chg{\lambda}{\wild}{2}, \chg{\lambda}{\wild}{3}\} \subseteq [3]^n$.

Let us give some formal definitions, taking care of the somewhat tedious possibility of degenerate line templates:
\begin{definition} We say that $x \in \Omega^n$ is a \emph{degenerate string} if it is missing at least one symbol from $\Omega$. A \emph{line template over $\Omega^n$ with wildcard symbol $c$} ($\not \in \Omega$) is a string $\lambda \in (\Omega \cup \{c\})^n$. We say $z$ is a \emph{degenerate line template} if $z$ contains no `$c$' symbols. If $\lambda$ is nondegenerate, we associate to it the \emph{combinatorial line} $\{\chg{\lambda}{c}{a} : a \in \Omega\} \subseteq \Omega^n$, which has cardinality $|\Omega|$.
\end{definition}
We will often blur the distinction between a line template and a line; however, please note that for us a ``line* is always nondegenerate, whereas a line template may not be.*

\subsection{Subspaces} As mentioned, we will find it useful to think of lines over $[k]^n$ as points in $[k+1]^n$, with $k+1$ serving as the wildcard symbol. It's natural then to ask for an interpretation of a point in, say, $[k+2]^n$. This can be thought of as a combinatorial \emph{plane} over $[k]^n$. For simplicity, let's consider $k = 3$ and replace $[k+2]$ with $[3] \cup \{\wild_1, \wild_2\}$. A string in $([3] \cup \{\wild_1, \wild_2\})^n$ such as \[ \sigma = 1 3 \wild_1 \wild_2 2 1 \wild_1 \wild_1 3 \wild_2 1 \] (here $n = 11$) corresponds to the following $9$-point combinatorial plane: \[ \begin{array}{rcl} \phantom{.[3]^{11} \supset} \{13\bs{1}\bs{1}21\bs{1}\bs{1}3\bs{1}1,&13\bs{2}\bs{1}21\bs{2}\bs{2}3\bs{1}1,&13\bs{3}\bs{1}21\bs{3}\bs{3}3\bs{1}1, \\

13\bs{1}\bs{2}21\bs{1}\bs{1}3\bs{2}1,&13\bs{2}\bs{2}21\bs{2}\bs{2}3\bs{2}1,&13\bs{3}\bs{2}21\bs{3}\bs{3}3\bs{2}1, \\ 13\bs{1}\bs{3}21\bs{1}\bs{1}3\bs{3}1,&13\bs{2}\bs{3}21\bs{2}\bs{2}3\bs{3}1,&13\bs{3}\bs{3}21\bs{3}\bs{3}3\bs{3}1\} \subset [3]^{11}. \\

\end{array} \] More generally: \begin{definition} For $d \geq 1$, a \emph{$d$-dimensional subspace template over $\Omega^n$ with wildcard symbols $c_1, \dots, c_d$} (distinct symbols not in $\Omega$) is a string $\sigma \in (\Omega \cup \{c_1, \dots, c_d\})^n$. We say $\sigma$ is a \emph{degenerate template} if $\sigma$ is missing any of the $c_i$ symbols. If $\sigma$ is nondegenerate, we associate to it the \emph{$d$-dimensional combinatorial subspace} $\{\chg{\sigma}{(c_1, \dotsc, c_d)}{(a_1, \dotsc, a_d)} : (a_1, \dotsc, a_d) \in \Omega^d\} \subseteq \Omega^n$, which has cardinality $|\Omega|^d$. \end{definition} Here we have introduced the extended notation $\chg{x}{(c_1, \dotsc, c_d)}{(a_1, \dotsc, a_d)}$ for the string in $\Omega^n$ formed by changing all occurrences of symbol $c_i$ to symbol $a_i$ in string $x$, simultaneously for all $i \in [d]$.

The \emph{multidimensional} density Hales--Jewett theorem (also proved by Furstenberg and Katznelson~\cite{FK91}) states that dense subsets of $[k]^n$ contain not just lines but combinatorial subspaces: \begin{named}{Multidimensional density Hales--Jewett Theorem} \label{thm:mdhj} For every real $\delta > 0$ and every pair of positive integers $k$ and $d$, there exists a positive integer $\mdhj{k}{\delta}{d}$ such that every subset of $[k]^n$ of density at least $\delta$ contains a $d$-dimensional combinatorial subspace, provided $n \geq \mdhj{k}{\delta}{d}$. \end{named}

The $k = 2$ case of this theorem was established by Gunderson, R\"odl, and Sidorenko~\cite{GRS99}; using purely combinatorial means they proved the following:
\begin{named}{Gunderson--R\"odl--Sidorenko theorem} \label{thm:grs} One may take $\mdhj{2}{\delta}{d} = (10d)^{d2^d} \cdot (1/\delta)^{2^d}$.
\end{named}
\noindent(To see this from~\cite{GRS99}, combine that paper's Theorem~4.2 and equation~8, and note that its ``$c(d)$* is at most $(10d)^d$.) The authors also showed this bound is fairly close to being sharp.*
\ignore{
\begin{theorem}
Let $A \subseteq [2]^n$ have (uniform) density at least $\delta$, and assume $n \geq (10d)^{d2^d} \cdot (1/\delta)^{2^d}$. Then $A$ contains a nondegenerate $d$-dimensional subspace.
\end{theorem}
\begin{proof}
Theorem 4.2 and equation~(8) of~\cite{GRS99} taken together state that the maximum density of a $d$-dimensional subspace-free subset of $[2]^n$ is at most
\[
c(d) \cdot n^{-1/2^d}, \quad \text{where } c(d) = 10^d 2^{-1/2^{d-1}}d^{d - 1/2^d}.
\]
Note that $c(d) < (10d)^d$. Hence it suffices to check that our lower bound on $n$ implies $\delta \geq (10d)^d n^{-1/2^d}$.
\end{proof}}

\subsubsection{Passing to a subspace}
A useful aspect of subspaces is that they ``look like* the original space.\noteryan{I plagiarized this phrase from Walters.} More precisely, a $d$-dimensional subspace $\{\chg{\sigma}{(c_1, \dotsc, c_d)}{(a_1, \dotsc, a_d)} : (a_1, \dotsc, a_d) \in \Omega^d\} \subseteq \Omega^n$ is isomorphic to $\Omega^d$, in the sense that $d'$-dimensional subspaces of $\sigma$ map to $d'$-dimensional subspaces of $\Omega^n$. This is identification is easiest to see when the subspace template $\sigma$ has exactly one copy of each wildcard symbol $c_i$. In this case, writing $J$ for the set of $d$ wildcard coordinates, we will call $\sigma \cong \Omega^J$ a \emph{restriction} to the coordinates $J$ (see Definition~\ref{def:restriction}).*

This isomorphism is useful because it allows us to ``pass to subspaces* when looking for combinatorial lines (or subspace) in a subset $A \subseteq [k]^n$. By this we mean the following: Given a $d$-dimensional subspace $\sigma$ of $[k]^n$, passing to a $\sigma$ means considering $A' = A \cap \sigma$ and as a subset of $[k]^d$. Note that if we manage to find a line (or subspace) contained in $A'$, this maps back to a line (subspace) contained in $A$. This is helpful is $A'$ has some kind of improved ``structure*; e.g., a higher density (within $[k]^d$). Indeed, finding density increments on subspaces is the basic strategy for our proof of DHJ($k$); see Section~\ref{sec:outline}. The technique also gives an easy method for deducing mDHJ($k$) from DHJ($k$) (although with very poor quantitative dependence on $d$).

When passing to a subspace, we will henceforth not make make a distinction between $A$ and $A'$, writing $A$ for both.

\subsection{Probabilistic DHJ}
To show that a certain combinatorial object exists, the Probabilistic Method suggests choosing it a random. Could this work for the density Hales--Jewett theorem? A difficulty is that it is not immediately clear how one should choose a random combinatorial line. The most obvious way to choose a line over $[3]^n$, say, is to simply choose a line template $\lambda \in [4]^n$ uniformly at random (taking care of the extremely unlikely chance of a degenerate template). Unfortunately this has no real chance of working because the uniform distribution on lines is not ``compatible* with the uniform distribution on points. *

\ignore{Before clarifying this point, let us introduce some notation. We write $\mu_{\Omega}$ for the uniform probability distribution on $\Omega$\noteryan{not bothering to point out that $\Omega$ will always denote a finite set} and $\mu_k$ in the case $\Omega = [k]$. We will also write $\mu_\Omega^{\otimes n}$ for the associated product distribution on $\Omega^n$, which is just the uniform distribution. Given $A \subseteq \Omega^n$ we write \[ \mu_\Omega^{\otimes n}(A) = \Prx_{x \sim \Omega^{\otimes n}(A)}[x \in A], \] for the (uniform-distribution) density of $A$ in $\Omega^n$. We will also write this as $\mu_\Omega(A)$ or even just $\mu(A)$ when the meaning is clear from context.

Let us now return to the difficulty with choosing lines uniformly at random.} To clarify, suppose we choose $\lambda \in [4]^n$ according to the uniform distribution. Then the distribution on the point $\chg{\lambda}{4}{1} \in [3]^n$ will be a product distribution $\pi^{\otimes n}$ in which symbol $1$ has probability $1/2$ and symbols $2$ and $3$ have probability $1/4$ each. However this distribution is very ``far* from the uniform distribution on $[3]^n$, meaning that even if $A$ has density $\delta$ (i.e., a uniformly random $x \in [3]^n$ is in $A$ with probability $\delta$), the probability that $\chg{\lambda}{4}{1} \in [3]^n$ may be a negligibly small function of $n$. *

It's inevitable that $\chg{\lambda}{4}{1}$ will ``have more $1$'s than $2$'s or $3$'s*, and similarly for $\chg{\lambda}{4}{2}$, $\chg{\lambda}{4}{3}$. To evade this difficulty we take inspiration from the probabilistic proof Sperner's theorem~\cite{Spe90}\noteryan{earliest citation I could find; note that Lubell's proof is a bit different} and introduce a new non-uniform, non-product distribution on points in $[k]^n$. We call this distribution the \emph{equal-slices} distribution and denote it by $\eqs{k}^n$. We will discuss $\eqs{2}^n$ in Section~\ref{sec:sperner} on Sperner's theorem, and the generalization $\eqs{k}^n$ in Section~\ref{sec:eqs}. For now we simply explain the name: To draw a string $x \in [2]^n$ from $\eqs{k}^n$, first choose a ``slice* $s \in \{0, 1, \dots, n\}$ uniformly, then choose $x$ to be a random string of with exactly $s$ many $2$'s. The more general distribution $\eqs{k}^n$ can be thought of as the uniform \emph{mixture} of all product distributions.

As we will see, equal-slices is the natural distribution to use for a probabilistic version of DHJ($k$). There is a small catch, which is that a line template $\lambda \sim \eqs{k}^n$ has a very tiny chance of being being degenerate. To compensate for this, we introduce the very closely-related distribution $\ens{k}^n$ (``equal-nondegenerate-slices*), which is $\eqs{k}^n$ conditioned on $\lambda$ being a nondegenerate \emph{string} (i.e., having at least one of each symbol in $[k]$). For more details, see Section~\ref{sec:eqs}. We then prove the following two theorems:*
\begin{theorem} \label{thm:edhj} (Equal-slices density Hales--Jewett theorem.) For every positive integer $k$ and real $\delta>0$ there exists a positive integer $\edhj{k}{\delta}$ such that every $A \subseteq [k]^n$ with $\ens{k}^n$-density at least $\delta$ (i.e., $\Pr_{x \sim \ens{k}^n}[x \in A] \geq \delta$) contains a combinatorial line, provided $n \geq \edhj{k}{\delta}$.
\end{theorem}
\begin{theorem} \label{thm:pdhj} (Probabilistic density Hales--Jewett theorem.) For every positive integer $k$ and real $\delta>0$ there exists a positive integer $\Pdhj{k}{\delta}$ and a positive real $\pdhj{k}{\delta}$ such that every $A \subseteq [k]^n$ with $\ens{k}^n$-density at least $\delta$ satisfies
\[
\Pr_{\lambda \sim \ens{k+1}^n}[\lambda \subseteq A] \geq \pdhj{k}{\delta},
\]
provided $n \geq \Pdhj{k}{\delta}$.
\end{theorem}
Theorem~\ref{thm:edhj} follows from the density Hales--Jewett theorem by playing around with probability distributions; see Section~XXX. Theorem~\ref{thm:pdhj} follows almost immediately from Theorem~\ref{thm:edhj}, thanks to some nice properties of the equal-nondegenerate-slices distribution. Finally, the same two deductions also give us the multidimensional versions of the theorems:
\begin{theorem} \label{thm:pmdhj} (Probabilistic multidimensional density Hales--Jewett theorem.) For every real $\delta > 0$ and every pair of positive integers $k$ and $d$, there exists a positive integer $\Pmdhj{k}{\delta}{d}$ and a positive real $\pmdhj{k}{\delta}{d}$ such that every $A \subseteq [k]^n$ with $\ens{k}^n$-density at least $\delta$ satisfies
\[
\Pr_{\sigma \sim \ens{k+d}^n}[\sigma \subseteq A] \geq \pmdhj{k}{\delta}{d},
\]
provided $n \geq \Pmdhj{k}{\delta}{d}$.
\end{theorem}

\subsection{XXX Definitions needed for later that need to be put somewhere for now}

\begin{definition} Given a distribution $\pi$ on $\Omega$, write $\min(\pi) = \min_{x \in \Omega} \pi[x]$. \end{definition}

\begin{definition} For $\pi$ and $\nu$ probability distributions on the same finite set $\Omega$, the \emph{total variation distance} $\dtv{\pi}{\nu}$ is defined by \[ \dtv{\pi}{\nu} = \frac12 \sum_{x \in \Omega} |\pi[x] - \nu[x]| = \max_{A \subseteq X} |\pi[A] - \nu[A]|. \] \end{definition}

\begin{fact} \label{fact:tv-mix} Let $(\nu_\kappa)_{\kappa \in K}$ be a family of distributions on $\Omega$, let $\lambda$ be a distribution on $K$, and let $\mu$ be the associated mixture distribution, given by drawing $\kappa \sim \lambda$ and then drawing from $\nu_\kappa$. Then\noteryan{need to prove? It's just the triangle inequality} \[ \dtv{\pi}{\mu} \leq \Ex_{\kappa \sim \lambda}[\dtv{\pi}{\nu_\kappa}]. \] \end{fact}