Intro.tex

From Polymath Wiki
Jump to navigationJump to search

\section{Introduction}

The purpose of this paper is to give the first fully combinatorial\noteryan{Perhaps ``finitary is better?} proof of a theorem of Furstenberg and Katznelson~\cite{FK89,FK91} that has the same relation to the Hales-Jewett theorem as Szemer\'edi's theorem has to van der Waerden's theorem. Before we go any further, let us state all four theorems. We shall use the notation $[k]$ to stand for the set $\{1,2,\dots,k\}$.\ignore{(or sometimes, particularly when $k$ is small, for the set $\{0,1,\dots,k-1\}$)} If $X$ is a set and $r$ is a positive integer, then an $r$-\textit{colouring} of $X$ will mean a function $\kappa:X\rightarrow [r]$ for some set $C$ of size $r$. A subset $Y$ of $X$ is called \textit{monochromatic} if $\kappa(y)$ is the same for every $y\in Y$.

First, let us state van der Waerden's theorem and Szemer\'edi's theorem.

\begin{theorem} \label{vdw} For every pair of positive integers $k$ and $r$ there exists $N$ such that for every $r$-colouring of $[N]$ there is a monochromatic arithmetic progression of length $k$. \end{theorem}

\begin{theorem}\label{szem} For every positive integer $k$ and every $\delta>0$ there exists $N$ such that every subset $A\subset[N]$\noteryan{I have a very slight preference for $\subseteq$ here; doesn't really matter, but we should unify throughout} of size at least $\delta N$ contains an arithmetic progression of length $k$. \end{theorem}

It is usually better to focus on the \textit{density} $|A|/N$ of a subset $A\subset[N]$ than on its cardinality, since this gives us a parameter that we can think of independently of $N$. Szemer\'edi's theorem is often referred to as \textit{the density version} of van der Waerden's theorem.

To state the Hales-Jewett theorem, we need a little more terminology. The theorem is concerned with subsets of $[k]^n$, elements of which are referred to as \emph{points}. More generally, we consider \emph{strings} from $\Sigma^n$, where $\Sigma$ is a finite \emph{alphabet}. A \emph{line template} (over $[k]$) is a string $\lambda \in ([k] \cup \{\star\})^n$, and the associated \emph{(combinatorial) line} is the set of points \[ \{\chg{\lambda}{\star}{1}, \dots, \chg{\lambda}{\star}{k}\}. \] Here we have introduced the notation $\chg{z}{i}{j}$ for the string formed by changing all occurrences of symbol $i$ to symbol $j$ in string $z$. For instance, when $k = 3$ and $\lambda = \star 1 3 \!\star \!\star$, the associated combinatorial line is the following set of $3$ points: \[ \{11311, 21322, 31333\}. \] We say a line (template) over $[k]$ is \emph{degenerate} if it contains no occurrences of $\star$; i.e., if the line contains $1$ point rather than the expected $k$ points. \ignore{ Let $x=(x_1,\dots,x_n)$ be an element of $[k]^n$, let $W\subset[n]$ and let $j\in[k]$. Let us write $x\oplus jW$ for the sequence $(y_1,\dots,y_n)$ such that $y_i=x_i$ if $i\notin A$ and $y_i=j$ if $i\in A$. That is, we take the sequence $x$ and overwrite the terms with index in $W$ with a $j$. For example, if $k=3$, $n=5$, then \begin{equation*} (1,1,3,2,1)+2\{1,4,5\}=(2,1,3,2,2). \end{equation*} The \textit{combinatorial line} $x\oplus[k]W$ is the set $\{x\oplus jW:j\in [k]\}$. For instance, \begin{equation*} (1,1,3,2,1)\oplus[3]\{1,4,5\}=\{(1,1,3,1,1),(2,1,3,2,2),(3,1,3,3,3)\}. \end{equation*} It may seem strange that the combinatorial line $x\oplus [k]W$ does not depend on the values of $x_i$ with $i\in A$, but this turns out to be a useful convention and makes it possible to state results in a concise way.}

We are now ready to state the Hales-Jewett theorem. \begin{theorem}\label{hj} For every pair of positive integers $k$ and $r$ there exists $n$ such that for every $r$-colouring of the set $[k]^n$ there is a monochromatic nondegenerate combinatorial line. \end{theorem} The theorem of Furstenberg and Katznelson that we shall prove in a different way in this paper is the following density version of the Hales-Jewett theorem. \begin{theorem}\label{dhj} For every $k$ and every $\delta>0$ there exists $n$ such that every subset of $[k]^n$ of size at least $\delta k^n$ contains a nondegenerate combinatorial line.\noteryan{Would be good to point out somewhere that as soon as it works for one $n$ it works for all larger $n$ too.} \end{theorem}

Why is it interesting to give a new proof of this result? There are two main reasons. The first is connected with the history of results in this area. One of the main benefits of Furstenberg's proof of Szemer\'edi's theorem was that it introduced a technique that could be developed in many directions, which did not seem to be the case with Szemer\'edi's proof. As a result, many far-reaching generalizations of Szemer\'edi's theorem were proved, and for a long time nobody could prove them in any other way than by using Furstenberg's methods. In the last few years that has changed, and a programme has developed to find new and finitary proofs of the results that were previously known only by infinitary ergodic methods. Giving a non-ergodic proof of the density Hales-Jewett theorem was seen as a key goal for this programme: on the ergodic side it appeared to be significantly harder than Szemer\'edi's theorem, and this seemed to be reflected by certain significant combinatorial difficulties that we shall discuss later in this paper.

A second reason is that the density Hales-Jewett theorem implies Szemer\'edi's theorem, and finding a new proof of Szemer\'edi's theorem seems always to be illuminating---or at least this has been the case for the four main approaches discovered so far. In retrospect, one can add to this the fact that the proof we have discovered is, against all our expectations, \textit{simpler} than the previous approaches to Szemer\'edi's theorem: it seems that by looking at a more general problem we have removed some of the difficulty. Related to this is another surprise. We started out by trying to prove the first difficult case of the theorem, which is when $k=3$. The experience of all four of the earlier proofs of Szemer\'edi's theorem has been that interesting ideas are needed to prove results about progressions of length $3$, but significant extra difficulties arise when one tries to generalize an argument from the length-$3$ case to the general case. However, it turned out that once we had proved the case $k=3$ of the density Hales-Jewett theorem, it was straightforward to generalize the argument to the general case. We still do not have a convincing explanation of why our proof should differ from all the others in this respect.

Here, briefly, is how to deduce Szemer\'edi's theorem from the density Hales-Jewett theorem. Given $k$ and $\delta$, let $n$ be such that the density Hales-Jewett theorem holds for $k$ and $\delta$ and let $N=k^n$. There is no harm in thinking of $[N]$ as the set $\{0,1,\dots,N-1\}$. Now given $A \subset [N]$ of density $\delta$, we can write the integers in $A$ in base $k$, thus thinking of it as a subset of density $\delta$ in~$[k]^n$ (where again, there is no harm in thinking of $[k]$ as $\{0, 1, \dots, k-1\}$). The density Hales-Jewett theorem gives a nondegenerate combinatorial line in $A$, and it is easy to see that this corresponds to a length-$k$ arithmetic progression: if the line's template is $\lambda$, with $S = \{i : \lambda_i = \star\}$, then the progression's common difference is $\sum_{i \in S} k^{n-i}$. \ignore{Thinking of $[k]$ as the set $\{0,1,\dots,k-1\}$ and $[N]$ as the set $\{0,1,\dots,N-1\}$, let $\phi:[k]^n\rightarrow[N]$ be the map that takes the sequence $(x_1,\dots,x_n)$ to the number $x_1+x_2k+x_3k^2+\dots+x_nk^{n-1}$, and note that $\phi$ is a bijection.\noteryan{Could we say here the catchier phrase, ``write the integers in $A$ in base $k$?} Now let $A\subset[N]$ be a set of density $\delta$. Then $\phi^{-1}(A)$ is a subset of $[k]^n$ of density $\delta$, so by the density Hales-Jewett theorem it contains a combinatorial line. It is easy to see that the image of a combinatorial line under $\phi$ is an arithmetic progression of length $k$: indeed, if the combinatorial line is $x\oplus[k]W$,\noteryan{should convert to new notation} then the first term of the arithmetic progression is $\phi(x\oplus 0W)$ and the common difference is $\sum_{i\in W}k^i$. A very similar argument shows that the Hales-Jewett theorem implies van der Waerden's theorem.}

We remark here that it is also possible to deduce the multidimensional Szemer\'edi theorem from the density Hales-Jewett theorem. Previously there were two known approaches: the ergodic approach and approaches that use hypergraph regularity.

Before we start working towards the proof of the theorem, we would like briefly to mention that it was proved in a rather unusual ``open source" way, which is why it is being published under a pseudonym. The work was done by several mathematicians, who wrote their thoughts, as they had them, in the form of blog comments. Anybody who wanted to could participate, and at all stages of the process the comments were fully open to anybody who was interested. This was in complete contrast to the usual way that results are proved in private and presented in a finished form. The blog comments are still available, so although this paper is a somewhat polished account of the argument, it is possible to read a record of the entire thought process that led to the proof. The participants also created a wiki, which contains sketches of the argument, links to the blog comments, and a great deal of related material. The URL is \url{http://michaelnielsen.org/polymath1/}.\\

\textbf{[[Tim: 1. Perhaps it is worth giving the deduction of multidimensional Szemer\'edi as it is an excuse to introduce the notion of a combinatorial subspace, which we will need later. \noteryan{Yes; might also be convenient to identify $\star$ with $k+1$ and then one gets an easy way to define subspaces (the character $k+2$ gives the coordinates in the second dimension, etc.)} 2. Somewhere we should mention Tim Austin's work and how it relates to ours. Perhaps Terry is best placed to do this.]]}


\subsection{XXX Definitions needed for later that need to be put somewhere for now}

\begin{definition} Given a distribution $\pi$ on $\Omega$, write $\min(\pi) = \min_{x \in \Omega} \pi[x]$. \end{definition}

\begin{definition} For $\pi$ and $\nu$ probability distributions on the same finite set $\Omega$, the \emph{total variation distance} $\dtv{\pi}{\nu}$ is defined by \[ \dtv{\pi}{\nu} = \frac12 \sum_{x \in \Omega} |\pi[x] - \nu[x]| = \max_{A \subseteq X} |\pi[A] - \nu[A]|. \] \end{definition}