Outline.tex

From Polymath Wiki
Revision as of 20:21, 24 June 2009 by 67.186.58.92 (talk) (New page: \section{Outline of the proof} \label{sec:outline} In this section we sketch our proof of DHJ($3$), which involves reducing it to DHJ($2$). The general deduction of DHJ($k+1$) from DHJ($k...)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigationJump to search

\section{Outline of the proof} \label{sec:outline}

In this section we sketch our proof of DHJ($3$), which involves reducing it to DHJ($2$). The general deduction of DHJ($k+1$) from DHJ($k$) is not much more difficult.

\subsection{Passing between distributions, via subspaces} \label{sec:outline-dist} According to the statement of DHJ($3$), we are a given a subset $A \subseteq [3]^n$ with uniform density $\mu_3(A) \geq \delta$, and would like to find a combinatorial line contained in $A$, assuming $n$ is large enough. It is helpful to think of $\delta$ as a fixed explicit constant such as $1\%$ and to think of $n$ as enormous, so that any function $o_n(1)$ is negligible compared with $\delta$.

As we saw with Sperner's theorem, it is often more natural to think of the DHJ problem under the equal-(nondegenerate-)slices distribution. Our first task therefore is to reduce to proving Theorem~\ref{thm:edhj}, by ``embedding the equal-nondegenerate-slices distribution $\ens{3}$ in the uniform distribution $\unif_3^{\otimes n}$. To do this embedding, we define more explicitly the notion of \emph{restrictions}:

\begin{definition} \label{def:restriction} A \emph{restriction} on $[k]^n$ is a pair $(J, x_\barJ)$, where $J \subseteq [n]$ and $x_\barJ \in [k]^{\barJ}$, where $\barJ = [n] \setminus J$. It is thought of as a ``partial string, where the $\barJ$-coordinates are ``fixed according to $x_\barJ$ and the $J$-coordinates are still ``free. Given a fixing $y_J \in [k]^J$, we write $(x_\barJ, y_J)$ for the complete composite string in $[k]^n$.\noteryan{Although this notation looks awkward, I believe it's useful.} \end{definition}

\begin{definition} Given a set $A \subseteq [k]^n$ and a restriction $(J,x_\barJ)$, we sometimes write $A_{x_\barJ}$ for the subset of $[k]^{J}$ defined by $A_{x_\barJ} = \{y \in [k]^J : (x_{\barJ}, y_J) \in A\}$. \end{definition}

\begin{remark} As mentioned in Section~\ref{sec:pass-subspace}, a restriction $(J, x_\barJ)$ can be identified with a particular kind of combinatorial subspace. Specifically, it corresponds to the $\abs{J}$-dimensional subspace whose template is $(x_{\barJ}, w_J)$, where $w_J$ consists of $\abs{J}$ distinct wildcard symbols. The set $A_{x_\barJ}$ is renamed ``$A$ when we ``pass to this subspace. \end{remark}

Having defined restrictions, let's warm up by seeing how to embed the uniform distribution in the uniform distribution. (This was implicitly done in our deduction of multidimensional DHJ($k$) from DHJ($k$), Proposition~\ref{prop:mdhj-from-dhj}.) Suppose we have $\unif_3^{\ot n}(A) \geq \delta$. Fix an arbitrary subset of the coordinates $J \subset [n]$ with $\abs{J} = r$. Since $\unif_3^{\ot n}$ is a product distribution, if we draw $x_\barJ \sim \unif_3^{\ot \barJ}$ and $y_J \sim \unif_3^{\ot J}$ independently, then the composite string $z = (x_\barJ, y_J)$ is distributed as $\unif_3^{\ot n}$. Hence \[ \delta \leq \unif_3^{\ot n}(A) = \Pr_z[z \in A] = \Pr_{x_\barJ, y_J}[(x_\barJ, y_J) \in A] = \Ex_{x_\barJ}\Bigr[\Pr_{y_J}[(x_\barJ, y_J) \in A]\Bigr] = \Ex_{x_\barJ}\left[\unif_3^{\otimes J}(A_{x_\barJ})\right]. \] It follow that there must exist a \emph{particular} substring $x_\barJ$ for which $\unif_3^{\otimes J}(A_{x_\barJ}) \geq \delta$ as well. We can then pass to the subspace defined by $(J,x_\barJ)$, and we have constructed as new instance of the original problem: ``$n$ has decreased to $r = \abs{J}$, and ``$A$ (i.e., $A_{x_\barJ}$) still has $\unif_3(A) \geq \delta$.

Of course this hasn't gained us anything, but it illustrates the basic technique of passing to a subspace. We can now explain how to use this technique to pass to a different \emph{distribution}; namely, equal-nondegenerate-slices. Suppose we construct strings $z \in [3]^n$ as follows: we choose $r$ coordinates $J \subset [n]$ \emph{randomly}, we choose the substring $x_\barJ$ according to the uniform distribution $\unif_3^{\ot \barJ}$, and we choose the substring $y_J$ according to equal-nondegenerate-slices $\ens{3}^J$. Write $\wt{\unif}_3^n$ for the resulting distribution on the composite string $z = (x_\barJ, y_J)$. Now the count of each symbol in a draw from $\unif_3^{\ot n}$ tends to fluctuate within a range of size roughly $\sqrt{n}$; thus if $r \ll \sqrt{n}$, we may hope that the ``corrupted distribution $\wt{\unif}_3^n$ is still close to the original distribution $\unif_3^n$. In Section~\ref{sec:measures} we do some technical calculations to show that this is indeed the case; essentially, the total variation distance between distributions $\unif_3^{\otimes n}$ and $\wt{\unif}_3^n$ is at most $O(r/\sqrt{n})$. Thus by taking, say, $r \approx n^{1/4}$, we may conclude \[ \abs{\unif_3^{\otimes n}(A) - \wt{\unif}_3^n(A)} \leq O(1/n^{1/4}) \ll \delta \quad\Rightarrow\quad \wt{\unif}_3^n(A) \gtrapprox \delta. \] Hence \[ \delta \lessapprox \wt{\unif}_3^n(A) = \Pr_{J, x_\barJ, y_J}[(x_\barJ, y_J) \in A] = \Ex_{J, x_\barJ}\Bigr[\Pr_{y_J}[(x_\barJ, y_J) \in A]\Bigr] = \Ex_{J, x_\barJ}\left[\ens{3}^{J}(A_{x_\barJ})\right], \] and so again there must exist a particular $J$ and $x_\barJ$ such that $\ens{3}^J(A_{x_\barJ}) \gtrapprox \delta$. If we pass to the associated subspace, we get a set $A$ with \emph{equal-nondegenerate-slices} density at least $\delta$ (essentially), and ``$n$ has only decreased to $r \approx n^{1/4}$, which is rather affordable (especially in light of the quantitative bounds we ultimately have in Theorem~\ref{thm:our-dhj}). Now we may try to find a combinatorial line in this new $A$.

\subsection{The density increment strategy} \label{sec:outline-incr}

As we have seen, given a set $A$ of density at least $\delta$, it is not difficult to pass to a subspace and maintain density essentially $\delta$ (and we may even switch between probability distributions, if we wish). Our overall strategy for proving DHJ($3$) involves passing to subspaces on which $A$'s density \emph{increases} by a noticeable amount; specifically, from $\delta$ to $\delta + \Omega(\delta^3)$. (Here $\delta^3$ represents $\delta$ times the quantity $\delta^2 = \pdhj{2}{\delta}$ from Lemma~\ref{lem:p-sperner}.) More precisely, we will show the following result: \begin{theorem} \label{thm:incr3} Assume $n \geq 2 \uparrow^{(7)} (1/\delta)$.\noteryan{check this} If $\ens{3}^n(A) \geq \delta$, then either $A$ contains a combinatorial line, or we may pass to a subspace of dimension at least $\log^{(7)} n$ on which $\ens{3}(A) \geq \delta + \Omega(\delta^3)$. \end{theorem} \noindent Here and throughout $\log n$ denotes $\max\{2, \log_2 n\}$; $\log^{(7)}$ is this function iterated $7$ times.

Using such a ``density increment theorem is a well-known strategy\noteryan{references}, and yields the equal-slices DHJ($3$) theorem as follows: Assuming $n$ is ``sufficiently large to begin with, we either find a combinatorial line or we pass to a $\log^{(7)}(n)$-dimensional subspace on which $A$ has density at least $\delta + \Omega(\delta^3)$. Repeating this $C/7\delta^3$ times, for $C$ a sufficiently large absolute constant, we either find a combinatorial line somewhere along the way, or pass to a $\log^{(C/\delta^3)}(n)$-dimensional subspace on which $A$ has density at least $2\delta$. (All the while we are assuming the dimension is still ``sufficiently large.) Schematically, \[ \delta \xrightarrow{\text{$C/\delta^3$ many $\log$s}} 2\delta. \] Iterating this scheme, we get: \[ \delta \quad \xrightarrow{\text{$C/\delta^3$ many $\log$s}} \quad 2\delta \quad \xrightarrow{\text{$C/(2\delta)^3$ many $\log$s}} \quad 4\delta \quad \xrightarrow{\text{$C/(4\delta)^3$ many $\log$s}} \quad 8\delta \quad \longrightarrow \cdots \] Of course the density cannot exceed $1$, so eventually we \emph{must} find a combinatorial line, after at most $C/\delta^3 + C/8\delta^3 + C/64\delta^3 + \cdots \leq 2C/\delta^3$ many $\log$s. All the while we have assumed that the dimension is sufficiently large that the hypothesis of Theorem~\ref{thm:incr3} holds; to ensure this, it certainly suffices to ensure that the ``last dimension, $\log^{(2C/\delta^3)} n$, is at least $2 \uparrow^{(7)} (1/\delta)$. And this holds presuming the initial dimension, $n$, is at least $2 \upuparrows O(1/\delta^3)$. Thus we obtain the quantitative bound for $n$ given in our main Theorem~\ref{thm:our-dhj} (noting that the fourth-root arising in the previous section's reduction from DHJ($3$) to equal-slices DHJ($3$) is indeed negligible).

\subsection{Extending length-\texorpdfstring{$2$}{2} lines to length-\texorpdfstring{$3$}{3} lines} \label{sec:outline-corr} We now look to establish the density increment Theorem~\ref{thm:incr3}. In this section we will describe our basic strategy for finding lines in a subset $A \subseteq [3]^n$ with $\ens{3}(A) = \delta$. We will show that if the strategy does not work, we can obtain the density increment stated in Theorem~\ref{thm:incr3}. The idea is to begin with what we already know: probabilistic DHJ($2$), i.e., Lemma~\ref{lem:p-sperner}. At a high level, this can give us a large number of length-$2$ lines in $A$, one of which we hope to extend to a length-$3$ line.

To begin with, though, we will need to be able to apply probabilistic DHJ($2$), i.e.\ Lemma~\ref{lem:p-sperner}, to our $A$. It's not immediately clear how to do this: since $\ens{3}^n$ is supported on nondegenerate strings, we might have $\ens{3}(A) \geq \delta$ and yet $\ens{2}^n(A) = 0$. To evade this we can embed $\ens{2}$ into $\ens{3}^n$ on $n^{1/4}$ random coordinates into $\ens{3}^n$, as in Section~\ref{sec:outline-dist}. This will let us pass to a subspace on which $\ens{2}(A) \gtrapprox \delta$. But this is not enough; if $\ens{3}(A)$ becomes tiny, we'll have no hope of extending a length-$2$ line in $A$ into a length-$3$ line.\noteryan{note explained satisfactorily}

Looking at the argument in Section~\ref{sec:outline-dist}, we see that for a random restriction $(J, x_\barJ)$ to some $n^{1/4}$ coordinates $J$, the expected $\ens{2}$-density of $A$ is essentially $\delta$ and so is the expected $\ens{3}$-density. What we would like is for both densities to be $\gtrapprox \delta$ \emph{simultaneously}. This can be arranged with a density increment trick. The idea is to look at the \emph{variance} of the $\ens{3}$-density of $A$ under the random restriction. There are two cases. First, suppose this variance is somewhat \emph{high}, say at least $\delta^{C}$ for some large constant $C$. Then it must happen that there is some restriction wherein the $\ens{3}$-density of $A$ increases a little above its mean, to at least $\delta + \delta^{C-1}$. In this case, we have a density increment which is almost good enough for Theorem~\ref{thm:incr3} (we can repeat the present argument some $\poly(1/\delta)$ more times to get the full density increment claimed). Otherwise, the variance is quite \emph{low}. This means that for ``almost every restriction $(J, x_\barJ)$, the $\ens{3}$-density of $A$ is very close to $\delta$. Since we know that the expected $\ens{2}$-density of $A$ is $\delta$, it follows that for a noticeable fraction of restrictions the $\ens{2}$-density is $\gtrapprox \delta$. Thus we can find a restriction with both $\ens{2}(A) \gtrapprox \delta$ and $\ens{3}(A) \approx \delta$, as desired. \noteryan{not explained very well}.

Having arranged for both $\ens{2}(A), \ens{3}(A) \gtrapprox \delta$, we now come to a key idea in the proof. Suppose we pick a random $z \sim \ens{3}^n$. On one hand, we can think of $z$ as a \emph{point}, which will be in $A$ with probability at least $\delta$. On the other hand, we can think of $z$ as a \emph{line template over $[2]^n$} with wildcard symbol `$3$', and the probabilistic DHJ($2$) tells us that the associated length-$2$ line is in $A$ with probability at least $\delta^2$. \emph{If both events happen, we have a length-$3$ line in $A$:} $\{\chg{z}{3}{1}, \chg{z}{3}{2}, z\}$.

To rephrase this, let $L \subseteq [3]^n$ be the set of line templates over $[2]^n$ for which the associated line is in $A$. If $L \cap A \neq \emptyset$ then we have found the desired line in $A$. Unfortunately, all we know is that $\ens{3}(L) \geq \delta^2$ and $\ens{3}(A) \geq \delta$, and there is plenty of room for a density-$\delta^2$ set and a density-$\delta$ set to not intersect.

However, if $A$ does not intersect $L$ then we have that \[ \frac{\ens{3}(A)}{\ens{3}(\overline{L})} \geq \frac{\delta}{1-\delta^2} \geq \delta + \delta^3, \] where $\overline{L}$ denotes $[3]^n \setminus L$. In other words, $A$ has a ``relative density increment on the set $\overline{L}$. Unfortunately, $\overline{L}$ is not a combinatorial subspace of $[3]^n$, so we have not established the density increment Theorem~\ref{thm:incr3}. Still, as we will see, $\overline{L}$ is not a completely arbitrary subset; it has just enough structure that we will be able to convert $A$'s density increment on it to a density increment on a subspace.

\subsection{\texorpdfstring{$ab$}{ab}-insensitive sets} \label{sec:outline-insens} Recall that \[ L = \{z \in [3]^n : \chg{z}{3}{1} \in A,\ \chg{z}{3}{2} \in A\}, \] and hence \[ \overline{L} = B_1 \cup B_2, \quad \text{ where } B_1 = \{z \in [3]^n : \chg{z}{3}{1} \not \in A\},\ B_2 = \{z \in [3]^n : \chg{z}{3}{2} \not \in A\}. \] Consider, e.g., $B_1$. It has an important structural property: it is ``$13$-insensitive.

\begin{definition} Let $a, b \in \Omega$ be distinct symbols. We say that $C \subseteq \Omega^n$ is \emph{$ab$-insensitive} if the following condition holds: $x \in C$ iff $\chg{x}{a}{b} \in C$. \end{definition} \begin{remark} This definition is symmetric in $a$ and $b$. It is perhaps easier to understand the condition as follows: ``altering some $a$'s to $b$'s and some $b$'s to $a$'s does not affect presence/absence in $C$. \end{remark} \begin{remark} Suppose $C$ and $C'$ are $ab$-insensitive subsets of $\Omega^n$. Then so too are $\overline{C}$ and $C \cap C'$. Further, if $\sigma$ is a $d$-dimensional subspace of $\Omega^n$, and we view $C \cap \sigma$ as a subset of $\sigma \cong \Omega^d$, then $C \cap \sigma$ is $ab$-insensitive. \end{remark}


It is clear from the definition that $B_1$ is a $13$-insensitive set and $B_2$ is a $23$-insensitive set. Let us temporarily pretend that there is no $B_1$ and we simply have $\overline{L} = B_2$. (This simplification will ultimately prove easy to patch up.) Thus we have a density increment of roughly $\delta^3$ for $A$ on the $23$-insensitive set $B_2$; our last task is to convert this to an equally good density increment on a combinatorial subspace of dimension at least $\log^{(7)} n$. It turns out to be convenient to carry out this task under the uniform distribution, rather than under the equal-slices distribution; we can arrange for this by another distribution-embedding argument.

Sets which are $ab$-insensitive are very useful, for the following reason: for such sets, we can ``boost DHJ($k$) to DHJ($k+1$), and similarly for the multidimensional version. Suppose the $23$-insensitive set $B_2$ has $\unif_3(B_2) \geq \eta$. By ``conditioning on the location of the $3$'s, it is easy to find a restriction under which $\unif_2(B_2) \geq \eta$ as well. Hence we can apply the multidimensional DHJ($2$) theorem find a ``length-$2$ $d$-dimensional combinatorial subspace $\sigma$ contained in $B_2$. This $\sigma$ will only be an isomorphic copy of $\{1,2\}^d$, not $\{1,2,3\}^d$. But the key point is that $B_2$ is $23$-insensitive; hence $\chg{z}{2}{3} \in B_2$ for all $z \in \sigma$, and we conclude that the full ``length-$3$ $d$-dimensional subspace (copy of $\{1,2,3\}^d$) is contained in $B_2$ as well.


\subsection{Completing the proof: partitioning \texorpdfstring{$ab$}{ab}-insensitive sets} \label{sec:outline-partition} We now come to perhaps the key step in the proof: We show that a $23$-insensitive set can be almost completely partitioned into a disjoint union $\calS$ of $d$-dimensional combinatorial subspaces, where $d \approx \log^{(7)} n$. Thus if $A$ has a density increment on the $23$-insensitive set $B_2$, it must have an equally good density increment on one of the $d$-dimensional subspaces in $\calS$.

Let us explain this slightly more carefully. Suppose we have some $23$-insensitive set $B \subseteq [3]^n$ and we wish to partition it into a collection of disjoint $d$-dimensional subspaces $\calS$, along with an ``error set $E$ satisfying $\unif_3(E) \leq \delta^3/100$, say. If $\unif_3(B) \leq \delta^3/100$ already, we are done. Otherwise, using the density and $23$-insensitivity of $B$, we use the multidimensional DHJ($2$) to find a $d$-dimensional subspace contained in $B$. (We will describe later why we may take $d \approx \log^{(7)} n$.)

We can do even better than simply finding a single $d$-dimensional subspace in $B$. By an argument similar to Proposition~\ref{prop:mdhj-from-dhj}, we can ensure that there is a set of coordinates $J \subseteq [n]$ with $\abs{J} = m = \mdhj{2}{\delta^3/100}{d}$ and a particular $d$-dimensional subspace $\sigma \subseteq [3]^J$ such that $\sigma \times \{y\} \subseteq B$ for at least some $\tau = \tau(\delta, d) > 0$ fraction of all strings $y \in [3]^{\barJ}$. These sets $\sigma \times \{y\}$ are disjoint $d$-dimensional subspaces that we may remove from $B$ and place into the collection $\calS$.

We now wish to iterate this argument, but there is a catch; having removed the sets $\sigma \times \{y\}$ from $B$, it is likely no longer $23$-insensitive. Fortunately, its $23$-insensitivity is only spoiled on the $m$ coordinates $J$: \begin{definition} Let $a, b \in \Omega$ be distinct symbols. We say that $C \subseteq \Omega^n$ is \emph{$ab$-insensitive on $I \subseteq [n]$} if the following condition holds: given a string $x$, altering some $a$'s to $b$'s and some $b$'s to $a$'s \emph{within coordinates $I$} does not affect $x$'s presence/absence in $C$. If $I$ is unspecified we take it to be all of~$[n]$. \end{definition} We can thus remove a collection of disjoint $d$-dimensional subspaces from $B$ of probability mass at least $\tau(\delta,d) 3^{-m}$, at the expense of making $B$ only $23$-insensitive on coordinates $[n] \setminus J$. We now iterate the argument. Since $\abs{J}$ depends only on $\delta$ and $d$, we may assume that $n \gg \abs{J}$. Thus $B$ still has plenty of $23$-insensitive coordinates, and we can find another $d$-dimensional subspace contained in $B$, localized to some new $m$ coordinates $J_2$. This lets us remove another collection of $d$-dimensional subspaces from $B$ of mass at least $\tau(\delta,d) 3^{-m}$. Since this quantity depends only on $\delta$ and $d$, we need only iterate some $T(\delta, d)$ many times before $B$'s total density drops below $\delta^3/100$. And we may always continue so long as we still have $m$ coordinates on which $B$ is $23$-insensitive; we ensure this by assuming $n \geq T(\delta, d)m$. Using the Gunderson-R\"{o}dl-Sidirenko theorem, roughly speaking $m = \mdhj{2}{\delta^3/100}{d}$ doubly-exponential in $d$ and so $\tau(\delta,d) 3^{-m}$ is triply-exponential in $d$. In fact, we need to iterate this argument once more, leading to a six-fold exponential, because $\overline{L}$ was not just the single $23$-insensitive set $B_2$ but was rather $B_1 \cup B_2$. In the end, we are able to safely take $d \approx \log^{(7)} n$, as needed to prove Theorem~\ref{thm:incr3}.\noteryan{Ran out of steam here; exposition quality got low}