% Polymath1 Wiki, user contributions feed for 67.186.58.92 (retrieved 2019-10-20)
% Outro.tex (2009-06-25)
\section{Discussion}

\noteryan{Perhaps comparison with other proofs, discussion of how one might improve the argument, or at least discussion of how to improve the quantitative bounds of the present argument\dots}

% Theproof.tex (2009-06-25)
\section{Completing the proof}

In this section, we show how to deduce DHJ($k$) from DHJ($k-1$). We treat $k = 3$ separately, since we have better bounds in this case due to the Gunderson--R\"{o}dl--Sidorenko theorem (and the probabilistic Sperner theorem).

\subsection{The \texorpdfstring{$k = 3$}{k=3} case}
As described in Section~\ref{sec:outline-incr}, to establish DHJ($3$) it suffices to establish the density increment theorem stated therein. (One also needs the rigorous reduction from DHJ($3$) to the equal-slices version, Proposition~\ref{prop:edhj-equiv-dhj}.) For the reader's convenience, we restate the density increment theorem:
\begin{named}{Theorem~\ref{thm:incr3}} Assume $n \geq 2 \uparrow^{(7)} (1/\delta)$. If $\ens{3}^n(A) \geq \delta$, then either $A$ contains a combinatorial line, or we may pass to a subspace of dimension at least $\log^{(7)} n$ on which $\ens{3}(A) \geq \delta + \Omega(\delta^3)$.
\end{named}
\begin{proof}
Throughout the proof we may assume $\delta \leq 2/3$, as otherwise DHJ($3$) is easy. By Proposition~\ref{prop:degen}, we have $\eqs{3}^n(A) \geq \delta - 6/n \geq \delta - \delta^{50}$, by our assumption on $n$.

Let $\theta = 2\delta^{50}$. (We will not be especially economical when it comes to parameters which do not matter.) Applying Lemma~\ref{lem:32}\noteryan{not bothering to point out that the hypothesis holds}, we pass to a subspace of dimension at least $(\delta^{50}/7.5) \sqrt{n} \geq n^{1/3}$ on which either $\eqs{3}(A) \geq \delta + \delta^{50}$, or both $\eqs{3}(A) \geq \delta - 4\delta^{13}$ and $\eqs{2}(A) \geq \delta - 4\delta^{25}$. We repeat this step until either the second conclusion obtains or until the step has been repeated $\lfloor 1/\delta^{47}\rfloor$ times. In either case, the dimension $n$ becomes no smaller than $n' = n \uparrow 3 \uparrow (-1/\delta^{47})$. Having done this, we either have $\eqs{3}(A) \geq \delta + \Omega(\delta^3)$ and may stop (as $n' \gg \log^{(7)} n$ by the assumption on $n$), or we have $\eqs{3}(A), \eqs{2}(A) \geq \delta - 4.5\delta^{13}$. \noteryan{$4.5$ is weird, I know, I messed up the constants earlier and was too lazy to readjust them} With another application of Proposition~\ref{prop:degen} we may obtain $\ens{3}(A), \ens{2}(A) \geq \delta - 5 \delta^{13}$, reducing $n'$ to at most, say, $n \uparrow 3 \uparrow (-1/\delta^{48})$.

Next, we apply Theorem~\ref{thm:correlate}, taking $\delta_k = \delta_{k-1} = \delta - 5\delta^{13}$. Note that we have proved the probabilistic DHJ($2$) theorem in Lemma~\ref{lem:pdhj2}. We certainly have $n' \geq \Pdhj{2}{\delta_{k-1}} = \Theta(1/\delta^2)$, and thus we either obtain a combinatorial line in $A$, or a set $B \subseteq [3]^{n'}$ with $\ens{3}(B) \geq \delta - 5 \delta^{13}$ and
\[
\frac{\ens{3}(A \cap B)}{\ens{3}(B)} \geq \delta - 5\delta^{13} + (\delta - 5\delta^{13})(\delta - 5\delta^{13})^2/2 \geq \delta + \delta^3/2 - 12.5\delta^{13} \geq \delta + \delta^3/4,
\]
where we used $\pdhj{2}{\delta_{k-1}} = \delta_{k-1}^2/2$. \noteryan{check these calculations} Also, $B$ may be written as $B_1 \cup B_2$, where $B_a$ is $a3$-insensitive. We henceforth write $A$ instead of $A \cap B$.
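The chain of inequalities above (flagged by the \noteryan{check these calculations} note) can be spot-checked numerically. A minimal Python sketch, not part of the LaTeX source; the exponents and constants are copied from the display above:

```python
# Numeric spot-check of the displayed chain: with d' = delta - 5*delta^13,
#   d' + d'*(d')^2/2  >=  delta + delta^3/4   for all 0 < delta <= 2/3.

def increment_lower_bound(delta: float) -> float:
    dp = delta - 5 * delta**13        # delta_k = delta_{k-1} = delta - 5*delta^13
    return dp + dp * dp**2 / 2        # delta_k + delta_k * pdhj_2(delta_{k-1})

def check_chain(steps: int = 2000) -> bool:
    # sweep delta over (0, 2/3], the range allowed in the proof
    for i in range(1, steps + 1):
        delta = (2/3) * i / steps
        if increment_lower_bound(delta) < delta + delta**3 / 4:
            return False
    return True

assert check_chain()
```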

We now return to the uniform distribution by applying Lemma~\ref{lem:back-to-uniform}. Its $\beta$ satisfies $\beta \geq \delta - 5\delta^{13}$, and its $\delta$ needs (crucially) to be replaced by $\delta + \delta^3/4$. We choose its $\eta = \delta^3/8$. Note that the resulting $r = (\beta \eta/15k)\sqrt{n'}$ indeed satisfies $r \gg 2 \ln n$, by the initial assumption on $n$.\noteryan{I think I checked.} We now pass to a restriction on some $n' \geq r \geq n \uparrow 3 \uparrow (-1/\delta^{49})$ coordinates on which we have, say,
\begin{equation} \label{eqn:3inc}
\unif_3(B) \geq \delta^4/50, \quad \unif_3(A) \geq (\delta + \delta^3/8) \unif_3(B).
\end{equation}
Crucially, since $B$ was the union of a $13$-insensitive set and a $23$-insensitive set before, it remains so in the restriction. We also now apply Lemma~\ref{lem:switch} to partition $B$ into sets $C_1$ and $C_2$, where $C_1$ is $13$-insensitive and $C_2$ is an intersection of $13$- and $23$-insensitive sets.

Nearing the end, we now apply Corollary~\ref{cor:multi-partition} twice; once with $C = C_1$ and once with $C = C_2$. We choose $\eta = \delta^7/4000$ and for simplicity we take $\ell = 2$ both times. We will select the parameter $d$ and check the hypothesis on $n'$ later. We can therefore partition each of $C_1$, $C_2$ into disjoint $d$-dimensional combinatorial subspaces, along with error sets $E_1$, $E_2$. Writing $E = E_1 \cup E_2$ we have
\begin{equation} \label{eqn:3err}
\unif_3(E) \leq 2\eta + 2\eta \leq \frac{\delta^7}{1000} \leq \frac{\delta^3}{20} \unif_3(B).
\end{equation}
We now simply discard the strings in $E$ from $A$ and $B$. Then $B$ is perfectly partitioned into disjoint $d$-dimensional subspaces, and some arithmetic shows that, whereas we had $\unif_3(A)/\unif_3(B) \geq \delta + \delta^3/8$ in~\eqref{eqn:3inc}, we now still have
\[
\frac{\unif_3(A)}{\unif_3(B)} \geq \delta + \delta^3/8 - \frac{\unif_3(E)}{\unif_3(B) - \unif_3(E)},
\]
and the fraction on the right is easily seen to be at most $\delta^3/19$, using~\eqref{eqn:3err}. Thus we have $\unif_3(A)/\unif_3(B) \geq \delta + \delta^3/20$, with $B$ perfectly partitioned into disjoint $d$-dimensional subspaces. It follows that we can pass to at least one $d$-dimensional subspace on which $\unif_3(A) \geq \delta + \delta^3/20$.

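The error-set arithmetic in the preceding paragraph, from~\eqref{eqn:3err} through the final $\delta^3/20$ increment, admits a quick numeric spot-check. A Python sketch (the worst-case value $\unif_3(B) = \delta^4/50$ is taken from~\eqref{eqn:3inc}):

```python
# Numeric spot-check: with eta = delta^7/4000 and the worst case
# unif(B) = delta^4/50 from (3inc), verify
#   4*eta <= (delta^3/20)*unif(B),
#   (delta^3/20)/(1 - delta^3/20) <= delta^3/19,   and
#   delta^3/8 - delta^3/19 >= delta^3/20,          for all 0 < delta <= 2/3.

def error_arithmetic_ok(delta: float) -> bool:
    eta, unif_B = delta**7 / 4000, delta**4 / 50
    cond1 = 4 * eta <= (delta**3 / 20) * unif_B * (1 + 1e-9)  # equality up to roundoff
    cond2 = (delta**3 / 20) / (1 - delta**3 / 20) <= delta**3 / 19
    cond3 = delta**3 / 8 - delta**3 / 19 >= delta**3 / 20
    return cond1 and cond2 and cond3

assert all(error_arithmetic_ok((2/3) * i / 1000) for i in range(1, 1001))
```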
We come now to the final setting of parameters. To execute the partitioning argument, we required that
\begin{equation} \label{eqn:crazy3}
n' \geq \left(d^{3m(d)}\right)^{3m(d^{3m(d)})},
\end{equation}
where
\[
m(d) = 2\mdhj{2}{\delta^7/16000}{d} = 2(10d)^{d2^d}(16000/\delta^7)^{2^d},
\]
using the Gunderson--R\"{o}dl--Sidorenko theorem. We will ensure that
\begin{equation} \label{eqn:sat-me}
d \geq 800/\delta^7,
\end{equation}
so that we may bound $m(d) \leq (20d)^{2d2^d} \leq 2 \uparrow 4 \uparrow d$. Then the right-hand side of~\eqref{eqn:crazy3} is at most $2 \uparrow^{(6)} (3d)$, and with $n' \geq n \uparrow 3 \uparrow (-1/\delta^{49})$ we get that~\eqref{eqn:crazy3} is satisfied so long as $d \ll \log^{(6)} n$, using also the initial assumption on $n$.\noteryan{Getting pretty out of control at this point.} Thus we set, say,
\[
d \approx (\log^{(7)} n)^{100}
\]
and we've satisfied~\eqref{eqn:sat-me} using the initial assumption $n \geq 2 \uparrow^{(7)} (1/\delta)$.
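The bound $m(d) \leq (20d)^{2d2^d}$ under~\eqref{eqn:sat-me} can be spot-checked on a log scale. A Python sketch, not part of the LaTeX source; to avoid forming the astronomically large integer $2^d$ we bound the negligible term $\ln(2)/2^d$ by $\ln 2$:

```python
import math

# Log-scale sanity check of m(d) = 2*(10d)^(d*2^d) * (16000/delta^7)^(2^d)
# <= (20d)^(2*d*2^d) when d >= 800/delta^7 (condition (sat-me)).  Dividing the
# logarithms by 2^d and bounding ln(2)/2^d <= ln(2), it suffices that
#   ln(2) + d*ln(10d) + ln(16000/delta^7) <= 2*d*ln(20d).

def m_bound_holds(delta: float) -> bool:
    d = math.ceil(800 / delta**7)            # smallest d permitted by (sat-me)
    lhs = math.log(2) + d * math.log(10*d) + math.log(16000 / delta**7)
    rhs = 2 * d * math.log(20*d)
    return lhs <= rhs

assert all(m_bound_holds(i / 100) for i in range(1, 67))  # delta in (0, 2/3]
```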

It remains to note that, having passed to a $(\log^{(7)} n)^{100}$-dimensional subspace on which $\unif_3(A) \geq \delta + \delta^3/20$, we can pass to a further subspace of dimension at least the claimed $\log^{(7)} n$ on which $\unif_3(A) \geq \delta + \delta^3/40$ via the embedding argument sketched in Section~\ref{sec:outline-dist}, justified by Lemma~\ref{lem:distributions}.
\end{proof}

\subsection{The general \texorpdfstring{$k$}{k} case}

For general $k \geq 4$ we will need to use the inductive deduction of the multidimensional DHJ($k-1$) theorem from the DHJ($k-1$) theorem, given in Proposition~\ref{prop:mdhj-from-dhj}. Because the overall argument then becomes a double induction\noteryan{I think}, the resulting bounds for DHJ($k$) are of Ackermann type\noteryan{I think}, and we do not work them out precisely.\noteryan{cop-out}

\begin{theorem} The general $k$ case of the density Hales--Jewett theorem follows from the $k-1$ case.
\end{theorem}
\begin{proof} (Sketch.) Having proved DHJ($k-1$), we deduce the multidimensional version via Proposition~\ref{prop:mdhj-from-dhj}, the equal-slices version via Proposition~\ref{prop:edhj-equiv-dhj}, and the probabilistic version from Proposition~\ref{prop:pdhj-from-edhj} (only the $d = 1$ case is necessary). The overall goal for proving DHJ($k$) is, as in the $k = 3$ case, to show that for $n$ sufficiently large as a function of $\delta$, we can either find a line in $A$ or can obtain a density increment of $\Omega(\eps)$ on a combinatorial subspace of dimension $d'$, where $\eps = \delta \cdot \pdhj{k-1}{\delta/2}$ depends only on $\delta$ and $k$, and where $d' = \omega_n(1)$.

Beginning with $\ens{k}(A) \geq \delta$, we choose $\theta = 2\eps^C$ for some large constant $C$; note that $\theta$ only depends on $\delta$ and $k$. We pass from $\ens{k}(A)$ to $\eqs{k}(A)$ via Proposition~\ref{prop:degen}, incurring density loss only $\eps^c$ by assuming $n$ large enough ($n$ becomes slightly smaller here). We next apply Lemma~\ref{lem:32}. This gives us either a density increment of $\eps^c$, or achieves $\eqs{k}(A), \eqs{k-1}(A) \geq \delta - \eps^c$ for some smaller but still large constant $c$. We can repeat this step $1/\eps^c$ times, eventually achieving a density increment of $\Omega(\eps)$ and stopping if the first outcome keeps happening; otherwise achieving $\eqs{k}(A), \eqs{k-1}(A) \geq \delta - O(\eps^c)$. This brings $n$ down to some fractional power $n'$, but the fraction depends only on $\eps^c$ and hence on $\delta$ and $k$; thus we may continue to assume it is ``large enough''. \noteryan{guess I'm getting pretty sketchy here} This lets us ensure that $\ens{k}(A), \ens{k-1}(A) \geq \delta - O(\eps^c) \geq \delta/2$.

We now apply Theorem~\ref{thm:correlate} with $\delta_k = \delta_{k-1} = \delta/2$. This requires $n'$ to be larger than $\Pdhj{k-1}{\delta/2}$, which will be a very large but still ``constant'' quantity, depending only on $\delta$ and $k$. Having applied the theorem, we either get a combinatorial line, or a density increment of $\delta_k \cdot \pdhj{k-1}{\delta/2} = \Omega(\eps)$ on some set $B$. Note that $\Omega(\eps) \gg O(\eps^c)$, so we are able to obtain the desired relative density increment. The set $B$ has $\ens{k}(B) \geq \delta/2$ and is a union of $ak$-insensitive sets, as in the theorem statement. We can preserve this relative density increment under the uniform distribution by passing to another subspace according to Lemma~\ref{lem:back-to-uniform} with $\eta = \eps^C$. The dimension $n'$ continues to decrease to some $r$, but this is still $\omega_n(1)$ as we have only involved ``constant'' factors depending on $\delta$ and $k$. As in the DHJ($3$) proof, $B$ still has the special structure as a union of $ak$-insensitive sets. We again apply Lemma~\ref{lem:switch} to write it as a disjoint union of $k-1$ sets, each of which is an intersection of up to $k-1$ many $ak$-insensitive sets.

We now apply Corollary~\ref{cor:multi-partition} to each of these sets, taking $\eta = \eps^C$ again and $\ell = k$ each time for simplicity. It suffices now to analyze what parameter $d$ we may choose; the remainder of the proof is essentially the same as in the DHJ($3$) case. At this point, the original $n$ has shrunk to some $n'$ which is some fractional power of $n$ depending on $\delta$ and $k$ only, and this $n'$ is required to be large enough compared to $\pdhj{k-1}{\delta/2}$, a very large function of $\delta$ and $k$ only. The function $m(d)$ arising in the corollary depends on $\eta$ and hence on $\delta$ and $k$, and also on $d$; the dependence of $f(d) = d^{3m(d)}$ on $d$ will be quite bad due to the inductive argument Proposition~\ref{prop:mdhj-from-dhj}. Still, when we iterate it $\ell = k$ times, the function $f^{(k)}(d)$ is just another (huge) function of $k$, $\delta$, and $d$. We will then have some demand on how large $n'$ needs to be as a function of $d$. Crucially, we may take $d$ to be some function $\omega_n(1)$ which satisfies this demand.

Finally, we need to take our uniform-distribution density increment of $\Omega(\eps)$ on a $d$-dimensional subspace, and produce an equal-slices density increment of $\Omega(\eps)$. This will require reducing $d$ further to some $d'$, a function of $d$ and $\eps$ (hence $\delta$ and $k$), but this $d'$ will still be $\omega_n(1)$. \noteryan{this is all poorly explained and probably wrongly explained in several places :) luckily it's right overall :)}
\end{proof}

% Partitioning.tex (2009-06-25)
\section{Partitioning \texorpdfstring{$ab$}{ab}-insensitive sets}

The last tool we need is the result described in Section~\ref{sec:outline-partition}; given the multidimensional DHJ($k-1$) theorem, we show that $ab$-insensitive subsets of $[k]^n$ can be almost completely partitioned into disjoint $d$-dimensional subspaces.

\begin{theorem} \label{thm:partition} Let $k \geq 3$, $d \geq 1$, $0 < \eta < 1/2$. Let $C \subseteq [k]^n$ be $ab$-insensitive on $I$, and assume
\[
\abs{I} \geq m \bigl(k(k+d)\bigr)^m \ln(1/\eta),
\]
where
\[
m = 2\mdhj{k-1}{\eta/4}{d}.
\]
Then $C$ can be partitioned into a collection $\calS$ of disjoint $d$-dimensional subspaces, along with an error set $E$ satisfying $\unif_k^{\ot n}(E) \leq \eta$.
\end{theorem}
\begin{proof}
\noteryan{Maybe change all the $\unif_k^{\otimes n}$'s just to $\unif$'s, for typographical simplicity.}
The proof proceeds in ``rounds'', $t = 1, 2, 3, \dots$. In each round, we remove some disjoint subspaces from $C$ and put them into $\calS$; and, the set $I$ shrinks by $m$ coordinates. We will show that after the $t$th round,
\begin{equation} \label{eqn:C-shrinks}
\unif_k^{\otimes n}(C) \leq \left(1 - \bigl(k(k+d)\bigr)^{-m}\right)^t.
\end{equation}
We continue the process until $\unif_k^{\otimes n}(C) \leq \eta$, at which point we may stop and set $E = C$. Because of~\eqref{eqn:C-shrinks}, the process stops after at most $T = \bigl(k(k+d)\bigr)^m \ln(1/\eta)$ rounds. Since we insisted $\abs{I} \geq mT$ initially, the set $I$ never ``runs out of coordinates''.

Suppose we are about to begin the $t$th round; hence, writing $\alpha = \unif_k^{\otimes n}(C)$ we have
\[
\eta < \alpha \leq \left(1 - \bigl(k(k+d)\bigr)^{-m}\right)^{t-1}.
\]
The round begins by choosing an arbitrary $J \subseteq I$ with $\abs{J} = m$. We have
\[
\alpha = \Pr_{x \sim \unif_k^{\otimes n}}[x \in C] = \Ex_{y \sim \unif_k^{\otimes \barJ}}[\unif_k^{\otimes J}(C_y)],
\]
where we have written $C_y = \{z \in [k]^J : (y,z) \in C\}$. Hence $\unif_k^{\otimes J}(C_y) \geq \alpha/2$ for at least an $\alpha/2$ probability mass of $y$'s (under $\unif_k^{\otimes \barJ}$); call these $y$'s ``good''. Since $J \subseteq I$ and $C$ is $ab$-insensitive on $I$, it follows that each $C_y$ is $ab$-insensitive on $J$. Since $\abs{J} = m = 2\mdhj{k-1}{(\eta/2)/2}{d} \geq 2\mdhj{k-1}{(\alpha/2)/2}{d}$, it follows from Proposition~\ref{prop:mdhj-insens} that for each good $y$ there must exist a $d$-dimensional subspace $\rho \subseteq C_y$. Since the number of $d$-dimensional subspaces in $[k]^m$ is at most $(k+d)^m$, there must exist a \emph{fixed} $d$-dimensional subspace $\rho_0 \subseteq [k]^J$ such that
\begin{equation} \label{eqn:C-mass-decr}
\Pr_{y \sim \unif_k^{\otimes \barJ}}[\rho_0 \subseteq C_y] \geq \frac{\alpha}{2(k+d)^m}.
\end{equation}
Let $R \subseteq [k]^{\barJ}$ be the set of $y$'s with $\rho_0 \subseteq C_y$. Since $C$ is $ab$-insensitive on $I$, it is easy to see\noteryan{Honest.} that $R$ is $ab$-insensitive on $I \setminus J$. Thus $R \times \rho_0$ is $ab$-insensitive on $I \setminus J$; hence so too is $C \setminus (R \times \rho_0)$. We therefore complete the round by setting $I = I \setminus J$ and transferring $R \times \rho_0$ (a disjoint union of subspaces $\{y\} \times \rho_0$) from $C$ into $\calS$. This shrinks the number of coordinates in $I$ by $m$, as promised. And since we can crudely bound $\unif_k^{\otimes J}(\rho_0) = k^{d-m} \geq 2k^{-m}$, we conclude
\[
\unif_k^{\otimes n}(R \times \rho_0) \geq \frac{\alpha}{\bigl(k(k+d)\bigr)^m},
\]
as required to establish~\eqref{eqn:C-shrinks} inductively.
\end{proof}
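The stopping-time accounting above can be illustrated numerically: with shrinkage factor $1 - p$ per round, where $p = (k(k+d))^{-m}$, after $T = \lceil p^{-1}\ln(1/\eta)\rceil$ rounds the density is below $\eta$ since $(1-p)^T \leq e^{-pT} \leq \eta$. A Python sketch with illustrative small parameters (the true $m$ from the theorem is far larger):

```python
import math

# Sanity check of the round count: p = (k(k+d))^(-m) is the per-round shrinkage,
# and T = ceil(p^(-1) * ln(1/eta)) rounds bring the density below eta.

def rounds_needed(k: int, d: int, m: int, eta: float) -> int:
    p = (k * (k + d)) ** (-m)
    return math.ceil(math.log(1 / eta) / p)

def decay_after(k: int, d: int, m: int, eta: float) -> float:
    # density bound (1-p)^T after T rounds, starting from density 1
    p = (k * (k + d)) ** (-m)
    return (1 - p) ** rounds_needed(k, d, m, eta)

assert decay_after(3, 2, 3, 0.01) <= 0.01
```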

We conclude this section by simplifying parameters slightly, then deducing Theorem~\ref{thm:partition} for intersections of $ab$-insensitive sets.

\begin{corollary} \label{cor:partition} Let $d \geq k \geq 3$, $0 < \eta < 1/2$, and define $m$ as in Theorem~\ref{thm:partition}. If $C \subseteq [k]^n$ is $ab$-insensitive and $n \geq d^{3m}$, then the conclusion of Theorem~\ref{thm:partition} holds.
\end{corollary}
\begin{proof}
We only need check that $d^{3m} \geq m \bigl(k(k+d)\bigr)^m \ln(1/\eta)$. We use the bounds $k \leq d$ and $m \geq 4/\eta$ (which is certainly necessary), and hence must show $d^{3m} \geq m \ln(m/4) (2d^2)^m$. This holds for every $d \geq 3$ and $m > 4$.\noteryan{I checked}
\end{proof}
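The claim flagged \noteryan{I checked} can be re-checked by machine on a log scale; a Python sketch sweeping small values of $d$ and $m$ (the inequality only gets easier as they grow):

```python
import math

# Log-scale check of d^(3m) >= m * ln(m/4) * (2*d^2)^m for d >= 3, m > 4:
#   3*m*ln(d) >= ln(m) + ln(ln(m/4)) + m*ln(2*d*d).

def corollary_ineq_holds(d: int, m: int) -> bool:
    lhs = 3 * m * math.log(d)
    rhs = math.log(m) + math.log(math.log(m / 4)) + m * math.log(2 * d * d)
    return lhs >= rhs

assert all(corollary_ineq_holds(d, m)
           for d in range(3, 50) for m in range(5, 2000))
```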

\begin{corollary} \label{cor:multi-partition} Let $d, k, \eta, m$ be as in Corollary~\ref{cor:partition}, and write $f(d) = d^{3m(d)}$, treating $m$ as a function of $d$, with $k$ and $\eta$ fixed. Let $C \subseteq [k]^n$ be expressible as $C = C_1 \cap \cdots \cap C_\ell$, where each $C_s$ is $a_sb_s$-insensitive. If $n \geq f^{(\ell)}(d)$\noteryan{that's $f$ iterated $\ell$ times} then the conclusion of Theorem~\ref{thm:partition} holds with error bound $\ell \eta$.
\end{corollary}
\begin{proof}
The proof is an induction on $\ell$, with Corollary~\ref{cor:partition} serving as the base case. In the general case, since $n \geq f^{(\ell-1)}(f(d))$, by induction we can partition $C' = (C_1 \cap \cdots \cap C_{\ell-1})$ into a collection $\calS'$ of disjoint nondegenerate $f(d)$-dimensional subspaces, along with an error set $E'$ satisfying $\unif_k^{\otimes n}(E') \leq (\ell-1)\eta$. For each $\sigma \in \calS'$, define $D_\sigma = C_\ell \cap \sigma$. If we identify $\sigma$ with $[k]^{f(d)}$ then we may think of $D_\sigma$ as an $a_\ell b_\ell$-insensitive subset of $[k]^{f(d)}$\noteryan{justify?}. Thus we may apply Corollary~\ref{cor:partition} and partition $D_\sigma$ into a collection $\calS_\sigma$ of nondegenerate $d$-dimensional subspaces, along with an error set $E_\sigma$ satisfying ``$\unif_k^\sigma(E_\sigma)$'' $\leq \eta$.\noteryan{bad invented notation} Note that in the original space $[k]^n$ we have $\unif_k^{\otimes n}(E_\sigma) \leq \eta \cdot \unif_k^{\otimes n}(\sigma)$.\noteryan{$ = \eta \cdot k^{f(d)-n}$. There's a slight subtlety here.} Hence we may complete the induction by taking $\calS$ to be $\bigcup \{\calS_\sigma : \sigma \in \calS'\}$\noteryan{Point out that a $d$-dim subspace of a subspace is a $d$-dim subspace?} and $E$ to be $E' \cup \bigcup \{E_\sigma : \sigma \in \calS'\}$, observing that $\unif_k^{\otimes n}(E) \leq (\ell-1)\eta + \eta (\sum_{\sigma \in \calS'} \unif_k^{\otimes n}(\sigma)) \leq \ell \eta$ as required.
\end{proof}

% Correlation.tex (2009-06-25)
\section{Line-free sets correlate with insensitive set intersections}
In this section we make rigorous the argument described in Section~\ref{sec:outline-corr}; we show that a dense subset $A$ either contains a combinatorial line, or it has slight correlation with a union of $ab$-insensitive sets.

\noteryan{I think it would be good to go through the paper and soften explicit constants to $O(\cdot)$'s, for readability.}

First we reduce to the case where the set is dense under both $\eqs{k}$ and $\eqs{k-1}$.

\begin{lemma} \label{lem:32} Let $k \geq 3$, suppose $A \subseteq [k]^n$ satisfies $\eqs{k}(A) \geq \delta$, and let $\theta$ satisfy
\[
10 k \frac{\ln n}{\sqrt{n}} \leq \theta \leq \delta.
\]
Then there exists a restriction $(J, x_\barJ)$ with $\abs{J} \geq (\theta/5k) \sqrt{n}$ such that one of the following two conditions holds:
\begin{enumerate}
\item \label{eqn:lem32_1} $\eqs{k}^J(A_{x_\barJ}) \geq \delta + \theta$; or,
\item \label{eqn:lem32_2} both $\eqs{k}^J(A_{x_\barJ}) \geq \delta - 3\theta^{1/4}\delta^{1/2}$ and $\eqs{k-1}^J(A_{x_\barJ}) \geq \delta - 3\theta^{1/2}$.
\end{enumerate}
\end{lemma}
\begin{proof}
Let $r = \lceil (\theta/5k) \sqrt{n} \rceil$. As in Lemma~\ref{lem:distributions}, let $J$ be an $[r,4r]$-random subset of $[n]$ and let $x$ be drawn from $\eqs{k}^{\barJ}$. If $\Ex_{J, x}[\eqs{k}(A_{x})^2] \geq (\delta + \theta)^2$, then there must exist some restriction $(J,x)$ with $\abs{J} \geq r$ and $\eqs{k}^J(A_{x}) \geq \delta + \theta$. In this case, conclusion~\ref{eqn:lem32_1} above holds. We henceforth assume
\begin{equation} \label{eqn:moment}
\Ex_{J, x}[\eqs{k}(A_{x})^2] < (\delta + \theta)^2
\end{equation}
and show that conclusion~\ref{eqn:lem32_2} holds.

\ignore{Let $y$ be drawn from $\eqs{k}^{J}$, and let $z$ be drawn from $\eqs{k-1}^{J}$.} Since $\eqs{k}(A) \geq \delta$, two applications of Lemma~\ref{lem:distributions}.\ref{eqn:distrs-eqs-eqs} yield
\begin{align}
\Ex_{J, x}[\eqs{k}(A_{x})] & \geq \delta - 4kr/\sqrt{n} \geq \delta - \theta \label{eqn:mean3} \\
\Ex_{J, x}[\eqs{k-1}(A_{x})] & \geq \delta - 4kr/\sqrt{n} \geq \delta - \theta, \label{eqn:mean2}
\end{align}
where we've used our definition of $r$ (and the hypothesis $\theta \geq (10 k \ln n)/\sqrt{n}$ ensures that $r \geq 2 \ln n$, as required for the lemma). Combining~\eqref{eqn:moment}, \eqref{eqn:mean3} gives
\[
\Varx_{J,x}[\eqs{k}(A_{x})] < (\delta + \theta)^2 - (\delta - \theta)^2 = 4 \theta \delta.
\]
Thus Chebyshev's inequality implies that except with probability at most $\theta^{1/2}$ over the choice of $(J,x)$, we have that $\eqs{k}(A_x)$ is within $\theta^{-1/4} \cdot (4 \theta \delta)^{1/2} = 2 \theta^{1/4} \delta^{1/2}$ of its expectation. When this happens,
\begin{equation} \label{eqn:conc2a}
\eqs{k}(A_x) \geq \delta - \theta - 2 \theta^{1/4} \delta^{1/2} \geq \delta - 3 \theta^{1/4} \delta^{1/2}
\end{equation}
(using $\theta \leq \delta$). On the other hand, from~\eqref{eqn:mean2} we immediately deduce that with probability at least $2\theta^{1/2}$ over the choice of $(J,x)$ we have
\begin{equation} \label{eqn:conc2b}
\eqs{k-1}(A_{x}) \geq \delta - \theta - 2 \theta^{1/2} \geq \delta - 3\theta^{1/2}.
\end{equation}
Thus with probability at least $2\theta^{1/2} - \theta^{1/2} > 0$ over the choice of $(J,x)$, we have both~\eqref{eqn:conc2a}, \eqref{eqn:conc2b}, establishing conclusion~\ref{eqn:lem32_2} of the lemma.
\end{proof}
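The Chebyshev step can be spot-checked numerically; the following Python sketch just re-evaluates the three quantities appearing in the argument:

```python
import math

# Sanity check of the Chebyshev step: if Var[X] < (delta+theta)^2 - (delta-theta)^2
# = 4*theta*delta, then deviations of t*sigma with t = theta^(-1/4) occur with
# probability <= 1/t^2 = theta^(1/2), and t*sigma <= 2*theta^(1/4)*delta^(1/2).

def chebyshev_numbers(delta: float, theta: float):
    var_bound = (delta + theta)**2 - (delta - theta)**2   # = 4*theta*delta
    t = theta**(-0.25)
    deviation = t * math.sqrt(var_bound)                  # = 2*theta^(1/4)*sqrt(delta)
    fail_prob = 1 / t**2                                  # = sqrt(theta)
    return var_bound, deviation, fail_prob

v, dev, p = chebyshev_numbers(0.1, 0.01)
assert abs(v - 4 * 0.01 * 0.1) < 1e-12
assert abs(dev - 2 * 0.01**0.25 * math.sqrt(0.1)) < 1e-12
assert abs(p - math.sqrt(0.01)) < 1e-12
```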

We now come to the main theorem in this section, deducing a key step in the proof of DHJ($k$) from the probabilistic DHJ($k-1$) theorem.
\begin{theorem} \label{thm:correlate} Let $k \geq 3$ and suppose we have established the probabilistic DHJ($k-1$) theorem. Further suppose $A \subseteq [k]^n$ has $\ens{k}(A) \geq \delta_k$ and $\ens{k-1}(A) \geq \delta_{k-1}$. Finally, assume $n \geq \Pdhj{k-1}{\delta_{k-1}}$. Then either $A$ contains a combinatorial line (of length $k$), or there exists $B \subseteq [k]^n$ with $\ens{k}(B) \geq \delta_k$ and
\[
\frac{\ens{k}(A \cap B)}{\ens{k}(B)} \geq \delta_k + \delta_k \cdot \pdhj{k-1}{\delta_{k-1}}
\]
such that
\[
B = B_{1} \cup B_2 \cup \cdots \cup B_{k-1},
\]
where for each $a \in [k-1]$, $B_a$ is $ak$-insensitive.
\end{theorem}
\begin{proof}
For $a \in [k-1]$, let $L_a = \{x \in [k]^n : \chg{x}{k}{a} \in A\}$, an $ak$-insensitive set. Let $L = L_1 \cap L_2 \cap \cdots \cap L_{k-1}$. In other words, $L$ is the set of line templates over $[k-1]^n$ (possibly degenerate) whose corresponding lines are entirely in $A$. Let $L' \subseteq L$ be the nondegenerate such line templates; i.e., $L' = \{x \in L : \exists j \in [n] \text{ s.t.\ } x_j = k\}$. Note that $\ens{k}(L \setminus L') = 0$.

The key observation is that if $A \cap L' \neq \emptyset$, we have a (nondegenerate) combinatorial line of length~$k$ entirely in $A$. So assume otherwise; hence $A \subseteq \overline{L'} = [k]^n \setminus L'$. Since $\ens{k-1}(A) \geq \delta_{k-1}$ and $n \geq \Pdhj{k-1}{\delta_{k-1}}$, the probabilistic DHJ($k-1$) theorem implies that $\ens{k}(L') = \ens{k}(L) \geq \pdhj{k-1}{\delta_{k-1}}$. It follows that
\[
\frac{\ens{k}(A \cap \overline{L})}{\ens{k}(\overline{L})} = \frac{\ens{k}(A \cap \overline{L'})}{\ens{k}(\overline{L'})} = \frac{\ens{k}(A)}{\ens{k}(\overline{L'})} \geq \frac{\delta_k}{1 - \pdhj{k-1}{\delta_{k-1}}} \geq \delta_k + \delta_k \cdot \pdhj{k-1}{\delta_{k-1}}.
\]
The proof is completed by taking $B = \overline{L}$; this has $\ens{k}(B) = \ens{k}(\overline{L'}) \geq \delta_k$ because $\overline{L'} \supseteq A$.
\end{proof}
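The final inequality $\delta_k/(1-p) \geq \delta_k + \delta_k p$ (with $p = \pdhj{k-1}{\delta_{k-1}}$) follows from the geometric series $1/(1-p) = 1 + p + p^2 + \cdots \geq 1 + p$; a quick numeric sweep in Python:

```python
# Sanity check: delta / (1 - p) >= delta + delta*p for any density 0 < delta <= 1
# and any line-probability 0 < p < 1, since 1/(1-p) >= 1 + p.

def correlate_ineq_holds(delta: float, p: float) -> bool:
    return delta / (1 - p) >= delta + delta * p

assert all(correlate_ineq_holds(d / 100, p / 100)
           for d in range(1, 101) for p in range(1, 100))
```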

We will prefer to obtain this relative density increment under the uniform distribution, rather than under equal-nondegenerate-slices:
\begin{lemma} \label{lem:back-to-uniform}
Suppose $A \subseteq B \subseteq [k]^n$ satisfy $\ens{k}(B) = \beta$ and $\ens{k}(A) \geq \delta \cdot \ens{k}(B)$. Let $0 < \eta < 3\delta$ be a parameter, define $r = (\beta \eta /15k) \sqrt{n}$, and assume $2\ln n \leq r \leq n/2$. Then there exists a restriction $(J,x_{\barJ})$ with $\abs{J} \geq r$ under which $\unif_k(B_{x_\barJ}) \geq (\eta/3) \beta$ and $\unif_k(A_{x_\barJ}) \geq (\delta - \eta) \unif_k(B_{x_\barJ})$.\noteryan{And if $B$ was a union of $ij$-insensitive sets before\dots}
\end{lemma}
\begin{proof}
Let $J$ be an $[r,4r]$-random subset of $[n]$, let $x \sim \eqs{k}^{\barJ}$, and let $y \sim \unif_k^J$. From Lemma~\ref{lem:distributions}, the distribution on $(x,y)$ is $(4kr/\sqrt{n})$-close to $\eqs{k}^n$. Further, from Proposition~\ref{prop:degen} the total variation distance between $\eqs{k}^n$ and $\ens{k}^n$ is at most $k(k-1)/n \leq kr/\sqrt{n}$\noteryan{uses $k \leq \sqrt{n}$ or so}. By choice of $r$, the combined distance bound $5kr/\sqrt{n}$ is at most $(\eta/3)\beta$. We therefore have
\begin{equation} \label{eqn:AxBx}
\Ex_{J,x}[\unif_k(A_x)] \geq \delta \beta - (\eta/3)\beta, \qquad
\Ex_{J,x}[\unif_k(B_x)] \leq \beta + (\eta/3)\beta.
\end{equation}

Let $H$ be the event that $\unif_k(B_x) \geq (\eta/3)\beta$. On one hand, since $A_x \subseteq B_x$ always, we have
\[
\Ex_{J,x}[\unif_k(A_x)] \leq \Ex_{J,x}[\unif_k(A_x) \mid H] \Pr[H] + (\eta/3)\beta.
\]
On the other hand, we have
\[
\Ex_{J,x}[\unif_k(B_x)] \geq \Ex_{J,x}[\unif_k(B_x) \mid H] \Pr[H].
\]
Combining these two deductions with~\eqref{eqn:AxBx} yields
\[
\Ex_{J,x}[\unif_k(B_x) \mid H] \Pr[H] \leq \beta(1+\eta/3), \qquad (\delta - 2\eta/3)\beta \leq \Ex_{J,x}[\unif_k(A_x) \mid H] \Pr[H].
\]
Thus
\[
\frac{\delta - 2\eta/3}{1+\eta/3} \cdot \Ex_{J,x}[\unif_k(B_x) \mid H] \Pr[H] \leq \Ex_{J,x}[\unif_k(A_x) \mid H] \Pr[H],
\]
whence
\[
0 \leq \Ex_{J,x}[\unif_k(A_x) - (\delta - \eta)\unif_k(B_x) \mid H],
\]
where we used $(\delta - 2\eta/3)/(1+\eta/3) \geq \delta - \eta$. Thus there exists some $(J,x)$ where $H$ occurs and also $\unif_k(A_x) \geq (\delta - \eta) \unif_k(B_x)$, completing the proof.
\end{proof}
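The elementary inequality $(\delta - 2\eta/3)/(1+\eta/3) \geq \delta - \eta$ used at the end rearranges to $(\eta/3)(1 - \delta + \eta) \geq 0$, so $\delta \leq 1$ suffices; a numeric sweep over the allowed range $0 < \eta < 3\delta$:

```python
# Sanity check: (delta - 2*eta/3) / (1 + eta/3) >= delta - eta whenever
# 0 < eta < 3*delta and delta <= 1 (rearranges to (eta/3)*(1 - delta + eta) >= 0).

def back_to_uniform_ineq(delta: float, eta: float) -> bool:
    return (delta - 2*eta/3) / (1 + eta/3) >= delta - eta

assert all(back_to_uniform_ineq(d / 50, e / 100)
           for d in range(1, 51) for e in range(1, 100)
           if e / 100 < 3 * (d / 50))
```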

Finally, the set $B$ we get from Theorem~\ref{thm:correlate} is a union of insensitive sets; we would prefer an intersection:
\begin{lemma} \label{lem:switch} Suppose $B = B_1 \cup B_2 \cup \cdots \cup B_{k-1} \subseteq [k]^n$, where $B_a$ is $ak$-insensitive. Then $B$ is a disjoint union of $k-1$ sets $C_1, \dots, C_{k-1}$, where $C_b$ is an intersection of $ak$-insensitive sets, $a = 1 \dots b$.
\end{lemma}
\begin{proof}
Take $C_b = B_b \setminus (B_1 \cup \cdots \cup B_{b-1}) = B_b \cap \overline{B_1} \cap \cdots \cap \overline{B_{b-1}}$, noting that the complement of an $ak$-insensitive set is again $ak$-insensitive.
\end{proof}

% Easy.tex (2009-06-25)
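The disjointification in Lemma~\ref{lem:switch} can be illustrated concretely; a Python sketch with small hypothetical finite sets standing in for the $B_a$:

```python
# Illustration of the switch lemma: C_b = B_b minus (B_1 union ... union B_{b-1})
# yields pairwise-disjoint sets whose union recovers the union of the B_a.
# The sets below are hypothetical, purely for illustration.

def switch_partition(Bs):
    Cs, seen = [], set()
    for B in Bs:
        Cs.append(B - seen)   # C_b = B_b minus everything already covered
        seen |= B
    return Cs

Bs = [{1, 2, 3}, {2, 4}, {3, 4, 5}]
Cs = switch_partition(Bs)
assert Cs == [{1, 2, 3}, {4}, {5}]
# disjoint, and their union recovers the union of the B_a
assert sum(len(C) for C in Cs) == len(set().union(*Bs))
```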
\section{Easy reductions} \label{sec:easy}

In this section we collect some easy reductions between various DHJ problems for use in our main proof.

\begin{proposition} \label{prop:edhj-equiv-dhj} For a given $k \geq 3$, the DHJ theorem is equivalent to the equal-slices DHJ theorem.
\end{proposition}
\begin{proof}
The fact that DHJ follows from equal-slices DHJ was sketched in Section~\ref{sec:outline-dist}; we give the full proof here, using Lemma~\ref{lem:distributions}.\ref{eqn:distrs-prd-eqs} and Proposition~\ref{prop:degen}. The other direction has a very similar proof, using Lemma~\ref{lem:distributions}.\ref{eqn:distrs-eqs-prd}, and we omit it.

Suppose then that we have proven equal-slices DHJ($k$). Let $\unif_k^{\ot n}(A) \geq \delta$. We use Lemma~\ref{lem:distributions}, taking $r = n^{1/4}$, letting $J$ be an $[r, 4r]$-random subset of $[n]$, $x \sim \unif_k^{\otimes \barJ}$, $y \sim \eqs{k}^J$. (Note that $2\ln n \leq r \leq n/2$ provided $n$ is at least a small universal constant.) According to the lemma, the resulting distribution on the composite string $(x,y)$ is $\tau$-close to $\unif_k^{\ot n}$, where
\[
\tau = (2\sqrt{k - 1} + 2) \cdot r/\sqrt{n} \leq 2k\cdot r/\sqrt{n} = 2k/n^{1/4}.
\]
This is at most $\delta/3$ provided that $n \geq (6k/\delta)^4$. In that case we conclude
\[
\Ex_{J,x}[\eqs{k}^J(A_x)] \geq \delta - \delta/3
\]
and so there must exist a restriction $(J,x)$ with $\abs{J} \geq r$ and $\eqs{k}^J(A_x) \geq 2\delta/3$. By Proposition~\ref{prop:degen} the distributions $\eqs{k}^J$ and $\ens{k}^J$ have total variation distance at most $k(k-1)/r \leq k^2/n^{1/4}$, and this quantity is also at most $\delta/3$ provided $n \geq (3k(k-1)/\delta)^4$. In that case $\ens{k}^J(A_x) \geq 2\delta/3 - \delta/3 = \delta/3$. Finally, if $r \geq \edhj{k}{\delta/3}$ then we may use equal-slices DHJ($k$) to find a combinatorial line in $A_x$, and hence in $A$. This holds provided $n \geq \edhj{k}{\delta/3}^4$. Presuming that $\edhj{k}{\delta/3} \geq 3k(k-1)/\delta$, the last requirement on $n$ is strictest. Hence we may take $\dhj{k}{\delta} = \edhj{k}{\delta/3}^4$.\noteryan{not the strongest of course, but it certainly doesn't matter for our bounds}
\end{proof}
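The two thresholds on $n$ in the preceding proof can be spot-checked numerically; a Python sketch (a small multiplicative tolerance absorbs floating-point roundoff, since both bounds are tight at the threshold):

```python
# Sanity check of the two n-thresholds: tau = 2k/n^(1/4) <= delta/3 iff
# n >= (6k/delta)^4, and k(k-1)/n^(1/4) <= delta/3 iff n >= (3k(k-1)/delta)^4.

def thresholds_ok(k: int, delta: float) -> bool:
    n1 = (6 * k / delta) ** 4
    n2 = (3 * k * (k - 1) / delta) ** 4
    tau_ok = 2 * k / n1**0.25 <= (delta / 3) * (1 + 1e-12)
    degen_ok = k * (k - 1) / n2**0.25 <= (delta / 3) * (1 + 1e-12)
    return tau_ok and degen_ok

assert all(thresholds_ok(k, d / 10) for k in range(3, 10) for d in range(1, 10))
```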
<br />
The deduction of probabilistic multidimensional DHJ from the equal-slices version uses the handy property Lemma~\ref{lem:ens-compose}; we remark that we only need the $d = 1$ case for our overall proof:<br />
\ignore{\begin{theorem} \label{thm:dhj-dv} Let $m = n_{\ref{thm:dhj-d-ens}}(k,d,\delta/2,\ens{k})$.\noteryan{I.e., $m$ large enough that every set with $\ens{k}^m$ at least $\delta/2$ has a nondegen $d$-dim comb.\ subspace.} Then for all $n \geq n_{\ref{thm:dhj-dv}}(k,d,\delta) := m$ and $A \subseteq [k]^n$, <br />
\[<br />
\ens{k}^n(A) \geq \delta \quad\Rightarrow\quad \Pr_{\sigma \sim \ens{k+d}^n}[\sigma \subseteq A \text{ as a $d$-dimensional subspace}] \geq \eta_{\ref{thm:dhj-dv}}(k,d,\delta) := \frac{\delta}{2(k+d)^{2m}}.<br />
\]<br />
\end{theorem}}<br />
\begin{proposition} \label{prop:pdhj-from-edhj} For a given $k$, the probabilistic multidimensional DHJ theorem follows from the equal-slices multidimensional DHJ theorem.
\end{proposition}<br />
\begin{proof}<br />
We will show that one can take $\Pmdhj{k}{\delta}{d} = \emdhj{k}{\delta/2}{d}$. Let us write $m$ for this number. Let $x \sim \ens{m}^n$ and $y \sim \ens{k}^m$.\noteryan{Surely $m \geq k$.} By Lemma~\ref{lem:ens-compose}, $y \circ x \sim \ens{k}^n$. Hence if $A \subseteq [k]^n$ satisfies $\ens{k}^n(A) \geq \delta$, we get that $\ens{k}^m(A_x) \geq \delta/2$ with probability at least $\delta/2$ over $x$, where $A_x$ denotes $\{z \in [k]^m : z \circ x \in A\}$. Call such $x$'s ``good''. By our choice of $m$, for all good~$x$ we have that $A_x$ contains a $d$-dimensional subspace. Thus by Proposition~\ref{prop:min-eqs}, a randomly drawn subspace $w \sim \ens{k+d}^m$ coincides with this particular subspace, and hence lies in $A_x$, with probability at least $(k+d)^{-2m}$. We conclude that with probability at least $\pmdhj{k}{\delta}{d} = (\delta/2)(k+d)^{-2m}$ over the choice of $x$ and $w$, the $d$-dimensional subspace $w \circ x$ is in $A$. This completes the proof, since $w \circ x$ is distributed as $\ens{k+d}^n$ by Lemma~\ref{lem:ens-compose}.
\end{proof}<br />
<br />
As we saw in Section~\ref{sec:prob-sperner}, there is an elementary proof of probabilistic DHJ($2$). For completeness, we observe here that Lemma~\ref{lem:p-sperner} and Proposition~\ref{prop:degen} immediately\noteryan{pretty immediately} yield the following:<br />
\begin{lemma} \label{lem:pdhj2} One may take $\Pdhj{2}{\delta} = 12/\delta^2$ and $\pdhj{2}{\delta} = \delta^2/2$.<br />
\end{lemma}<br />
<br />
<br />
\ignore{<br />
\noteryan{The following \emph{could} be done under equal-slices, using Lemma~\ref{lem:distributions}.\ref{eqn:distrs-eqs-eqs}, if necessary.}<br />
\begin{theorem} \label{thm:dhj-d-prod} Let $d \in \N$, $0 < \delta < 1$, and let $\pi$ be a distribution on $[k]$. Let $m_1 = n_{\ref{thm:dhj-d-prod}}(k,1,\delta/2,\pi)$\noteryan{Really, just a bound for DHJ(k) under $\pi^{\otimes n}$.}, $\eta = \delta/(2(k+1)^{m_1})$, and $m_d = n_{\ref{thm:dhj-d-prod}}(k,d,\eta,\pi)$. Then for all $n \geq n_{\ref{thm:dhj-d-prod}}(k,d+1,\delta,\pi) := m_1 + m_d$ and $A \subseteq [k]^n$,
\[<br />
\pi^{\otimes n}(A) \geq \delta \quad\Rightarrow\quad A \text{ contains a nondegenerate $(d+1)$-dimensional subspace.}<br />
\]<br />
\end{theorem}<br />
\begin{proof}<br />
Choose $J \subseteq [n]$ with $\abs{J} = m_1$ arbitrarily, and let $\barJ = [n] \setminus J$, so $|\barJ| \geq m_d$ by assumption. Since $\pi^{\otimes n}(A) \geq \delta$, with probability at least $\delta/2$ over $y \sim \pi^{\otimes \barJ}$ we have $\pi^{\otimes J}(A_y) \geq \delta/2$. Call such $y$'s ``good''. Since $|J| \geq n_{\ref{thm:dhj-d-prod}}(k,1,\delta/2,\pi)$, we conclude that whenever $y$ is good, $A_y \subseteq [k]^J$ contains a nondegenerate line. Since there are at most $(k+1)^{m_1}$ such lines in $[k]^J$, we conclude there must be a particular line $\lambda$ over $[k]^J$ such that $\lambda \subseteq A_y$ for at least an $\eta$ fraction of $y \sim \pi^{\otimes \barJ}$. Write $L$ for the set of such $y$'s. Since $|\barJ| \geq n_{\ref{thm:dhj-d-prod}}(k,d,\eta,\pi)$, there must exist a nondegenerate $d$-dimensional subspace $\sigma$ over $[k]^{\barJ}$ such that $\sigma \subseteq L$. Now $\lambda \times \sigma$ is a nondegenerate $(d+1)$-dimensional subspace contained in $A$. <br />
\end{proof}<br />
}<br />
<br />
The following observation was sketched in Section~\ref{sec:outline-insens}:<br />
<br />
\begin{proposition} \label{prop:mdhj-insens} The multidimensional DHJ($k$) theorem for $ab$-insensitive sets follows from the multidimensional DHJ($k-1$) theorem. More precisely, let $k \geq 3$, $d \geq 1$, and assume we have established the multidimensional DHJ($k-1$) theorem. Suppose $B \subseteq [k]^n$ is $ab$-insensitive on $I \subseteq [n]$ and has $\unif_k^{\otimes n}(B) \geq \delta$. Then $B$ contains a $d$-dimensional subspace, provided $\abs{I} \geq 2\mdhj{k-1}{\delta/2}{d}$.<br />
\end{proposition}<br />
\begin{proof}<br />
We may assume without loss of generality that $b = k$, by renaming symbols. We may also assume $I = [n]$ without loss of generality; this is because we can always pass to a restriction $(I, x_{\overline{I}})$ on which $B$ has uniform density at least $\delta$. So from now on we assume $B$ is an $ak$-insensitive subset of $[k]^n$, with $n \geq 2\mdhj{k-1}{\delta/2}{d}.$<br />
<br />
Think of conditioning a draw from $\unif_k^{\otimes n}$ on the locations of the symbol $k$. In other words, draw $z \sim \unif_k^{\otimes n}$ as follows: first choose a $(1-1/k)$-random subset $J \subseteq [n]$, form the restriction $(J,x_\barJ)$ where $x_\barJ$ is the all-$k$'s substring, and finally draw $y \sim \unif_{k-1}^{\otimes J}$ and take $z = (x_\barJ,y)$. We have $\unif_k(B) = \Ex_J[\unif_{k-1}(B_{x_\barJ})] \geq \delta$ (where we may think of $B_{x_\barJ} \subseteq [k-1]^J$). We also have $\Ex[\abs{J}] = (1-1/k)n$, and $\Pr[\abs{J} < n/2] \leq \exp(-n/48)$ by a standard Chernoff bound\noteryan{bother citing?}. Thus there must exist a particular restriction $(J,x_\barJ)$ where $\abs{J} \geq n/2 \geq \mdhj{k-1}{\delta/2}{d}$ and $\unif_{k-1}^{\otimes J}(B_{x_\barJ}) \geq \delta - \exp(-n/48) \geq \delta/2$.\noteryan{This uses some extremely mild assumption about $\mdhj{k-1}{\delta/2}{d}$.} By the multidimensional DHJ($k-1$) theorem, $B_{x_\barJ}$ contains a nondegenerate $d$-dimensional subspace of $[k-1]^J$. Let us write the template for this subspace as $\sigma \in ([k-1] \cup \{k+1, \dots, k+d\})^J$, meaning that $\chg{\sigma}{(k+t)}{\ell} \in B_{x_\barJ}$ for each $t \in [d]$ and $\ell \in [k-1]$.
<br />
We can extend $\sigma$ from coordinates $J$ to all of $[n]$ in the natural way, filling in $k$'s, and the resulting template has $\chg{\sigma}{(k+t)}{\ell} \in B$ for all $\ell \in [k-1]$. But $B$ is $ak$-insensitive by assumption, so each string $\chg{\sigma}{(k+t)}{k}$ is in $B$ because $\chg{\sigma}{(k+t)}{a}$ is. Thus $\sigma$ is a nondegenerate $d$-dimensional subspace template in the usual sense with all its strings in $B$.\noteryan{not elegantly explained}
\end{proof}
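The conditioning step in the last proof can be verified exactly for toy parameters. The sketch below (our own illustration; `process_prob` is a hypothetical helper) checks that the two-stage draw, choosing a $(1-1/k)$-random set $J$, filling $\barJ$ with $k$'s, and filling $J$ with uniform symbols from $[k-1]$, reproduces the uniform distribution on $[k]^n$:

```python
from fractions import Fraction
from itertools import product

def process_prob(z, k):
    # J is forced to equal the set of coordinates where z avoids the symbol k
    m = sum(1 for c in z if c != k)              # |J|
    n = len(z)
    pJ = Fraction(k - 1, k) ** m * Fraction(1, k) ** (n - m)
    py = Fraction(1, k - 1) ** m                 # uniform y on [k-1]^J must match z on J
    return pJ * py

k, n = 3, 4
for z in product(range(1, k + 1), repeat=n):
    assert process_prob(z, k) == Fraction(1, k ** n)
```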
\section{Passing between probability measures} \label{sec:measures}
<br />
The goal of this section is to work out bounds for the error arising when passing back and forth between $\unif_k$ and $\ens{k}$, as described in Section~\ref{sec:outline-dist}. Lemma~\ref{lem:distributions} below gives the bounds we need. The reader will not lose much by just reading its statement; the proof is just technical calculations.<br />
<br />
Before stating Lemma~\ref{lem:distributions} we need some definitions.<br />
<br />
\ignore{<br />
\begin{definition} Given a set $A \subseteq [k]^n$ and a restriction $(J,x_\barJ)$, we write $A_{x_\barJ}$ for the subset of $[k]^{J}$ defined by $A_{x_\barJ} = \{y \in [k]^J : (x_{\barJ}, y_J) \in A\}$.<br />
\end{definition}}<br />
<br />
\begin{definition} \label{def:r4r} For $0 \leq \ppn \leq 1$, we say that $J$ is a \emph{$\ppn$-random subset} of $[n]$ if $J$ is formed by including each coordinate $i \in [n]$ independently with probability $\ppn$. Assuming $r \leq n/2$, we say that $J$ is an \emph{$[r,4r]$-random subset} of $[n]$ if $J$ is a $\ppn$-random subset of $[n]$ conditioned on $r \leq \abs{J} \leq 4r$, with $\ppn = 2r/n$.<br />
\end{definition}<br />
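In code, an $[r,4r]$-random subset is most naturally drawn by rejection sampling; the following sketch (our own illustration, not from the text) mirrors Definition~\ref{def:r4r}:

```python
import random

def r4r_subset(n, r, rng):
    # [r,4r]-random subset: p-random with p = 2r/n, conditioned on r <= |J| <= 4r
    assert r <= n / 2
    p = 2 * r / n
    while True:
        J = {i for i in range(n) if rng.random() < p}
        if r <= len(J) <= 4 * r:
            return J

rng = random.Random(0)
J = r4r_subset(100, 10, rng)
assert 10 <= len(J) <= 40
```

Since $\abs{J}$ concentrates around its mean $2r$, the rejection loop terminates quickly (Fact~\ref{fact:dev} below bounds the rejection probability).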
\begin{definition} A \emph{distribution family} $(\distra^m)_{m \in \N}$ (over $[k]$) is a sequence of probability distributions, where $\distra^m$ is a distribution on $[k]^m$. In this paper the families we consider will either be the equal-(nondegenerate-)slices family $\distra^m = \ens{k}^m$ or $\distra^m = \eqs{k}^m$, or will be the product distributions based on a single distribution $\prd$ on $[k]$, $\distra^m = \prd^{\otimes m}$.<br />
\end{definition}<br />
<br />
<br />
\begin{lemma} \label{lem:distributions} Let $(\distra^m)$ and $(\distrb^m)$ be distribution families. Assume $2 \ln n \leq r \leq n/2$. Let $J$ be an $[r,4r]$-random subset of $[n]$, let $x$ be drawn from $[k]^{\barJ}$ according to $\distra^{\abs{\barJ}}$, and let $y$ be drawn from $[k]^J$ according to $\distrb^{\abs{J}}$. The resulting distribution on the composite string $(x,y) \in [k]^n$ has total variation distance from $\distra^n$ which can be bounded as follows:<br />
\begin{enumerate}<br />
\item (Product to equal-slices.) \label{eqn:distrs-prd-eqs} If $\distra^m = \prd^{\otimes m}$ and $\distrb^m = \eqs{\ell}^m$ for $\ell \leq k$, the bound is \noteryan{You know, we only need this result for the uniform distribution, in which case we can bound the below by the simpler $2k \cdot r/\sqrt{n}$.}<br />
\[<br />
(2{\textstyle \sqrt{\frac{1}{\min(\prd)}-1}}+2) \cdot r / \sqrt{n}.
\]<br />
\item (Equal-slices to product.) \label{eqn:distrs-eqs-prd} If $\distra^m = \eqs{k}^m$ and $\distrb^m = \prd^{\otimes m}$, the bound is $4k \cdot r/\sqrt{n}$, independent of $\prd$.<br />
\item (Equal-slices to equal-slices.) \label{eqn:distrs-eqs-eqs} If $\distra^m = \eqs{k}^m$ and $\distrb^m = \eqs{\ell}^m$ for $\ell \leq k$, the bound is $4k \cdot r/\sqrt{n}$.<br />
\end{enumerate}<br />
\end{lemma}<br />
<br />
Although Lemma~\ref{lem:distributions} involves the equal-slices distribution, one can convert to equal-nondegenerate-slices if desired using Proposition~\ref{prop:degen}.<br />
<br />
<br />
Since $\eqs{k}^n$ is a mixture of product distributions (Proposition~\ref{prop:eqs-mix}), the main work in proving Lemma~\ref{lem:distributions} involves comparing product distributions.<br />
<br />
<br />
\subsection{Comparing product distributions}<br />
\begin{definition} For $\distra$ and $\distrb$ probability distributions on $\Omega^n$, the \emph{$\chi^2$ distance} $\dchi{\distra}{\distrb}$ is defined by
\[<br />
\dchi{\distra}{\distrb} = \sqrt{\Varx_{x \sim \distra}\left[\frac{\distrb[x]}{\distra[x]}\right]}.<br />
\]<br />
Note that $\dchi{\distra}{\distrb}$ is \emph{not} symmetric in $\distra$ and $\distrb$.<br />
\end{definition}<br />
<br />
The $\chi^2$ distance is introduced to help us prove the following fact:<br />
\begin{proposition} \label{prop:mix-distance} Let $\prd$ be a distribution on $\Omega$ with full support; i.e., $\min(\prd) \neq 0$. Suppose $\prd$ is slightly mixed with $\distrb$, forming $\wh{\prd}$; specifically, $\wh{\prd} = (1-\ppn) \prd + \ppn \distrb$. Then the associated product distributions $\prd^{\otimes n}$, $\wh{\prd}^{\otimes n}$ on $\Omega^{n}$ satisfy
\[<br />
\dtv{\prd^{\otimes n}}{\wh{\prd}^{\otimes n}} \leq \dchi{\prd}{\distrb} \cdot \ppn \sqrt{n}.<br />
\]<br />
\end{proposition}<br />
\begin{proof} It is a straightforward consequence of Cauchy-Schwarz (see, e.g.~\cite[p.\ 101]{Rei89})\noteryan{This is the part using $\min(\prd) \neq 0$, by the way.} that <br />
\[<br />
\dtv{\prd^{\otimes n}}{\wh{\prd}^{\otimes n}} \leq \dchi{\prd}{\wh{\prd}} \cdot \sqrt{n},<br />
\]<br />
and the identity $\dchi{\prd}{\wh{\prd}} = \ppn \cdot \dchi{\prd}{\distrb}$ follows easily from the definitions.<br />
\end{proof}<br />
This can be bounded independently of $\distrb$, as follows:<br />
\begin{corollary} \label{cor:mix-distance} In the setting of Proposition~\ref{prop:mix-distance}, <br />
\[<br />
\dtv{\prd^{\otimes n}}{\wh{\prd}^{\otimes n}} \leq \sqrt{{\textstyle \frac{1}{\min(\prd)}} - 1} \cdot \ppn \sqrt{n}.
\]<br />
\end{corollary}<br />
\begin{proof} It is easy to check that the distribution $\distrb$ maximizing $\dchi{\prd}{\distrb}$ is the one putting all its mass on the $x$ minimizing $\prd[x]$. In this case one calculates $\dchi{\prd}{\distrb} = \sqrt{\frac{1}{\min(\prd)} - 1}$.
\end{proof}<br />
<br />
<br />
<br />
<br />
<br />
\subsection{Proof of Lemma~\ref{lem:distributions}}<br />
<br />
<br />
\begin{definition} \label{def:compos-distr} Let $0 \leq \ppn \leq 1$ and let $(\distra^m)$, $(\distrb^m)$ be distribution families. Drawing from the \emph{$(\ppn, \distra, \distrb)$-composite distribution} on $[k]^n$ entails the following: $J$ is taken to be a $\ppn$-random subset of~$[n]$; $x$ is drawn from $[k]^{\barJ}$ according to $\distra^{\abs{\barJ}}$; and, $y$ is drawn from $[k]^J$ according to $\distrb^{\abs{J}}$. We sometimes think of this distribution as just being a distribution on composite strings $z = (x, y) \in [k]^n$.<br />
\end{definition}<br />
<br />
Note that the distribution described in Lemma~\ref{lem:distributions} is very similar to the $(\ppn, \distra, \distrb)$-composite distribution, except that it uses an $[r, 4r]$-random subset rather than a $\ppn$-random subset. We can account for this difference with a standard Chernoff (large-deviation) bound:\noteryan{Citation needed?} <br />
\begin{fact} \label{fact:dev} If $J$ is a $\ppn$-random subset of $[n]$ with $\ppn = 2r/n$ as in Definition~\ref{def:r4r}, then $r \leq \abs{J} \leq 4r$ holds except with probability at most $2\exp(-r/4)$.<br />
\end{fact}<br />
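For intuition, the Chernoff estimate in Fact~\ref{fact:dev} can be compared against the exact binomial tail for small hypothetical parameters (the helper name below is ours, not from the text):

```python
from fractions import Fraction
from math import comb, exp

def tail_prob(n, r):
    # exact Pr[|J| < r or |J| > 4r] for |J| ~ Binomial(n, 2r/n)
    p = Fraction(2 * r, n)
    total = Fraction(0)
    for j in range(n + 1):
        if j < r or j > 4 * r:
            total += comb(n, j) * p ** j * (1 - p) ** (n - j)
    return total

n, r = 100, 10
assert float(tail_prob(n, r)) <= 2 * exp(-r / 4)   # Chernoff bound holds (with room)
```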
<br />
The utility of using $\ppn$-random subsets in Definition~\ref{def:compos-distr} is the following observation:<br />
\begin{fact} If $\prd$ and $\distrb$ are distributions on $[k]$, thought of also as product distribution families, then the $(\ppn, \prd, \distrb)$-composite distribution on $[k]^n$ is precisely the product distribution $\wh{\prd}^{\otimes n}$, where $\wh{\prd}$ is the mixture distribution $(1-\ppn)\prd + \ppn \distrb$ on $[k]$.<br />
\end{fact}<br />
<br />
Because of this, we can use Corollary~\ref{cor:mix-distance} to bound the total variation distance between $\prd^{\otimes n}$ and a composite distribution. We conclude:<br />
\begin{proposition} \label{prop:prod-composite} Let $\prd$ and $\distrb$ be any distributions on $[k]$, thought of also as product distribution families. Writing $\wt{\prd}$ for the $(\ppn,\prd,\distrb)$-composite distribution on strings in $[k]^n$, we have<br />
\[<br />
\dtv{\prd^{\otimes n}}{\wt{\prd}} \leq {\textstyle \sqrt{\frac{1}{\min(\prd)}-1}} \cdot \ppn \sqrt{n}.<br />
\]<br />
\end{proposition}<br />
<br />
Recall that for any $\ell \leq k$, the equal-slices distribution $\eqs{\ell}^{m}$ on $m$ coordinates is a mixture of product distributions $\spac^{\otimes m}$ on $[k]^m$. We can therefore average Proposition~\ref{prop:prod-composite} over $\distrb$ to obtain:<br />
\begin{proposition} \label{prop:prod-eqs} If $\wt{\pi}$ denotes the $(\ppn,\pi,\eqs{\ell})$-composite distribution on strings in $[k]^n$, where $\ell \leq k$, then we have<br />
\[<br />
\dtv{\pi^{\otimes n}}{\wt{\pi}} \leq {\textstyle \sqrt{\frac{1}{\min(\pi)}-1}} \cdot \ppn \sqrt{n}.<br />
\]<br />
\end{proposition}<br />
Here we have used the following basic bound, based on the triangle inequality:<br />
\begin{fact} \label{fact:tv-mix} Let $(\distrb_\kappa)_{\kappa \in K}$ be a family of distributions on $\Omega^n$, let $\varsigma$ be a distribution on $K$, and let $\overline{\distrb}$ denote the associated mixture distribution, given by drawing $\kappa \sim \varsigma$ and then drawing from $\distrb_\kappa$. Then<br />
\[<br />
\dtv{\distra}{\overline{\distrb}} \leq \Ex_{\kappa \sim \varsigma}[\dtv{\distra}{\distrb_\kappa}].<br />
\]<br />
\end{fact}<br />
<br />
If we instead use this fact to average Proposition~\ref{prop:prod-composite} over $\prd$, we can obtain:<br />
\begin{proposition} \label{prop:eqs-prod} Let $\distrb$ be any distribution on $[k]$. Writing $\distra$ for the $(\ppn, \eqs{k}, \distrb)$-composite distribution on strings in $[k]^n$, we have<br />
\[<br />
\dtv{\eqs{k}^n}{\distra} \leq (2k-1)\ppn \sqrt{n}.<br />
\]<br />
\end{proposition}<br />
\begin{proof}<br />
Thinking of $\eqs{k}^m$ as the mixture of product distributions $\spac^{\otimes m}$, where $\spac$ is a random spacing on $[k]$, Fact~\ref{fact:tv-mix} and Proposition~\ref{prop:prod-composite} imply<br />
\[<br />
\dtv{\eqs{k}^n}{\distra} \leq \Ex_{\spac}\left[{\textstyle \sqrt{\frac{1}{\min(\spac)}-1}}\right] \cdot \ppn \sqrt{n}.<br />
\]<br />
We can upper-bound the expectation\noteryan{Undoubtedly someone has worked hard on this $-1/2$th moment of the least spacing before (Devroye '81 or '86 perhaps), but I think it's probably okay to do the following simple thing here} by <br />
\begin{multline*}<br />
\Ex_{\spac}\left[{\textstyle \sqrt{\frac{1}{\min(\spac)}}}\right] \quad=\quad \int_{0}^\infty \Pr_{\spac}\left[{\textstyle \sqrt{\frac{1}{\min(\spac)}}} \geq t\right]\,dt <br />
\quad=\quad \int_{0}^\infty \Pr_{\spac}[\min(\spac) \leq 1/t^2]\,dt \\<br />
\leq\quad k + \int_{k}^\infty \Pr_{\spac}[\min(\spac) \leq 1/t^2]\,dt \quad\leq\quad k + \int_{k}^\infty (k(k-1)/t^2) \,dt \quad=\quad 2k-1,<br />
\end{multline*}<br />
where in the second-to-last step we used Proposition~\ref{prop:rand-min}.<br />
\end{proof}<br />
Averaging now once more in the second component, we obtain the following:<br />
\begin{proposition} \label{prop:eqs-eqs} Let $2 \leq \ell \leq k$ and let $\distra'$ denote the $(\ppn, \eqs{k}, \eqs{\ell})$-composite distribution on strings in $[k]^n$. Then<br />
\[<br />
\dtv{\eqs{k}^n}{\distra'} \leq (2k-1) \ppn \sqrt{n}.<br />
\]<br />
\end{proposition}<br />
<br />
<br />
<br />
<br />
We can now obtain the proof of Lemma~\ref{lem:distributions}:<br />
<br />
\begin{proof} The three statements in Lemma~\ref{lem:distributions} essentially follow from Propositions~\ref{prop:prod-eqs}, \ref{prop:eqs-prod}, and \ref{prop:eqs-eqs}, taking $\ppn = 2r/n$. This would give bounds of $2{\textstyle \sqrt{\frac{1}{\min(\pi)}-1}} \cdot r / \sqrt{n}$, $(4k-2) \cdot r/\sqrt{n}$, and $(4k-2) \cdot r/\sqrt{n}$, respectively. However we need to account for conditioning on $r \leq \abs{J} \leq 4r$. By Fact~\ref{fact:dev}, this conditioning increases the total variation distance by at most $2\exp(-r/4)$. Using the lower bound $r \geq 2 \ln n$ from the lemma's hypothesis, this quantity is at most $2r/\sqrt{n}$, completing the proof.<br />
\end{proof}
\section{Equal-slice distributions} \label{sec:eqs}
<br />
In this section we describe the equal-slices distribution $\eqs{k}^n$ and the equal-nondegenerate-slices distribution $\ens{k}^n$. It will actually be more convenient to begin with the latter. Before proceeding, we briefly introduce some notation.<br />
<br />
\begin{definition} Given a distribution $\pi$ on $\Omega$, we write $\min(\pi) = \min_{x \in \Omega} \pi[x]$.<br />
\end{definition}<br />
<br />
Another important concept we will need is the \emph{distance} between two probability measures:
\begin{definition} For $\distra$ and $\distrb$ probability distributions on $\Omega^n$, the \emph{total variation distance} $\dtv{\distra}{\distrb}$ is defined by <br />
\[<br />
\dtv{\distra}{\distrb} = \frac12 \sum_{x \in \Omega^n} \abs{\distra[x] - \distrb[x]} = \max_{A \subseteq \Omega^n} \abs{\distra(A) - \distrb(A)}.<br />
\]<br />
If $\dtv{\distra}{\distrb} \leq \tau$ we say that $\distra$ and $\distrb$ are \emph{$\tau$-close}.<br />
\end{definition}<br />
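The equality of the two expressions for total variation distance is easy to confirm computationally on a toy example (the two distributions below are hypothetical, chosen only for illustration):

```python
from fractions import Fraction
from itertools import chain, combinations

# two toy distributions on a 4-point space, as dicts of exact probabilities
P = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
Q = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 4)}

# half the l1 distance ...
half_l1 = sum(abs(P[x] - Q[x]) for x in P) / 2
# ... equals the maximum discrepancy over all events A
max_gap = max(
    abs(sum(P[x] for x in A) - sum(Q[x] for x in A))
    for A in chain.from_iterable(combinations(P, s) for s in range(len(P) + 1))
)
assert half_l1 == max_gap == Fraction(1, 4)
```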
<br />
<br />
\subsection{Equal-nondegenerate-slices}<br />
<br />
\begin{definition} Given a vector $\slice \in \N^k$ whose components sum to $n$, the associated \emph{slice} of $[k]^{n}$ is the set <br />
\[<br />
\{x \in [k]^{n} : x \text{ has $\slice_1$ many $1$'s, $\slice_2$ many $2$'s, \dots, $\slice_k$ many $k$'s}\}.<br />
\]<br />
We abuse terminology by also referring to $\slice$ as the slice. A slice is \emph{nondegenerate} when $\slice_a \neq 0$ for all $a \in [k]$. There are $\binom{n+k-1}{k-1}$ different slices, of which $\binom{n-1}{k-1}$ are nondegenerate.\noteryan{Justify? Justify later?}<br />
\end{definition}<br />
<br />
<br />
\begin{definition} Let $2 \leq k \leq n$. The \emph{equal-nondegenerate-slices} distribution on $[k]^n$, denoted $\ens{k}^n$, generates a string $x$ as follows: First, points $q_1, \dots, q_n$ are arranged on a circle according to a uniformly random permutation. This also forms $n$ ``gaps'' between the $q_i$'s. Second, $k$ gaps are chosen, uniformly from the $\binom{n}{k}$ possibilities; points $r_1, \dots, r_k$ are put into the chosen gaps, in random order. \ignore{Second, $k$ points $r_1, \dots, r_k$ are placed one by one into the gaps, each going into a random unoccupied gap.} This divides the $q_i$'s into ``arcs'', with the ``$r_a$ arc'' consisting of those $q_i$'s whose nearest $r$-point counterclockwise is $r_a$. Finally, $x$ is defined by setting $x_i = a$ for all $i$ such that $q_i$ is in the $r_a$ arc. See Figure~\ref{fig:ens} for an example.
\end{definition}<br />
<br />
\begin{figure}[htb]<br />
\begin{center}<br />
\includegraphics[height=7.5cm]{ens-figure.png}<br />
\caption{An example of the two stages in drawing from $\ens{3}^{10}$. The resulting string is $1131232213$.}<br />
\label{fig:ens}<br />
\end{center}<br />
\end{figure}<br />
<br />
<br />
\begin{remark} The distribution is symmetric with respect to the $k$ symbols in $[k]$, and thus we may naturally extend it to any alphabet $\Omega$ with $\abs{\Omega} = k$.<br />
\end{remark} <br />
<br />
\ignore{\begin{remark} More generally, when $\abs{J} \geq k$ we write $\ens{k}^J$ for the analogous distribution on $[k]^{J}$, and we write simply $\ens{k}$ when $n$ or $J$ is clear from context.<br />
\end{remark} }<br />
<br />
\begin{remark} If $x \sim \ens{k}^n$ then $x$ always belongs to a nondegenerate slice.<br />
\end{remark}<br />
<br />
\begin{remark} It is not hard to see\noteryan{Honest, although we can clarify} that drawing $x \sim \ens{k}^n$ is equivalent to first choosing a nondegenerate slice $\slice$ uniformly from the $\binom{n-1}{k-1}$ possibilities, then choosing $x$ uniformly from this slice.<br />
\end{remark}<br />
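This equivalent description translates directly into a sampler: draw a uniform nondegenerate slice as a random composition of $n$ into $k$ positive parts (via $k-1$ cut points among $n-1$ gaps), then shuffle. The sketch below is our own illustration, not from the text:

```python
import random

def ens_sample(k, n, rng):
    # Draw from ens_k^n: uniform nondegenerate slice, then uniform string in it.
    cuts = sorted(rng.sample(range(1, n), k - 1))   # k-1 cut points in [1, n-1]
    parts = [b - a for a, b in zip([0] + cuts, cuts + [n])]  # k positive counts
    x = [a + 1 for a, s in enumerate(parts) for _ in range(s)]
    rng.shuffle(x)                                   # uniform string within the slice
    return x

rng = random.Random(1)
x = ens_sample(3, 10, rng)
assert len(x) == 10 and set(x) == {1, 2, 3}          # every symbol appears
```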
<br />
<br />
The following lemma will be important; it essentially arose already in Lemma~\ref{lem:p-sperner}:<br />
\begin{lemma} \label{lem:last-pt} For $k \geq 2$, let $x \sim \ens{k}^n$, and let $a \sim [k-1]$ be uniformly random. Then $\chg{x}{k}{a}$ is distributed as~$\ens{k-1}^n$.<br />
\end{lemma}<br />
\begin{proof}<br />
Suppose we are drawing $x \sim \ens{k}^n$ and have chosen the arrangement of $q_1, \dots, q_n$ and also the $k$ gaps to be filled with $r$-points. To fill these gaps, we may do the following. First, choose one of the $k$ gaps at random to contain $r_{k}$. Next, take the first gap counterclockwise from $r_{k}$ and let it contain $r_a$, where $a \sim [k-1]$ is uniformly chosen. Finally, fill the remaining $k-2$ gaps according to a uniform ordering of $[k-1] \setminus \{a\}$. But having chosen $x$ in this way, $\chg{x}{k}{a}$ is obtained simply by ``deleting'' $r_{k}$ and interpreting the result as a (correctly distributed) draw from $\ens{k-1}^n$.\noteryan{not put very eloquently}<br />
\end{proof}<br />
<br />
<br />
<br />
<br />
We end our discussion of the equal-nondegenerate-slices distribution with a handy lemma (which is not true of the equal-slices distribution). First, one more piece of notation:<br />
\begin{definition} Let $x \in [m]^n$, $y \in [k]^m$. We write $y \circ x$ for the string $\chg{x}{(1, \dotsc, m)}{(y_1, \dotsc, y_m)} \in [k]^n$. For example, if $n = 7$, $m = 4$, $k = 3$, $x = 4124332$, $y = 3132$, then $y \circ x = 2312331$. This operation is indeed function composition if one thinks of a string in $\Omega^n$ as a map $[n] \to \Omega$.
\end{definition}<br />
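In code, $y \circ x$ is one line, since $(y \circ x)_i = y_{x_i}$; the following sketch (our illustration) checks the worked example from the definition:

```python
def compose(y, x):
    # (y o x)_i = y_{x_i}: substitute each symbol a of x by y_a
    return [y[a - 1] for a in x]

x = [4, 1, 2, 4, 3, 3, 2]   # a string in [4]^7
y = [3, 1, 3, 2]            # a string in [3]^4
assert compose(y, x) == [2, 3, 1, 2, 3, 3, 1]
```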
\begin{lemma} \label{lem:ens-compose} Let $x \sim \ens{m}^n$ and let $y \sim \ens{k}^m$. Then $y \circ x \in [k]^n$ is distributed as $\ens{k}^n$.<br />
\end{lemma}<br />
\begin{proof}<br />
The effect of forming $y \circ x$ can be thought of as follows: First the points $q_1, \dots, q_n$ and $r_1, \dots, r_m$ are randomly arranged as usual to form $x$. Next, we think of the \emph{arcs} of $q_i$'s as being the $m$ objects circularly permuted in the initial stage of drawing $y$; we then choose $k$ out of the $m$ gaps between the arcs at random and place points $s_1, \dots, s_k$ into them in a random order. Then $z = y \circ x$ is given by setting $z_i = j$ for all $i$ such that $q_i$ is in the $s_j$ ``super-arc''. <br />
<br />
But since $\ens{m}^n$ is symmetric under permutations of $[m]$, the middle stage in this composite process---permuting the $r_j$ arcs---is superfluous. Eliminating it, we see that the $s_j$'s are ultimately placed by first choosing $m$ out of $n$ gaps at random, and then further choosing $k$ out of these $m$ choices at random. This is equivalent to simply choosing $k$ out of the $n$ gaps at random; hence $z$ is indeed distributed as $\ens{k}^n$.\noteryan{Again, not put very eloquently}<br />
\end{proof}<br />
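Lemma~\ref{lem:ens-compose} can also be confirmed by exact enumeration for tiny parameters. In the verification sketch below (our own; `ens_pmf` is a hypothetical helper implementing the ``uniform nondegenerate slice, then uniform string'' description), the composed distribution is computed exactly and compared to $\ens{k}^n$:

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def ens_pmf(k, n):
    # exact pmf of ens_k^n: uniform nondegenerate slice, then uniform in slice
    pmf = {}
    for z in product(range(1, k + 1), repeat=n):
        counts = [z.count(a) for a in range(1, k + 1)]
        if 0 in counts:
            continue                        # degenerate: probability zero
        size = factorial(n)
        for c in counts:
            size //= factorial(c)           # number of strings in this slice
        pmf[z] = Fraction(1, comb(n - 1, k - 1) * size)
    return pmf

n, m, k = 5, 3, 2
outer, inner = ens_pmf(m, n), ens_pmf(k, m)   # x ~ ens_m^n, y ~ ens_k^m
composed = {}
for x, px in outer.items():
    for y, py in inner.items():
        z = tuple(y[a - 1] for a in x)        # y o x
        composed[z] = composed.get(z, Fraction(0)) + px * py
assert composed == ens_pmf(k, n)              # the lemma, exactly, for these parameters
```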
<br />
<br />
<br />
\subsection{Equal-slices}<br />
We now define the closely related variant, the equal-slices distribution:<br />
\begin{definition} \label{def:eqs} Let $k \geq 2$ and let $n$ be a positive integer. The \emph{equal-slices} distribution on $[k]^n$, denoted $\eqs{k}^n$, generates a string $x$ in a manner similar to that of equal-nondegenerate-slices. The only distinction is that the points $r_1, \dots, r_k$ are placed into random gaps one-by-one, and once a point $r_a$ is placed, it is considered to split the gap into two occupiable gaps. Thus when placing $r_a$, there are $n+a-1$ gaps to choose from. As a consequence, some of the resulting $r$-arcs may be empty of $q$-points.<br />
\end{definition}<br />
<br />
\begin{remark} It is not hard to see\noteryan{honest} that the process in Definition~\ref{def:eqs} yields a uniformly random (circular) ordering of the $n+k$ points. In particular, the definition of $\eqs{k}^n$ is symmetric with respect to the symbols $[k]$.
\end{remark}<br />
<br />
Let's first show that the two distributions $\ens{k}^n$ and $\eqs{k}^n$ are very close:<br />
\begin{proposition} \label{prop:degen} For $2 \leq k \leq n$ we have $\dtv{\ens{k}^n}{\eqs{k}^n} \leq k(k-1)/n$.<br />
\end{proposition}<br />
\begin{proof} The distribution $\ens{k}^n$ is precisely $\eqs{k}^n$ conditioned on no $r_a$'s ending up adjacent; hence it suffices to show this event occurs with probability at most $k(k-1)/n$. In a draw from $\eqs{k}^n$, the probability that $r_a$ is placed adjacent to one of $r_1, \dots, r_{a-1}$ is at most $2(a-1)/(n+a-1) \leq 2(a-1)/n$. Hence the probability of having any adjacent $r_a$'s is at most $\sum_{a=1}^k 2(a-1)/n = k(k-1)/n$, as desired.<br />
\end{proof}<br />
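Proposition~\ref{prop:degen} can be checked exactly for small parameters, using the fact that both distributions choose a slice uniformly (among all slices for $\eqs{k}^n$, among nondegenerate ones for $\ens{k}^n$) and then a uniform string within it. The helper below is our own verification sketch:

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def tv_ens_eqs(k, n):
    # exact total variation distance between ens_k^n and eqs_k^n, slice by slice
    tv = Fraction(0)
    for s in product(range(n + 1), repeat=k):
        if sum(s) != n:
            continue                         # s is not a slice of [k]^n
        size = factorial(n)
        for c in s:
            size //= factorial(c)            # number of strings in slice s
        eqs = Fraction(1, comb(n + k - 1, k - 1) * size)
        ens = Fraction(1, comb(n - 1, k - 1) * size) if 0 not in s else Fraction(0)
        tv += size * abs(ens - eqs) / 2
    return tv

k, n = 3, 12
assert tv_ens_eqs(k, n) <= Fraction(k * (k - 1), n)   # the proposition's bound
```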
<br />
<br />
It will also be quite useful to observe that equal-slices is a \emph{mixture of product distributions}.<br />
\begin{definition} \label{def:spacings} Given $k$ points $r_1, \dotsc, r_k$ on the circle of unit circumference, dividing it into arcs, the associated \emph{spacing} is the $k$-dimensional vector $\spac$ defined by setting $\spac[a]$ to be the length of the arc emanating clockwise from $r_a$. We identify a spacing $\spac$ with a probability distribution on $[k]$ in the natural way. We say that $\spac$ is a \emph{(uniformly) random spacing} if it is derived by choosing the points $r_1, \dotsc, r_k$ independently and uniformly at random on the circle.
\end{definition}<br />
\begin{remark} It is well known (see, e.g.~\cite[Ch.\ 1]{Rei89})\noteryan{I think the traditional citation here is Pyke '65, but we can save a spot in the biblio by citing Reiss again (which is also a text, whereas Pyke is a bit painful to read)} that a random spacing $\spac$ can equivalently be defined as choosing the vector of probabilities $(\spac[1], \dots, \spac[k])$ uniformly from the simplex $\{(p_1, \dots, p_k) : \sum p_a = 1, p_a \geq 0\ \forall a \in [k]\}$. \noteryan{(Or, by taking the spacings of $k-1$ random points in a unit segment.) Since we don't use this fact, I think it's fair to not prove it.} <br />
\end{remark}<br />
<br />
\begin{proposition} \label{prop:eqs-mix} The equal-slices distribution $\eqs{k}^n$ can equivalently be defined as follows: First, draw a random spacing $\spac$ on $[k]$; then, draw a string from the product distribution $\spac^{\otimes n}$ on $[k]^{n}$. <br />
\end{proposition}<br />
\begin{proof}<br />
In the original definition of a draw $x \sim \eqs{k}^n$, the string $x$ only depends on the joint ordering of the $n+k$ points $q_i$ and $r_a$ around the circle. As we earlier remarked, this ordering is just a uniform (circular) permutation of $n+k$ objects. Hence we can also obtain it by choosing all $n+k$ points independently from the continuous uniform distribution on the circle of unit circumference.\noteryan{Not worth pointing out that we're disregarding some probability-$0$ events here.} Further, in this continuous distribution, we may choose the $r_a$'s first, and then the $q_i$'s next, one-by-one. The choice of $r_a$'s gives a random spacing $\spac$, and then the formation of $x$ by choice of $q_i$'s is indeed equivalent to drawing $x$ from the product distribution $\spac^{\otimes n}$.\noteryan{Not very eloquently put.}<br />
\end{proof}<br />
<br />
\begin{remark} \label{rem:eqs-equiv} \noteryan{proofs not necessary?} Drawing $x \sim \eqs{k}^n$ is equivalent to first choosing a slice $\slice$ uniformly from the $\binom{n+k-1}{k-1}$ possibilities (nondegenerate and degenerate), then choosing $x$ uniformly from this slice. Hence $\ens{k}^n$ is precisely $\eqs{k}^n$ conditioned on the string being nondegenerate. Equivalently, for each $x$ in slice $\slice$ we have <br />
\begin{equation} \label{eqn:eqs-equiv}<br />
\eqs{k}^n[x] = 1 \Bigm/ \binom{n+k-1}{k-1, \slice_1, \dots, \slice_k}.<br />
\end{equation}<br />
\end{remark}<br />
<br />
<br />
<br />
<br />
<br />
Finally, we will occasionally need the following technical bounds:<br />
\begin{proposition} \label{prop:rand-min} If $\spac$ is a random spacing on $[k]$ as in Definition~\ref{def:spacings}, then<br />
\[<br />
\Pr[\min(\spac) \leq \theta] \leq k(k-1) \theta.<br />
\]<br />
\end{proposition}<br />
\begin{proof}<br />
By a union bound and symmetry, $\Pr[\min(\spac) \leq \theta] \leq k \Pr[\spac[1] \leq \theta]$. The event $\spac[1] \leq \theta$ occurs only if one of $r_2, \dots, r_k$ falls within the arc of length $\theta$ clockwise of $r_1$. By another union bound, this has probability at most $(k-1)\theta$.<br />
\end{proof}<br />
\begin{proposition} \label{prop:min-eqs} Assume $2k^2 \leq n$. Then we may crudely bound $\min(\eqs{k}^n) \geq k^{-2n}$. \noteryan{Can actually get something like $k^kn^{-k/2}k^{-n}$ here, but probably best just to get the typographically shortest thing here.}<br />
\end{proposition}<br />
\begin{proof}<br />
Define $\alpha = k(k-1)/n$ and apply the multinomial theorem---<br />
\[<br />
(\alpha + 1 + \cdots + 1)^{n+k-1} = \sum_{t_0 + t_1 + \cdots + t_k = n+k-1} \binom{n+k-1}{t_0, t_1, \dots, t_k} \alpha^{t_0} \geq \binom{n+k-1}{k-1, \slice_1, \dots, \slice_k} \alpha^{k-1}.
\]<br />
Hence we can lower-bound the expression~\eqref{eqn:eqs-equiv} appearing in Remark~\ref{rem:eqs-equiv} by<br />
\[<br />
\frac{\alpha^{k-1}}{(\alpha + k)^{n+k-1}} = \frac{(k-1)^{k-1}}{(1 + \frac{k-1}{n})^{n+k-1} n^{k-1} k^n} \geq \frac{(k-1)^{k-1}}{\exp(k-1 + \frac{(k-1)^2}{n}) n^{k-1} k^n} \geq \frac{(k-1)^{k-1}}{\exp(k) n^{k-1} k^{n}},<br />
\]<br />
where the first step used the definition of $\alpha$, the second used $1+x \leq \exp(x)$, and the last used $(k-1)^2/n \leq 1/2 \leq 1$, by assumption. \noteryan{Here's where we waste a lot of factors.} Now it's not hard to check\noteryan{Honest.} that when $k \geq 2$ and $n \geq 2k^2$, the above quantity is at least $k^{-2n}$.<br />
\end{proof}
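The crude bound of Proposition~\ref{prop:min-eqs} can be checked exactly for the smallest admissible $n = 2k^2$, using the closed form $\eqs{k}^n[x] = 1\big/\bigl(\binom{n+k-1}{k-1}\cdot\abs{\text{slice}}\bigr)$ from the uniform-slice description. The sketch below is our own verification (the helper name is hypothetical):

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial

def min_eqs(k, n):
    # exact min over strings of eqs_k^n[x] = 1 / (C(n+k-1,k-1) * |slice|)
    best = None
    for s in product(range(n + 1), repeat=k):
        if sum(s) != n:
            continue
        size = factorial(n)
        for c in s:
            size //= factorial(c)            # number of strings in slice s
        p = Fraction(1, comb(n + k - 1, k - 1) * size)
        best = p if best is None else min(best, p)
    return best

for k in (2, 3):
    n = 2 * k * k
    assert min_eqs(k, n) >= Fraction(1, k ** (2 * n))   # the claimed crude bound
```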
<hr />
<div>\section{Outline of the proof} \label{sec:outline}<br />
<br />
In this section we sketch our proof of DHJ($3$), which involves reducing it to DHJ($2$). The general deduction of DHJ($k+1$) from DHJ($k$) is not much more difficult.<br />
<br />
\subsection{Passing between distributions, via subspaces} \label{sec:outline-dist}<br />
According to the statement of DHJ($3$), we are given a subset $A \subseteq [3]^n$ with uniform density $\mu_3(A) \geq \delta$, and would like to find a combinatorial line contained in $A$, assuming $n$ is large enough. It is helpful to think of $\delta$ as a fixed explicit constant such as $1\%$ and to think of $n$ as enormous, so that any function $o_n(1)$ is negligible compared with $\delta$.<br />
<br />
As we saw with Sperner's theorem, it is often more natural to think of the DHJ problem under the equal-(nondegenerate-)slices distribution. Our first task therefore is to reduce to proving Theorem~\ref{thm:edhj}, by ``embedding'' the equal-nondegenerate-slices distribution $\ens{3}$ in the uniform distribution $\unif_3^{\otimes n}$. To do this embedding, we define more explicitly the notion of \emph{restrictions}:<br />
<br />
\begin{definition} \label{def:restriction} A \emph{restriction} on $[k]^n$ is a pair $(J, x_\barJ)$, where $J \subseteq [n]$ and $x_\barJ \in [k]^{\barJ}$, where $\barJ = [n] \setminus J$. It is thought of as a ``partial string'', where the $\barJ$-coordinates are ``fixed'' according to $x_\barJ$ and the $J$-coordinates are still ``free''. Given a fixing $y_J \in [k]^J$, we write $(x_\barJ, y_J)$ for the complete composite string in $[k]^n$.\noteryan{Although this notation looks awkward, I believe it's useful.}<br />
\end{definition} <br />
<br />
\begin{definition} Given a set $A \subseteq [k]^n$ and a restriction $(J,x_\barJ)$, we sometimes write $A_{x_\barJ}$ for the subset of $[k]^{J}$ defined by $A_{x_\barJ} = \{y_J \in [k]^J : (x_{\barJ}, y_J) \in A\}$.<br />
\end{definition}<br />
<br />
\begin{remark} As mentioned in Section~\ref{sec:pass-subspace}, a restriction $(J, x_\barJ)$ can be identified with a particular kind of combinatorial subspace. Specifically, it corresponds to the $\abs{J}$-dimensional subspace whose template is $(x_{\barJ}, w_J)$, where $w_J$ consists of $\abs{J}$ distinct wildcard symbols. The set $A_{x_\barJ}$ is renamed ``$A$'' when we ``pass to'' this subspace.<br />
\end{remark}<br />
<br />
Having defined restrictions, let's warm up by seeing how to embed the uniform distribution in the uniform distribution. (This was implicitly done in our deduction of multidimensional DHJ($k$) from DHJ($k$), Proposition~\ref{prop:mdhj-from-dhj}.) Suppose we have $\unif_3^{\ot n}(A) \geq \delta$. Fix an arbitrary subset of the coordinates $J \subset [n]$ with $\abs{J} = r$. Since $\unif_3^{\ot n}$ is a product distribution, if we draw $x_\barJ \sim \unif_3^{\ot \barJ}$ and $y_J \sim \unif_3^{\ot J}$ independently, then the composite string $z = (x_\barJ, y_J)$ is distributed as $\unif_3^{\ot n}$. Hence<br />
\[<br />
\delta \leq \unif_3^{\ot n}(A) = \Pr_z[z \in A] = \Pr_{x_\barJ, y_J}[(x_\barJ, y_J) \in A] = \Ex_{x_\barJ}\Bigl[\Pr_{y_J}[(x_\barJ, y_J) \in A]\Bigr] = \Ex_{x_\barJ}\left[\unif_3^{\otimes J}(A_{x_\barJ})\right].<br />
\]<br />
It follows that there must exist a \emph{particular} substring $x_\barJ$ for which $\unif_3^{\otimes J}(A_{x_\barJ}) \geq \delta$ as well. We can then pass to the subspace defined by $(J,x_\barJ)$, and we have constructed a new instance of the original problem: ``$n$'' has decreased to $r = \abs{J}$, and ``$A$'' (i.e., $A_{x_\barJ}$) still has $\unif_3(A) \geq \delta$.<br />
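The averaging identity above can be confirmed exactly by brute force on a small example. In the following sketch the toy set $A$ and the helper names are ours, not part of the argument; it enumerates all of $[3]^4$ with $\abs{J} = 2$ and checks both the identity and the existence of a good restriction:

```python
from itertools import product
from fractions import Fraction

n = 4
J = (0, 1)                                    # free coordinates
barJ = tuple(i for i in range(n) if i not in J)
A = {x for x in product((1, 2, 3), repeat=n) if sum(x) % 3 == 0}  # toy dense set

def composite(xbar, y):
    # Assemble the full string (x_barJ, y_J).
    z = [0] * n
    for c, v in zip(barJ, xbar):
        z[c] = v
    for c, v in zip(J, y):
        z[c] = v
    return tuple(z)

delta = Fraction(len(A), 3 ** n)              # uniform density of A
slice_dens = []
for xbar in product((1, 2, 3), repeat=len(barJ)):
    Ax = [y for y in product((1, 2, 3), repeat=len(J)) if composite(xbar, y) in A]
    slice_dens.append(Fraction(len(Ax), 3 ** len(J)))

# E_{x_barJ}[ density of A_{x_barJ} ] equals the density of A ...
assert sum(slice_dens) / len(slice_dens) == delta
# ... so some particular restriction does at least as well.
assert max(slice_dens) >= delta
```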
<br />
Of course this hasn't gained us anything, but it illustrates the basic technique of passing to a subspace. We can now explain how to use this technique to pass to a different \emph{distribution}; namely, equal-nondegenerate-slices. Suppose we construct strings $z \in [3]^n$ as follows: we choose $r$ coordinates $J \subset [n]$ \emph{randomly}, we choose the substring $x_\barJ$ according to the uniform distribution $\unif_3^{\ot \barJ}$, and we choose the substring $y_J$ according to equal-nondegenerate-slices $\ens{3}^J$. Write $\wt{\unif}_3^n$ for the resulting distribution on the composite string $z = (x_\barJ, y_J)$. Now the count of each symbol in a draw from $\unif_3^{\ot n}$ tends to fluctuate within a range of size roughly $\sqrt{n}$; thus if $r \ll \sqrt{n}$, we may hope that the ``corrupted distribution'' $\wt{\unif}_3^n$ is still close to the original distribution $\unif_3^n$. In Section~\ref{sec:measures} we do some technical calculations to show that this is indeed the case; essentially, the total variation distance between distributions $\unif_3^{\otimes n}$ and $\wt{\unif}_3^n$ is at most $O(r/\sqrt{n})$. Thus by taking, say, $r \approx n^{1/4}$, we may conclude<br />
\[<br />
\abs{\unif_3^{\otimes n}(A) - \wt{\unif}_3^n(A)} \leq O(1/n^{1/4}) \ll \delta \quad\Rightarrow\quad \wt{\unif}_3^n(A) \gtrapprox \delta.<br />
\]<br />
Hence <br />
\[<br />
\delta \lessapprox \wt{\unif}_3^n(A) = \Pr_{J, x_\barJ, y_J}[(x_\barJ, y_J) \in A] = \Ex_{J, x_\barJ}\Bigl[\Pr_{y_J}[(x_\barJ, y_J) \in A]\Bigr] = \Ex_{J, x_\barJ}\left[\ens{3}^{J}(A_{x_\barJ})\right],<br />
\]<br />
and so again there must exist a particular $J$ and $x_\barJ$ such that $\ens{3}^J(A_{x_\barJ}) \gtrapprox \delta$. If we pass to the associated subspace, we get a set $A$ with \emph{equal-nondegenerate-slices} density at least $\delta$ (essentially), and ``$n$'' has only decreased to $r \approx n^{1/4}$, which is rather affordable (especially in light of the quantitative bounds we ultimately have in Theorem~\ref{thm:our-dhj}). Now we may try to find a combinatorial line in this new $A$.<br />
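One draw from the composite distribution can be sketched as follows, assuming the standard description of $\ens{3}$ on $[3]^r$ (a uniformly random nondegenerate slice, then a uniformly random string within it); the function names are ours:

```python
import random

def sample_ens3(r, rng):
    # Equal-nondegenerate-slices on [3]^r: pick a slice (a, b, c) with
    # a, b, c >= 1 uniformly (stars and bars), then a uniform string
    # with those symbol counts.
    cut1, cut2 = sorted(rng.sample(range(1, r), 2))
    s = [1] * cut1 + [2] * (cut2 - cut1) + [3] * (r - cut2)
    rng.shuffle(s)
    return s

def sample_corrupted(n, r, rng):
    # One draw from the "corrupted" distribution: uniform on barJ,
    # ens3 on J, where the r coordinates J are themselves random.
    J = rng.sample(range(n), r)
    z = [rng.choice((1, 2, 3)) for _ in range(n)]   # the x_barJ part
    for coord, sym in zip(J, sample_ens3(r, rng)):  # overwrite J with y_J
        z[coord] = sym
    return z, J

rng = random.Random(0)
z, J = sample_corrupted(100, 10, rng)
assert len(z) == 100 and set(z) <= {1, 2, 3}
# Nondegeneracy: all three symbols occur among the J-coordinates.
assert {z[c] for c in J} == {1, 2, 3}
```

The final assertion records the structural point: a draw from $\ens{3}$ is nondegenerate, so all three symbols always appear among the $J$-coordinates.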
<br />
\subsection{The density increment strategy} \label{sec:outline-incr}<br />
<br />
As we have seen, given a set $A$ of density at least $\delta$, it is not difficult to pass to a subspace and maintain density essentially $\delta$ (and we may even switch between probability distributions, if we wish). Our overall strategy for proving DHJ($3$) involves passing to subspaces on which $A$'s density \emph{increases} by a noticeable amount; specifically, from $\delta$ to $\delta + \Omega(\delta^3)$. (Here $\delta^3$ represents $\delta$ times the quantity $\delta^2 = \pdhj{2}{\delta}$ from Lemma~\ref{lem:p-sperner}.) More precisely, we will show the following result:<br />
\begin{theorem} \label{thm:incr3} Assume $n \geq 2 \uparrow^{(7)} (1/\delta)$.\noteryan{check this} If $\ens{3}^n(A) \geq \delta$, then either $A$ contains a combinatorial line, or we may pass to a subspace of dimension at least $\log^{(7)} n$ on which $\ens{3}(A) \geq \delta + \Omega(\delta^3)$.<br />
\end{theorem}<br />
\noindent Here and throughout $\log n$ denotes $\max\{2, \log_2 n\}$; $\log^{(7)}$ is this function iterated $7$ times.<br />
<br />
Using such a ``density increment'' theorem is a well-known strategy\noteryan{references}, and yields the equal-slices DHJ($3$) theorem as follows: Assuming $n$ is ``sufficiently large'' to begin with, we either find a combinatorial line or we pass to a $\log^{(7)}(n)$-dimensional subspace on which $A$ has density at least $\delta + \Omega(\delta^3)$. Repeating this $C/(7\delta^3)$ times, for $C$ a sufficiently large absolute constant, we either find a combinatorial line somewhere along the way, or pass to a $\log^{(C/\delta^3)}(n)$-dimensional subspace on which $A$ has density at least $2\delta$. (All the while we are assuming the dimension is still ``sufficiently large''.) Schematically,<br />
\[<br />
\delta \xrightarrow{\text{$C/\delta^3$ many $\log$s}} 2\delta.<br />
\]<br />
Iterating this scheme, we get:<br />
\[<br />
\delta \quad \xrightarrow{\text{$C/\delta^3$ many $\log$s}} \quad 2\delta \quad \xrightarrow{\text{$C/(2\delta)^3$ many $\log$s}} \quad 4\delta \quad \xrightarrow{\text{$C/(4\delta)^3$ many $\log$s}} \quad 8\delta \quad \longrightarrow \cdots<br />
\]<br />
Of course the density cannot exceed $1$, so eventually we \emph{must} find a combinatorial line, after at most $C/\delta^3 + C/(8\delta^3) + C/(64\delta^3) + \cdots \leq 2C/\delta^3$ many $\log$s. All the while we have assumed that the dimension is sufficiently large that the hypothesis of Theorem~\ref{thm:incr3} holds; to ensure this, it certainly suffices to ensure that the ``last'' dimension, $\log^{(2C/\delta^3)} n$, is at least $2 \uparrow^{(7)} (1/\delta)$. And this holds presuming the initial dimension, $n$, is at least $2 \upuparrows O(1/\delta^3)$. Thus we obtain the quantitative bound for $n$ given in our main Theorem~\ref{thm:our-dhj} (noting that the fourth-root arising in the previous section's reduction from DHJ($3$) to equal-slices DHJ($3$) is indeed negligible). <br />
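The bookkeeping behind the geometric sum of $\log$s can be checked mechanically; in this sketch the concrete values of $C$ and $\delta$ are arbitrary illustrative choices:

```python
from fractions import Fraction

def total_logs(C, delta):
    # Total logs consumed as the density doubles from delta up to 1,
    # spending C / density^3 logs per doubling round.
    total, d = Fraction(0), Fraction(delta)
    while d < 1:
        total += C / d ** 3
        d *= 2
    return total

# The geometric series C/delta^3 * (1 + 1/8 + 1/64 + ...) stays below 2C/delta^3.
C = 1
for q in (2, 5, 10, 100):
    delta = Fraction(1, q)
    assert total_logs(C, delta) <= 2 * C / delta ** 3
```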
<br />
\subsection{Extending length-\texorpdfstring{$2$}{2} lines to length-\texorpdfstring{$3$}{3} lines} \label{sec:outline-corr}<br />
We now look to establish the density increment Theorem~\ref{thm:incr3}. In this section we will describe our basic strategy for finding lines in a subset $A \subseteq [3]^n$ with $\ens{3}(A) = \delta$. We will show that if the strategy does not work, we can obtain the density increment stated in Theorem~\ref{thm:incr3}. The idea is to begin with what we already know: probabilistic DHJ($2$), i.e., Lemma~\ref{lem:p-sperner}. At a high level, this can give us a large number of length-$2$ lines in $A$, one of which we hope to extend to a length-$3$ line.<br />
<br />
To begin with, though, we will need to be able to apply probabilistic DHJ($2$), i.e.\ Lemma~\ref{lem:p-sperner}, to our $A$. It's not immediately clear how to do this: since $\ens{3}^n$ is supported on nondegenerate strings, we might have $\ens{3}(A) \geq \delta$ and yet $\ens{2}^n(A) = 0$. To evade this we can embed $\ens{2}$ on $n^{1/4}$ random coordinates into $\ens{3}^n$, as in Section~\ref{sec:outline-dist}. This will let us pass to a subspace on which $\ens{2}(A) \gtrapprox \delta$. But this is not enough; if $\ens{3}(A)$ becomes tiny, we'll have no hope of extending a length-$2$ line in $A$ into a length-$3$ line.\noteryan{not explained satisfactorily}<br />
<br />
Looking at the argument in Section~\ref{sec:outline-dist}, we see that for a random restriction $(J, x_\barJ)$ to some $n^{1/4}$ coordinates $J$, the expected $\ens{2}$-density of $A$ is essentially $\delta$ and so is the expected $\ens{3}$-density. What we would like is for both densities to be $\gtrapprox \delta$ \emph{simultaneously}. This can be arranged with a density increment trick. The idea is to look at the \emph{variance} of the $\ens{3}$-density of $A$ under the random restriction. There are two cases. First, suppose this variance is somewhat \emph{high}, say at least $\delta^{C}$ for some large constant $C$. Then it must happen that there is some restriction wherein the $\ens{3}$-density of $A$ increases a little above its mean, to at least $\delta + \delta^{C-1}$. In this case, we have a density increment which is almost good enough for Theorem~\ref{thm:incr3} (we can repeat the present argument some $\poly(1/\delta)$ more times to get the full density increment claimed). Otherwise, the variance is quite \emph{low}. This means that for ``almost every'' restriction $(J, x_\barJ)$, the $\ens{3}$-density of $A$ is very close to $\delta$. Since we know that the expected $\ens{2}$-density of $A$ is $\delta$, it follows that for a noticeable fraction of restrictions the $\ens{2}$-density is $\gtrapprox \delta$. Thus we can find a restriction with both $\ens{2}(A) \gtrapprox \delta$ and $\ens{3}(A) \approx \delta$, as desired. \noteryan{not explained very well}.<br />
<br />
Having arranged for both $\ens{2}(A), \ens{3}(A) \gtrapprox \delta$, we now come to a key idea in the proof. Suppose we pick a random $z \sim \ens{3}^n$. On one hand, we can think of $z$ as a \emph{point}, which will be in $A$ with probability at least $\delta$. On the other hand, we can think of $z$ as a \emph{line template over $[2]^n$} with wildcard symbol `$3$', and the probabilistic DHJ($2$) tells us that the associated length-$2$ line is in $A$ with probability at least $\delta^2$. \emph{If both events happen, we have a length-$3$ line in $A$:} $\{\chg{z}{3}{1}, \chg{z}{3}{2}, z\}$.<br />
<br />
To rephrase this, let $L \subseteq [3]^n$ be the set of line templates over $[2]^n$ for which the associated line is in $A$. If $L \cap A \neq \emptyset$ then we have found the desired line in $A$. Unfortunately, all we know is that $\ens{3}(L) \geq \delta^2$ and $\ens{3}(A) \geq \delta$, and there is plenty of room for a density-$\delta^2$ set and a density-$\delta$ set to not intersect.<br />
<br />
However, if $A$ does not intersect $L$ then we have that <br />
\[<br />
\frac{\ens{3}(A)}{\ens{3}(\overline{L})} \geq \frac{\delta}{1-\delta^2} \geq \delta + \delta^3,<br />
\]<br />
where $\overline{L}$ denotes $[3]^n \setminus L$. In other words, $A$ has a ``relative'' density increment on the set $\overline{L}$. Unfortunately, $\overline{L}$ is not a combinatorial subspace of $[3]^n$, so we have not established the density increment Theorem~\ref{thm:incr3}. Still, as we will see, $\overline{L}$ is not a completely arbitrary subset; it has just enough structure that we will be able to convert $A$'s density increment on it to a density increment on a subspace.<br />
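The displayed inequality holds for every $0 < \delta < 1$: the difference of the two sides works out to exactly $\delta^5/(1-\delta^2) \geq 0$. A quick exact-arithmetic check (the grid of test values is ours):

```python
from fractions import Fraction

for i in range(1, 100):
    d = Fraction(i, 100)
    lhs = d / (1 - d * d)       # delta / (1 - delta^2)
    rhs = d + d ** 3            # delta + delta^3
    assert lhs >= rhs
    # The gap is exactly delta^5 / (1 - delta^2).
    assert lhs - rhs == d ** 5 / (1 - d * d)
```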
<br />
\subsection{\texorpdfstring{$ab$}{ab}-insensitive sets} \label{sec:outline-insens}<br />
Recall that<br />
\[<br />
L = \{z \in [3]^n : \chg{z}{3}{1} \in A,\ \chg{z}{3}{2} \in A\},<br />
\]<br />
and hence<br />
\[<br />
\overline{L} = B_1 \cup B_2, \quad \text{ where } B_1 = \{z \in [3]^n : \chg{z}{3}{1} \not \in A\},\ B_2 = \{z \in [3]^n : \chg{z}{3}{2} \not \in A\}.<br />
\]<br />
Consider, e.g., $B_1$. It has an important structural property: it is ``$13$-insensitive''.<br />
<br />
\begin{definition} Let $a, b \in \Omega$ be distinct symbols. We say that $C \subseteq \Omega^n$ is \emph{$ab$-insensitive} if the following condition holds: $x \in C$ iff $\chg{x}{a}{b} \in C$.<br />
\end{definition}<br />
\begin{remark} This definition is symmetric in $a$ and $b$. It is perhaps easier to understand the condition as follows: ``altering some $a$'s to $b$'s and some $b$'s to $a$'s does not affect presence/absence in $C$''.<br />
\end{remark}<br />
\begin{remark} Suppose $C$ and $C'$ are $ab$-insensitive subsets of $\Omega^n$. Then so too are $\overline{C}$ and $C \cap C'$. Further, if $\sigma$ is a $d$-dimensional subspace of $\Omega^n$, and we view $C \cap \sigma$ as a subset of $\sigma \cong \Omega^d$, then $C \cap \sigma$ is $ab$-insensitive.<br />
\end{remark}<br />
<br />
<br />
It is clear from the definition that $B_1$ is a $13$-insensitive set and $B_2$ is a $23$-insensitive set. Let us temporarily pretend that there is no $B_1$ and we simply have $\overline{L} = B_2$. (This simplification will ultimately prove easy to patch up.) Thus we have a density increment of roughly $\delta^3$ for $A$ on the $23$-insensitive set $B_2$; our last task is to convert this to an equally good density increment on a combinatorial subspace of dimension at least $\log^{(7)} n$. It turns out to be convenient to carry out this task under the uniform distribution, rather than under the equal-slices distribution; we can arrange for this by another distribution-embedding argument.<br />
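The facts just used, that $B_1$ is $13$-insensitive and $B_2$ is $23$-insensitive, together with the closure properties from the remark above, can be verified exhaustively for small $n$. In this sketch the toy set $A$ and the helper names are ours:

```python
from itertools import product

def chg(x, b, a):
    # Change every occurrence of symbol b in x to a.
    return tuple(a if s == b else s for s in x)

def ab_insensitive(C, a, b, n):
    # C is ab-insensitive iff membership depends only on the string
    # obtained by rewriting every b as a.
    return all((x in C) == (chg(x, b, a) in C)
               for x in product((1, 2, 3), repeat=n))

n = 3
cube = set(product((1, 2, 3), repeat=n))
A = {x for x in cube if (x[0] + 2 * x[1] + x[2]) % 4 < 2}   # arbitrary toy set

B1 = {z for z in cube if chg(z, 3, 1) not in A}
B2 = {z for z in cube if chg(z, 3, 2) not in A}
assert ab_insensitive(B1, 1, 3, n)      # B1 is 13-insensitive
assert ab_insensitive(B2, 2, 3, n)      # B2 is 23-insensitive

# Closure under complement and under intersection with another
# 13-insensitive set (counting 2's ignores any 1 <-> 3 changes).
C2 = {x for x in cube if x.count(2) == 1}
assert ab_insensitive(cube - B1, 1, 3, n)
assert ab_insensitive(B1 & C2, 1, 3, n)
```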
<br />
Sets which are $ab$-insensitive are very useful, for the following reason: for such sets, we can ``boost'' DHJ($k$) to DHJ($k+1$), and similarly for the multidimensional version. Suppose the $23$-insensitive set $B_2$ has $\unif_3(B_2) \geq \eta$. By ``conditioning on the location of the $3$'s'', it is easy to find a restriction under which $\unif_2(B_2) \geq \eta$ as well. Hence we can apply the multidimensional DHJ($2$) theorem to find a ``length-$2$'' $d$-dimensional combinatorial subspace $\sigma$ contained in $B_2$. This $\sigma$ will only be an isomorphic copy of $\{1,2\}^d$, not $\{1,2,3\}^d$. But the key point is that $B_2$ is $23$-insensitive; hence $\chg{z}{2}{3} \in B_2$ for all $z \in \sigma$, and we conclude that the full ``length-$3$'' $d$-dimensional subspace (copy of $\{1,2,3\}^d$) is contained in $B_2$ as well.<br />
<br />
<br />
\subsection{Completing the proof: partitioning \texorpdfstring{$ab$}{ab}-insensitive sets} \label{sec:outline-partition}<br />
We now come to perhaps the key step in the proof: We show that a $23$-insensitive set can be almost completely partitioned into a disjoint union $\calS$ of $d$-dimensional combinatorial subspaces, where $d \approx \log^{(7)} n$. Thus if $A$ has a density increment on the $23$-insensitive set $B_2$, it must have an equally good density increment on one of the $d$-dimensional subspaces in $\calS$.<br />
<br />
Let us explain this slightly more carefully. Suppose we have some $23$-insensitive set $B \subseteq [3]^n$ and we wish to partition it into a collection of disjoint $d$-dimensional subspaces $\calS$, along with an ``error'' set $E$ satisfying $\unif_3(E) \leq \delta^3/100$, say. If $\unif_3(B) \leq \delta^3/100$ already, we are done. Otherwise, using the density and $23$-insensitivity of $B$, we use the multidimensional DHJ($2$) theorem to find a $d$-dimensional subspace contained in $B$. (We will describe later why we may take $d \approx \log^{(7)} n$.)<br />
<br />
We can do even better than simply finding a single $d$-dimensional subspace in $B$. By an argument similar to Proposition~\ref{prop:mdhj-from-dhj}, we can ensure that there is a set of coordinates $J \subseteq [n]$ with $\abs{J} = m = \mdhj{2}{\delta^3/100}{d}$ and a particular $d$-dimensional subspace $\sigma \subseteq [3]^J$ such that $\sigma \times \{y\} \subseteq B$ for at least some $\tau = \tau(\delta, d) > 0$ fraction of all strings $y \in [3]^{\barJ}$. These sets $\sigma \times \{y\}$ are disjoint $d$-dimensional subspaces that we may remove from $B$ and place into the collection $\calS$. <br />
<br />
We now wish to iterate this argument, but there is a catch; having removed the sets $\sigma \times \{y\}$ from $B$, it is likely no longer $23$-insensitive. Fortunately, its $23$-insensitivity is only spoiled on the $m$ coordinates $J$:<br />
\begin{definition} Let $a, b \in \Omega$ be distinct symbols. We say that $C \subseteq \Omega^n$ is \emph{$ab$-insensitive on $I \subseteq [n]$} if the following condition holds: given a string $x$, altering some $a$'s to $b$'s and some $b$'s to $a$'s \emph{within coordinates $I$} does not affect $x$'s presence/absence in $C$. If $I$ is unspecified we take it to be all of~$[n]$.<br />
\end{definition}<br />
We can thus remove a collection of disjoint $d$-dimensional subspaces from $B$ of probability mass at least $\tau(\delta,d) 3^{-m}$, at the expense of making $B$ only $23$-insensitive on coordinates $[n] \setminus J$. We now iterate the argument. Since $\abs{J}$ depends only on $\delta$ and $d$, we may assume that $n \gg \abs{J}$. Thus $B$ still has plenty of $23$-insensitive coordinates, and we can find another $d$-dimensional subspace contained in $B$, localized to some new $m$ coordinates $J_2$. This lets us remove another collection of $d$-dimensional subspaces from $B$ of mass at least $\tau(\delta,d) 3^{-m}$. Since this quantity depends only on $\delta$ and $d$, we need only iterate some $T(\delta, d)$ many times before $B$'s total density drops below $\delta^3/100$. And we may always continue so long as we still have $m$ coordinates on which $B$ is $23$-insensitive; we ensure this by assuming $n \geq T(\delta, d)m$. Using the Gunderson-R\"{o}dl-Sidorenko theorem, roughly speaking $m = \mdhj{2}{\delta^3/100}{d}$ is doubly-exponential in $d$, and so $\tau(\delta,d) 3^{-m}$ is triply-exponential in $d$. In fact, we need to iterate this argument once more, leading to a six-fold exponential, because $\overline{L}$ was not just the single $23$-insensitive set $B_2$ but was rather $B_1 \cup B_2$. In the end, we are able to safely take $d \approx \log^{(7)} n$, as needed to prove Theorem~\ref{thm:incr3}.\noteryan{Ran out of steam here; exposition quality got low}</div>
<hr />
<div>\section{Sperner's theorem} \label{sec:sperner}<br />
<br />
The first nontrivial case of Szemer\'edi's theorem is the case of arithmetic progressions of length $3$. However, for the density Hales--Jewett theorem even the case $k=2$ is interesting. DHJ($2$) follows from a basic result in extremal combinatorics: Sperner's theorem. \ignore{This result has been known for a long time. \noteryan{amusingly, published after vdW's theorem}} In this section we review a standard probabilistic proof of Sperner's theorem. Besides suggesting the equal-slices distribution, this proof easily gives the $k=2$ case of the probabilistic Density Hales--Jewett theorem, a key component in our proof of DHJ($3$).<br />
<br />
To investigate DHJ($2$) it is slightly more convenient to take the alphabet to be $\Omega = \{0,1\}$. Then a combinatorial line in $\{0,1\}^n$ is a pair of distinct binary strings $x$ and $y$ such that to obtain $y$ from $x$ one changes some $0$'s to $1$'s. If we think of the strings $x$ and $y$ as the indicators of two subsets $X$ and $Y$ of $[n]$, then this is saying that $X$ is a proper subset of $Y$. Therefore, when $k=2$ we can formulate the density Hales--Jewett theorem as follows: there exists $\dhj{2}{\delta}$ such that for $n \geq \dhj{2}{\delta}$, if $\calA$ is a collection of at least $\delta 2^n$ subsets of $[n]$, then there must exist two distinct sets $X,Y \in \calA$ with $X \subset Y$. In the language of combinatorics, this is saying that $\calA$ is \emph{not} an antichain. (Recall that an \textit{antichain} is a collection $\mathcal{A}$ of sets such that no set in $\mathcal{A}$ is a proper subset of any other.) Sperner's theorem gives something slightly stronger: a precise upper bound on the cardinality of any antichain.<br />
<br />
\begin{named}{Sperner's theorem} \label{thm:sperner}<br />
For every positive integer $n$, the largest cardinality of any antichain of subsets of $[n]$ is $\binom{n}{\lfloor n/2\rfloor}$.<br />
\end{named}<br />
As the bound suggests, the best possible example is the collection of all subsets of $[n]$ of size $\lfloor n/2\rfloor$. (It can be shown that this example is essentially unique: the only other example is to take all sets of size $\lceil n/2\rceil$, and even this is different only when $n$ is odd.) It is well known that $\binom{n}{\lfloor n/2\rfloor}2^{-n} \leq 1/\sqrt{n}$ for all $n$; hence Sperner's theorem implies that one may take $\dhj{2}{\delta} = 4/\delta^2$.\noteryan{The constant can be sharpened to $2/\pi$ for small $\delta$, of course. Also, $4/\delta^2$ is technically not an integer.} Let us present a standard probabilistic proof of Sperner's theorem (see, e.g.,~\cite{Spe90}):<br />
<br />
\begin{proof} (\emph{Sperner's theorem.}) Consider the following way of choosing a random subset of $[n]$. First, we choose, uniformly at random, a permutation $\tau$ of $[n]$. Next, we choose, uniformly at random and independently of $\tau$, an integer $s$ from the set $\{0,1,\dots,n\}$. Finally, we set $X=\{\tau(1),\dots,\tau(s)\}$ (where this is interpreted as the empty set if $s=0$). <br />
<br />
Let $\mathcal{A}$ be an antichain. Then the probability that a set $X$ that is chosen randomly in the above manner belongs to $\mathcal{A}$ is at most $1/(n+1)$, since whatever $\tau$ is at most one of the $n+1$ sets $\{\tau(1),\dots,\tau(s)\}$ can belong to $\mathcal{A}$. <br />
<br />
However, what we are really interested in is the probability that $X\in\mathcal{A}$ if $X$ is chosen \textit{uniformly} from all subsets of $[n]$. Let us write $\eqs{2}[X]$ for the probability that we choose $X$ according to the distribution defined above, and $\unif_2[X]$ for the probability that we choose it uniformly. Then $\unif_2[X]=2^{-n}$ for every $X$, whereas $\eqs{2}[X]=\frac 1{n+1}\binom n{\abs{X}}^{-1}$, since there is a probability $1/(n+1)$ that $s = \abs{X}$, and all sets of size $\abs{X}$ are equally likely to be chosen. Therefore, the largest ratio of $\unif_2[X]$ to $\eqs{2}[X]$ occurs when $\abs{X}=\lfloor n/2\rfloor$ or $\lceil n/2\rceil$.\noteryan{by unimodality of binomial coefficients} In this case, the ratio is $(n+1)\binom n{\lfloor n/2\rfloor}2^{-n}$.<br />
<br />
Since $\eqs{2}(\mathcal{A})\leq 1/(n+1)$, it follows that $2^{-n}\abs{\mathcal{A}}=\unif_2(\mathcal{A})\leq\binom n{\lfloor n/2\rfloor}2^{-n}$, which proves the theorem. <br />
\end{proof}<br />
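For small $n$ the distribution $\eqs{2}$ can be tabulated exactly by enumerating all pairs $(\tau, s)$, confirming the formula $\eqs{2}[X]=\frac{1}{n+1}\binom{n}{\abs{X}}^{-1}$ and the location of the extremal ratio; a sketch (helper names ours):

```python
from itertools import permutations
from fractions import Fraction
from math import comb, factorial

n = 4
counts = {}
for tau in permutations(range(n)):
    for s in range(n + 1):
        X = frozenset(tau[:s])          # X = {tau(1), ..., tau(s)}
        counts[X] = counts.get(X, 0) + 1

total = factorial(n) * (n + 1)          # number of (tau, s) pairs
for X, c in counts.items():
    # eqs_2[X] = 1 / ((n+1) * binom(n, |X|))
    assert Fraction(c, total) == Fraction(1, (n + 1) * comb(n, len(X)))

# The ratio unif_2[X] / eqs_2[X] is maximized at the middle layer.
ratios = {len(X): Fraction(1, 2 ** n) / Fraction(c, total)
          for X, c in counts.items()}
assert max(ratios, key=ratios.get) in (n // 2, (n + 1) // 2)
```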
<br />
As one sees from the proof, it is very natural to consider different probability distributions on $\{0,1\}^n$, or equivalently on the set of all subsets of $[n]$. The first is the uniform distribution $\unif_2$, which is forced on us by the way the question is phrased. The second is what we called $\eqs{2}$; the reader may check that this is precisely the ``equal-slices'' distribution $\eqs{2}^n$ described in Section~\ref{sec:pdhj}. After seeing the above proof, one might take the attitude that the ``correct'' statement of Sperner's theorem is that if $\calA$ is an antichain, then $\eqs{2}(\calA) \leq 1/(n+1)$, and that the statement given above is a slightly artificial and strictly weaker consequence. <br />
<br />
\subsection{Probabilistic DHJ(\texorpdfstring{$2$}{2})} \label{sec:prob-sperner}<br />
<br />
Indeed, what the proof (essentially) establishes is the ``equal-slices DHJ($2$) theorem''; i.e., that in Theorem~\ref{thm:edhj} one may take $\edhj{2}{\delta} = 1/\delta$.\noteryan{minus $1$, even} We say ``essentially'' because of the small distinction between the distribution $\eqs{2}^n$ used in the proof and the distribution $\ens{2}^n$ in the statement. It will be convenient in this introductory discussion of Sperner's theorem to casually ignore this. We will introduce $\ens{k}^n$ and be more careful about its distinction with $\eqs{k}^n$ in Section~\ref{sec:eqs}.<br />
<br />
To further bolster the claim that $\eqs{2}^n$ is natural in this context we will show an easy proof of the \emph{probabilistic} DHJ($2$) theorem. Looking at the statement of Theorem~\ref{thm:pdhj}, the reader will see it requires defining $\eqs{3}^n$; we will make this definition in the course of the proof.<br />
\begin{lemma} \label{lem:p-sperner} For every real $\delta>0$, every $A \subseteq \{0,1\}^n$ with $\eqs{2}^n$-density at least $\delta$ satisfies<br />
\[<br />
\Pr_{\lambda \sim \eqs{3}^n}[\lambda \subseteq A] \geq \delta^2.<br />
\]<br />
\end{lemma}<br />
<br />
\noindent Note that there is no lower bound necessary on $n$; this is because a template $\lambda \sim \eqs{3}^n$ may be degenerate. <br />
<br />
\begin{proof}<br />
As in our proof of Sperner's theorem, let us choose a permutation $\tau$ of $[n]$ uniformly at random. Suppose we now choose $s \in \{0, 1, \dotsc, n\}$ and also $t \in \{0, 1, \dotsc, n\}$, each uniformly at random and \emph{independently}. Let $x(\tau,s) \in \{0,1\}^n$ denote the string which has $1$'s in coordinates $\tau(1), \dotsc, \tau(s)$ and $0$'s in coordinates $\tau(s+1), \dotsc, \tau(n)$, and similarly define $x(\tau,t)$. These two strings both have the distribution $\eqs{2}^n$, but are not independent.<br />
<br />
A key observation is that $\{x(\tau,s), x(\tau,t)\}$ is a combinatorial line in $\{0,1\}^n$, unless $s = t$, in which case the two strings are equal. The associated line template is $\lambda \in \{0,1,\wild\}^n$, with<br />
\[<br />
\lambda_{\tau(i)} = \begin{cases}<br />
1 & \text{if $i \leq \min\{s,t\}$,}\\<br />
\wild & \text{if $\min\{s,t\} < i \leq \max\{s,t\}$,} \\<br />
0 & \text{if $i > \max\{s,t\}$.}<br />
\end{cases}<br />
\]<br />
This gives the definition of how to draw $\lambda \sim \eqs{3}^n$ (with alphabet $\{0,1,\wild\}$). Note that $\lambda$ is a degenerate template with probability $1/(n+1)$.<br />
<br />
Assuming $\Pr_{x \sim \eqs{2}^n}[x \in A] \geq \delta$, our goal is to show that $\Pr[x(\tau,s), x(\tau,t) \in A ] \geq \delta^2$. But<br />
\begin{align*}<br />
\Pr[x(\tau,s), x(\tau,t) \in A] &= \Ex_{\tau} \Bigl[\Pr_{s,t} [x(\tau,s), x(\tau,t) \in A]\Bigr] & \\<br />
&= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A] \Pr_{t}[x(\tau,t) \in A]\Bigr] &\text{(independence of $s$, $t$)} \\<br />
&= \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]^2\Bigr] & \\<br />
&\geq \Ex_{\tau} \Bigl[\Pr_{s} [x(\tau,s) \in A]\Bigr]^2 & \text{(Cauchy--Schwarz)}\\<br />
&= \Pr_{\tau,s} [x(\tau,s) \in A]^2, & <br />
\end{align*}<br />
and $x(\tau,s)$ has the distribution $\eqs{2}^n$, completing the proof.<br />
\end{proof}<br />
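Since the proof uses only the independence of $s$ and $t$ plus Cauchy--Schwarz, the conclusion $\Pr[\text{both points in } A] \geq \Pr[\text{point in } A]^2$ holds for \emph{every} set $A$, degenerate templates included. One can confirm this by exact enumeration for small $n$; the toy set $A$ and helper names below are ours:

```python
from itertools import permutations, product
from fractions import Fraction

n = 4
A = {x for x in product((0, 1), repeat=n) if sum(x) in (0, 2, 3)}  # arbitrary

def x_of(tau, s):
    # The string with 1's in coordinates tau(1), ..., tau(s), 0's elsewhere.
    x = [0] * n
    for i in range(s):
        x[tau[i]] = 1
    return tuple(x)

single = pair = outcomes = 0
for tau in permutations(range(n)):
    hits = sum(x_of(tau, s) in A for s in range(n + 1))
    single += hits
    pair += hits * hits          # independent choices of s and t for this tau
    outcomes += n + 1

p_point = Fraction(single, outcomes)            # Pr[x(tau, s) in A]
p_line = Fraction(pair, outcomes * (n + 1))     # Pr[both points in A]
assert p_line >= p_point ** 2
```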
<br />
Having proved the probabilistic DHJ($2$) theorem rather easily, an obvious question is whether we can generalize the proof to $k = 3$. The answer seems to be no; there is no obvious way to generate random length-$3$ lines in which the points are independent, or even partially independent as in the previous proof. Nevertheless, the equal-slices distribution remains important for our proof of the general case of DHJ($k$); in Section~\ref{sec:eqs} we shall introduce both $\eqs{k}^n$ and $\ens{k}^n$ and prove some basic facts about them.</div>
<hr />
<div>\section{Basic concepts} \label{sec:concepts}<br />
<br />
In this section we introduce some basic concepts around the density Hales--Jewett theorem, and also state formally the extensions of DHJ($k$) that we will prove.<br />
<br />
\subsection{Points, lines, and line templates} \label{sec:defs}<br />
In the density Hales--Jewett theorem the arithmetic properties of $[k] = \{1, 2, \dotsc, k\}$ play no role. It is only important that this set has $k$ elements; this dictates the length of a combinatorial line. Therefore we can equivalently replace $[k]$ by any other \emph{alphabet} $\Omega$ of $k$ \emph{symbols}, treating the elements of $\Omega^n$ as \emph{strings}. We will typically use letters $a$, $b$,~\ldots for symbols, $x$, $y$,~\ldots for strings/points, and $i$, $j$,~\ldots for coordinates; e.g., $x_j = b$ means the $j$th coordinate of string $x$ is symbol $b$.<br />
<br />
We have already seen that it was convenient to use a different alphabet $\Omega$ when we took $\Omega = \{0, 1, \dots, k-1\}$ to deduce van der Waerden's theorem from Hales--Jewett. A more interesting observation is that a combinatorial line template over $[k]^n$, i.e.\ a string in $([k] \cup \{\wild\})^n$, is again just a string over an alphabet of size $k+1$. If we use the symbol `$k + 1$' in place of `$\wild$', we see that \emph{a point in $[k+1]^n$ can be interpreted as a line in $[k]^n$}. For example, the point $13332213 \in [3]^8$ corresponds to the length-$2$ line $\{1\bs{1}\bs{1}\bs{1}221\bs{1}, 1\bs{2}\bs{2}\bs{2}221\bs{2}\} \subset [2]^8$. We introduce the notation <br />
$<br />
\chg{x}{b}{a}<br />
$<br />
for the string formed by changing all occurrences of symbol $b$ to symbol $a$ in string $x$. Thus a line template $\lambda \in ([3] \cup \{\wild\})^n$ corresponds to the line $\{\chg{\lambda}{\wild}{1}, \chg{\lambda}{\wild}{2}, \chg{\lambda}{\wild}{3}\} \subseteq [3]^n$.<br />
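The $\chg{x}{b}{a}$ operation is just a global symbol substitution; the following sketch reproduces the example above, representing strings over $[3]$ as Python strings:

```python
def chg(x, b, a):
    # Change all occurrences of symbol b in the string x to symbol a.
    return x.replace(b, a)

# The point 13332213 in [3]^8, read as a line template with wildcard '3',
# gives the length-2 line from the text:
template = "13332213"
assert chg(template, "3", "1") == "11112211"
assert chg(template, "3", "2") == "12222212"

# Read instead as a template over [3]^8, the template itself would serve as
# the third point of a length-3 line (chg with a = b is the identity):
line3 = [chg(template, "3", a) for a in "123"]
assert line3[2] == template
```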
<br />
Let us give some formal definitions, taking care of the somewhat tedious possibility of degenerate line templates:<br />
\begin{definition} We say that $x \in \Omega^n$ is a \emph{degenerate string} if it is missing at least one symbol from~$\Omega$. A \emph{line template over $\Omega^n$ with wildcard symbol $c$} ($\not \in \Omega$) is a string $\lambda \in (\Omega \cup \{c\})^n$. We say $\lambda$ is a \emph{degenerate line template} if $\lambda$ contains no `$c$' symbols. If $\lambda$ is nondegenerate, we associate to it the \emph{combinatorial line} $\{\chg{\lambda}{c}{a} : a \in \Omega\} \subseteq \Omega^n$, which has cardinality $\abs{\Omega}$.<br />
\end{definition}<br />
We will often blur the distinction between a line template and a line; however, please note that for us a ``line'' is always nondegenerate, whereas a line template may not be.<br />
<br />
\subsection{Subspaces}<br />
As mentioned, we will find it useful to think of lines over $[k]^n$ as points in $[k+1]^n$, with $k+1$ serving as the wildcard symbol. It's natural then to ask for an interpretation of a point in, say, $[k+2]^n$. This can be thought of as a combinatorial \emph{plane} over $[k]^n$. For simplicity, let's consider $k = 3$ and replace $[k+2]$ with $[3] \cup \{\wild_1, \wild_2\}$. A string in $([3] \cup \{\wild_1, \wild_2\})^n$ such as <br />
\[<br />
\sigma = 1 3 \wild_1 \wild_2 2 1 \wild_1 \wild_1 3 \wild_2 1<br />
\]<br />
(here $n = 11$) corresponds to the following $9$-point combinatorial plane:<br />
\[<br />
\begin{array}{rcl}<br />
\phantom{.[3]^{11} \supset} \{13\bs{1}\bs{1}21\bs{1}\bs{1}3\bs{1}1,&13\bs{2}\bs{1}21\bs{2}\bs{2}3\bs{1}1,&13\bs{3}\bs{1}21\bs{3}\bs{3}3\bs{1}1, \\<br />
13\bs{1}\bs{2}21\bs{1}\bs{1}3\bs{2}1,&13\bs{2}\bs{2}21\bs{2}\bs{2}3\bs{2}1,&13\bs{3}\bs{2}21\bs{3}\bs{3}3\bs{2}1, \\<br />
13\bs{1}\bs{3}21\bs{1}\bs{1}3\bs{3}1,&13\bs{2}\bs{3}21\bs{2}\bs{2}3\bs{3}1,&13\bs{3}\bs{3}21\bs{3}\bs{3}3\bs{3}1\} \subset [3]^{11}. \\<br />
\end{array}<br />
\]<br />
More generally:<br />
\begin{definition} For $d \geq 1$, a \emph{$d$-dimensional subspace template over $\Omega^n$ with wildcard symbols $c_1, \dots, c_d$} (distinct symbols not in $\Omega$) is a string $\sigma \in (\Omega \cup \{c_1, \dots, c_d\})^n$. We say $\sigma$ is a \emph{degenerate template} if $\sigma$ is missing any of the $c_i$ symbols. If $\sigma$ is nondegenerate, we associate to it the \emph{$d$-dimensional combinatorial subspace} $\{\chg{\sigma}{(c_1, \dotsc, c_d)}{(a_1, \dotsc, a_d)} : (a_1, \dotsc, a_d) \in \Omega^d\} \subseteq \Omega^n$, which has cardinality $\abs{\Omega}^d$.<br />
\end{definition}<br />
\noindent Here we have introduced the extended notation $\chg{x}{(c_1, \dotsc, c_d)}{(a_1, \dotsc, a_d)}$ for the string in $\Omega^n$ formed by changing all occurrences of symbol $c_i$ to symbol $a_i$ in string $x$, for all $i \in [d]$.<br />
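The $9$-point plane example above can be reproduced mechanically. In the following illustrative Python sketch (our own, with the wildcard symbols $\wild_1, \wild_2$ encoded as the letters `A`, `B`), we enumerate all $\abs{\Omega}^d$ points of a subspace template:

```python
from itertools import product

def subspace_points(template, wildcards, alphabet):
    """All |alphabet|**d points of the combinatorial subspace given by a
    d-dimensional subspace template; wildcards is the tuple (c_1, ..., c_d).
    Each wildcard is a single character not appearing in the alphabet."""
    points = []
    for assignment in product(alphabet, repeat=len(wildcards)):
        x = template
        for c, a in zip(wildcards, assignment):
            x = x.replace(c, a)  # change all occurrences of c_i to a_i
        points.append(x)
    return points
```

Applied to the template $\sigma = 13\wild_1\wild_2 21\wild_1\wild_1 3\wild_2 1$ (i.e., `"13AB21AA3B1"`), this produces the nine points displayed above.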
<br />
The \emph{multidimensional} density Hales--Jewett theorem (also proved by Furstenberg and Katznelson~\cite{FK91}) states that dense subsets of $[k]^n$ contain not just lines but combinatorial subspaces:<br />
\begin{named}{Multidimensional density Hales--Jewett theorem} \label{thm:mdhj} For every real $\delta > 0$ and every pair of positive integers $k$ and $d$, there exists a positive integer $\mdhj{k}{\delta}{d}$ such that every subset of $[k]^n$ of density at least $\delta$ contains a $d$-dimensional combinatorial subspace, provided $n \geq \mdhj{k}{\delta}{d}$.<br />
\end{named}<br />
<br />
Indeed, it is easy to deduce the multidimensional DHJ($k$) from the basic DHJ($k$):<br />
\begin{proposition} \label{prop:mdhj-from-dhj} For a given $k$, the multidimensional density Hales--Jewett theorem follows from the density Hales--Jewett theorem.<br />
\end{proposition}<br />
\begin{proof}<br />
By induction on $d$; the base case $d = 1$ is DHJ($k$) itself. Suppose we have proven the multidimensional DHJ($k$) theorem for $d-1$, and let $A \subseteq [k]^n$ have density at least $\delta$. Let $m = \mdhj{k}{\delta/2}{d-1}$, and write a generic string $z \in [k]^n$ as $(x,y)$, where $x \in [k]^m$, $y \in [k]^{n-m}$. Call a string $y \in [k]^{n-m}$ ``good'' if $A_y = \{x \in [k]^m : (x,y) \in A\}$ has density at least $\delta/2$ within $[k]^m$. Let $G \subseteq [k]^{n-m}$ be the set of good $y$'s. By a simple counting argument, the density of $G$ within $[k]^{n-m}$ must be at least $\delta/2$; otherwise, $A$ would not have density at least $\delta$ in $[k]^n$. <br />
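The counting argument can be spelled out in one line (a sketch, in the notation just introduced, with $x$ uniform on $[k]^m$ and $y$ uniform on $[k]^{n-m}$):

```latex
\[
\delta \;\leq\; \Pr_{(x,y)}[(x,y) \in A]
\;=\; \Ex_y\bigl[\unif(A_y)\bigr]
\;\leq\; \Pr_y[y \in G] \cdot 1 \;+\; \Pr_y[y \notin G] \cdot \tfrac{\delta}{2}
\;\leq\; \unif(G) + \tfrac{\delta}{2},
\]
```

whence $\unif(G) \geq \delta/2$.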
<br />
By induction, for any good $y$ the set $A_y$ contains a $(d-1)$-dimensional combinatorial subspace. There are at most $M = (k+d-1)^m$ such subspaces (since each is defined by a template in $[k + (d-1)]^{m}$). Hence there must be a particular subspace $\sigma \subseteq [k]^m$ such that the set <br />
\[<br />
G_\sigma = \{y \in [k]^{n-m} : (x,y) \in A\ \forall x \in \sigma\}<br />
\]<br />
has density at least $(\delta/2)/M$ within $[k]^{n-m}$. Finally, provided $n \geq m + \dhj{k}{\delta/(2M)}$, we conclude from DHJ($k$) that $G_\sigma$ contains a combinatorial line, $\lambda$. Now $\sigma \times \lambda \subseteq [k]^n$ is the desired $d$-dimensional subspace contained in $A$.<br />
\end{proof}<br />
<br />
The argument in Proposition~\ref{prop:mdhj-from-dhj} gives a (seemingly) poor bound for large $d$. We note that in the $k = 2$ case, Gunderson, R\"odl, and Sidorenko~\cite{GRS99} have established a much superior bound:<br />
\begin{named}{Gunderson--R\"odl--Sidorenko theorem} \label{thm:grs} One may take $\mdhj{2}{\delta}{d} = (10d)^{d2^d} \cdot (1/\delta)^{2^d}$.<br />
\end{named}<br />
\noindent(To see this from~\cite{GRS99}, combine that paper's Theorem~4.2 and equation~(8), and note that its ``$c(d)$'' is at most $(10d)^d$. The authors also showed this bound is fairly close to being sharp.)<br />
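As a sanity check on the algebra behind this bound (our own illustrative Python, with $c(d)$ replaced by its upper bound $(10d)^d$): setting $n$ to exactly $(10d)^{d2^d}(1/\delta)^{2^d}$ makes the subspace-free density bound $(10d)^d \cdot n^{-1/2^d}$ equal to $\delta$, so any larger $n$ forces a $d$-dimensional subspace.

```python
import math

def grs_bound(d, delta):
    """The stated bound (10d)^(d*2^d) * (1/delta)^(2^d)."""
    return (10 * d) ** (d * 2 ** d) * (1 / delta) ** (2 ** d)

def density_threshold(d, n):
    """Upper bound (10d)^d * n^(-1/2^d) on the density of a
    d-dimensional-subspace-free subset of [2]^n."""
    return (10 * d) ** d * n ** (-1 / 2 ** d)
```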
\ignore{<br />
\begin{theorem}<br />
Let $A \subseteq [2]^n$ have (uniform) density at least $\delta$, and assume $n \geq (10d)^{d2^d} \cdot (1/\delta)^{2^d}$. Then $A$ contains a nondegenerate $d$-dimensional subspace.<br />
\end{theorem}<br />
\begin{proof}<br />
Theorem 4.2 and equation~(8) of~\cite{GRS99} taken together state that the maximum density of a $d$-dimensional subspace-free subset of $[2]^n$ is at most <br />
\[<br />
c(d) \cdot n^{-1/2^d}, \quad \text{where } c(d) = 10^d 2^{-1/2^{d-1}}d^{d - 1/2^d}.<br />
\]<br />
Note that $c(d) < (10d)^d$. Hence it suffices to check that our lower bound on $n$ implies $\delta \geq (10d)^d n^{-1/2^d}$.<br />
\end{proof}}<br />
<br />
\subsubsection{Passing to a subspace} \label{sec:pass-subspace}<br />
A useful aspect of subspaces is that they ``look like'' the original space.\noteryan{I plagiarized this phrase from Walters.} More precisely, a $d$-dimensional subspace $\{\chg{\sigma}{(c_1, \dotsc, c_d)}{(a_1, \dotsc, a_d)} : (a_1, \dotsc, a_d) \in \Omega^d\} \subseteq \Omega^n$ is isomorphic to $\Omega^d$, in the sense that $d'$-dimensional subspaces of $\sigma$ map to $d'$-dimensional subspaces of $\Omega^n$. This identification is easiest to see when the subspace template $\sigma$ has exactly one copy of each wildcard symbol $c_i$. In this case, writing $J$ for the set of $d$ wildcard coordinates, we will call $\sigma \cong \Omega^J$ a \emph{restriction} to the coordinates $J$ (see Definition~\ref{def:restriction}).<br />
<br />
This isomorphism is useful because it allows us to ``pass to subspaces'' when looking for a combinatorial line (or subspace) in a subset $A \subseteq [k]^n$. By this we mean the following: Given a $d$-dimensional subspace $\sigma$ of $[k]^n$, ``passing to $\sigma$'' means considering $A' = A \cap \sigma$ as a subset of $[k]^d$. Note that if we manage to find a line (or subspace) contained in $A'$, this maps back to a line (subspace) contained in $A$. This is helpful if $A'$ has some kind of improved ``structure''; e.g., a higher density (within $[k]^d$). Indeed, finding density increments on subspaces is the basic strategy for our proof of DHJ($k$); see Section~\ref{sec:outline-incr}. We have also seen this technique of passing to a restriction already, in the proof of Proposition~\ref{prop:mdhj-from-dhj}.<br />
<br />
When passing to a subspace, we will often not make a distinction between $A$ and $A'$, writing $A$ for both. <br />
<br />
\subsection{Probabilistic DHJ} \label{sec:pdhj}<br />
To show that a certain combinatorial object exists, the Probabilistic Method suggests choosing it at random. Could this work for the density Hales--Jewett theorem? A difficulty is that it is not immediately clear how one should choose a random combinatorial line. The most obvious way to choose a line over, say, $[3]^n$ is to simply choose a line template $\lambda \in [4]^n$ uniformly at random (taking care of the extremely unlikely chance of a degenerate template). Unfortunately this has no real chance of working because the uniform distribution on lines is not ``compatible'' with the uniform distribution on points. <br />
<br />
Before clarifying this point, let us introduce some notation. We will be concerned with probability distributions on the set $\Omega^n$, where $\Omega$ is a finite alphabet. We will use letters such as $\distra$, $\distrb$ for generic such distributions, and if $A \subseteq \Omega^n$ we write $\distra(A) = \Pr_{x \sim \distra}[x \in A]$ for the $\distra$-density of $A$. We will often be concerned with product distributions on $\Omega^n$; if $\pi$ is a distribution on $\Omega$ we write $\pi^{\ot n}$ for the associated product distribution on $\Omega^n$. We reserve $\unif$ for the uniform distribution, writing $\unif_\Omega^{\ot n}$ for the uniform distribution on $\Omega^n$ and simply $\unif_k^{\ot n}$ when $\Omega = [k]$. We may abbreviate this further to simply $\unif_k$ or even $\unif$ if the context is clear. <br />
<br />
Let us now return to the difficulty with choosing lines uniformly at random. To clarify, suppose we choose $\lambda \in [4]^n$ according to the uniform distribution $\unif_4^{\ot n}$. Then the distribution on the point $\chg{\lambda}{4}{1} \in [3]^n$ will be a product distribution $\prd^{\otimes n}$ in which symbol $1$ has probability $1/2$ and symbols $2$ and $3$ have probability $1/4$ each. However, this distribution is very ``far'' from the uniform distribution on $[3]^n$, meaning that even if $\unif_3^{\ot n}(A) \geq \delta$, the probability that $\chg{\lambda}{4}{1} \in A$ may be a negligibly small function of $n$. <br />
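The claimed per-coordinate marginal can be verified by brute force for small $n$ (an illustrative sketch; the function name is ours):

```python
from itertools import product
from collections import Counter

def coord_marginal(n, coord):
    """Exact distribution of one coordinate of chg(lam, '4', '1') when lam
    is drawn uniformly from [4]^n (n small enough to enumerate all 4^n)."""
    counts = Counter()
    for lam in product("1234", repeat=n):
        point = tuple("1" if s == "4" else s for s in lam)
        counts[point[coord]] += 1
    total = 4 ** n
    return {a: c / total for a, c in counts.items()}
```

Each coordinate of $\chg{\lambda}{4}{1}$ is $1$ with probability $1/2$ and each of $2$, $3$ with probability $1/4$.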
<br />
Even for other natural distributions on lines, it seems inevitable that $\chg{\lambda}{4}{1}$ will ``have more $1$'s than $2$'s or $3$'s'', and similarly for $\chg{\lambda}{4}{2}$, $\chg{\lambda}{4}{3}$. To evade this difficulty we take inspiration from the probabilistic proof of Sperner's theorem (see, e.g.,~\cite{Spe90})\noteryan{earliest citation I could find; note that Lubell's proof is a bit different} and introduce a new non-uniform, non-product distribution on points in $[k]^n$. We call this distribution the \emph{equal-slices} distribution and denote it by $\eqs{k}^n$. We will discuss $\eqs{2}^n$ in Section~\ref{sec:sperner} on Sperner's theorem, and its generalization $\eqs{k}^n$ in Section~\ref{sec:eqs}. For now we simply explain the name: Thinking of $\Omega = \{0,1\}$, one draws a string $x \in \{0,1\}^n$ from $\eqs{2}^n$ as follows: first choose an integer $s \in \{0, 1, \dots, n\}$ uniformly; then choose $x$ uniformly at random from the $s$th ``slice'' of the discrete cube, i.e., the set of strings with exactly $s$ many $1$'s. The more general $\eqs{k}^n$ is defined by first choosing the number of occurrences of each symbol of $[k]$ uniformly among all possibilities, and then drawing a uniformly random string with those symbol counts.<br />
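The two-stage description of $\eqs{2}^n$ gives an explicit formula for the probability of any individual string, which we can sanity-check in Python (an illustrative sketch; the function name is ours):

```python
from math import comb

def eqs2_probability(x):
    """Exact equal-slices probability of a string x over {'0','1'}:
    choose s uniformly from {0, ..., n}, then a uniform string with s ones."""
    n, s = len(x), x.count("1")
    return 1 / (n + 1) * 1 / comb(n, s)
```

Each slice receives total mass $1/(n+1)$, so the probabilities sum to $1$; note how far this is from the uniform distribution, e.g.\ the all-ones string gets probability $1/(n+1)$ rather than $2^{-n}$.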
<br />
As we will see, equal-slices is the natural distribution to use for a probabilistic version of DHJ($k$). There is a small catch, which is that a line template $\lambda \sim \eqs{k}^n$ has a very tiny chance of being degenerate. To compensate for this, we introduce the very closely related distribution $\ens{k}^n$ (``equal-nondegenerate-slices''), which is $\eqs{k}^n$ conditioned on $\lambda$ being a nondegenerate \emph{string} (i.e., having at least one of each symbol in $[k]$). For more details, see Section~\ref{sec:eqs}. We then prove the following two theorems:<br />
\begin{theorem} \label{thm:edhj} (Equal-slices density Hales--Jewett theorem.) For every positive integer $k$ and real $\delta>0$ there exists a positive integer $\edhj{k}{\delta}$ such that every $A \subseteq [k]^n$ with $\ens{k}^n(A) \geq \delta$ contains a combinatorial line, provided $n \geq \edhj{k}{\delta}$.<br />
\end{theorem}<br />
\begin{theorem} \label{thm:pdhj} (Probabilistic density Hales--Jewett theorem.) For every positive integer $k$ and real $\delta>0$ there exists a positive integer $\Pdhj{k}{\delta}$ and a positive real $\pdhj{k}{\delta}$ such that every $A \subseteq [k]^n$ with $\ens{k}^n(A) \geq \delta$ satisfies<br />
\[<br />
\Pr_{\lambda \sim \ens{k+1}^n}[\lambda \subseteq A] \geq \pdhj{k}{\delta},<br />
\]<br />
provided $n \geq \Pdhj{k}{\delta}$. Here we are interpreting $\lambda$ as both a (nondegenerate) line template and as a line.<br />
\end{theorem}<br />
Theorem~\ref{thm:edhj} follows from the density Hales--Jewett theorem by playing around with probability distributions; see Proposition~\ref{prop:edhj-equiv-dhj}. Theorem~\ref{thm:pdhj} follows almost immediately from Theorem~\ref{thm:edhj}, thanks to some nice properties of the equal-nondegenerate-slices distribution; see Proposition~\ref{prop:pdhj-from-edhj}. Finally, the same two deductions also give us multidimensional versions; the ``equal-slices multidimensional density Hales--Jewett theorem'' (with parameter ``$\emdhj{k}{\delta}{d}$''), and the following:<br />
\begin{theorem} \label{thm:pmdhj} (Probabilistic multidimensional density Hales--Jewett theorem.) For every real $\delta > 0$ and every pair of positive integers $k$ and $d$, there exists a positive integer $\Pmdhj{k}{\delta}{d}$ and a positive real $\pmdhj{k}{\delta}{d}$ such that every $A \subseteq [k]^n$ with $\ens{k}^n(A) \geq \delta$ satisfies <br />
\[<br />
\Pr_{\sigma \sim \ens{k+d}^n}[\sigma \subseteq A] \geq \pmdhj{k}{\delta}{d},<br />
\]<br />
provided $n \geq \Pmdhj{k}{\delta}{d}$.<br />
\end{theorem}</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=Intro.texIntro.tex2009-06-25T04:20:10Z<p>67.186.58.92: </p>
<hr />
<div>\section{Introduction}<br />
<br />
\subsection{Basic theorem statements}<br />
The purpose of this paper is to give the first elementary proof of the density Hales--Jewett theorem. This theorem, first proved by Furstenberg and Katznelson~\cite{FK89,FK91}, has the same relation to the Hales--Jewett theorem~\cite{HJ63} as Szemer\'edi's theorem~\cite{Sze75} has to van der Waerden's theorem~\cite{vdW27}. Before we go any further, let us state all four theorems. We shall use the notation $[k]$ to stand for the set $\{1,2,\dotsc,k\}$. If $X$ is a set and $r$ is a positive integer, then an $r$-\textit{colouring} of $X$ will mean a function $\kappa\colon X\rightarrow [r]$. A subset $Y$ of $X$ is called \textit{monochromatic} if $\kappa(y)$ is the same for every $y\in Y$.<br />
<br />
First, let us state van der Waerden's theorem and Szemer\'edi's theorem.<br />
<br />
\begin{named}{van der Waerden's Theorem} \label{thm:vdw}<br />
For every pair of positive integers $k$ and $r$ there exists $N$ such that for every $r$-colouring of $[N]$ there is a monochromatic arithmetic progression of length $k$.<br />
\end{named}<br />
<br />
\begin{named}{Szemer\'edi's Theorem} \label{thm:szem}<br />
For every positive integer $k$ and every $\delta>0$ there exists $N$ such that every subset $A\subseteq[N]$ of size at least $\delta N$ contains an arithmetic progression of length $k$.<br />
\end{named}<br />
<br />
It is usually better to focus on the \textit{density} $\abs{A}/N$ of a subset $A\subseteq [N]$ rather than on its cardinality, since this gives us a parameter that we can think of independently of $N$. Szemer\'edi's theorem is often referred to as the \textit{density version} of van der Waerden's theorem. <br />
<br />
To state the Hales--Jewett theorem, we need a little more terminology. The theorem is concerned with subsets of $[k]^n$, elements of which we refer to as \emph{points} (or \emph{strings}). Instead of looking for arithmetic progressions, the Hales--Jewett theorem looks for \emph{combinatorial lines}. A combinatorial line in $[k]^n$ is a set of $k$ points $\{x^{(1)}, \dots, x^{(k)}\}$ formed as follows: Given a line \emph{template}, which is a string $\lambda \in ([k] \cup \{\wild\})^n$, the associated combinatorial line is formed by setting $x^{(i)}$ to be the point given by changing each ``wildcard symbol'' `$\wild$' in $\lambda$ to symbol `$i$'. For instance, when $k = 3$ and $\lambda = 1 3 \wild \wild 2 2 1 \wild$, the associated combinatorial line is the following set of $3$ points:<br />
\[<br />
\{13\bs{1}\bs{1}221\bs{1}, 13\bs{2}\bs{2}221\bs{2}, 13\bs{3}\bs{3}221\bs{3}\}.<br />
\]<br />
(We exclude \emph{degenerate} combinatorial lines, those that arise from templates with no $\wild$'s. More formal definitions are given in Section~\ref{sec:defs}.)<br />
\ignore{<br />
Let $x=(x_1,\dots,x_n)$ be an element of $[k]^n$, let $W\subset[n]$ and let $j\in[k]$. Let us write $x\oplus jW$ for the sequence $(y_1,\dots,y_n)$ such that $y_i=x_i$ if $i\notin A$ and $y_i=j$ if $i\in A$. That is, we take the sequence $x$ and overwrite the terms with index in $W$ with a $j$. For example, if $k=3$, $n=5$, then <br />
\begin{equation*}<br />
(1,1,3,2,1)+2\{1,4,5\}=(2,1,3,2,2).<br />
\end{equation*}<br />
The \textit{combinatorial line} $x\oplus[k]W$ is the set $\{x\oplus jW:j\in [k]\}$. For instance, \begin{equation*}<br />
(1,1,3,2,1)\oplus[3]\{1,4,5\}=\{(1,1,3,1,1),(2,1,3,2,2),(3,1,3,3,3)\}.<br />
\end{equation*}<br />
It may seem strange that the combinatorial line $x\oplus [k]W$ does not depend on the values of $x_i$ with $i\in A$, but this turns out to be a useful convention and makes it possible to state results in a concise way.}<br />
<br />
We are now ready to state the Hales--Jewett theorem.<br />
\begin{named}{Hales--Jewett theorem}\label{thm:hj}<br />
For every pair of positive integers $k$ and $r$ there exists a positive integer $\hj{k}{r}$ such that for every $r$-colouring of the set $[k]^n$ there is a monochromatic combinatorial line, provided $n \geq \hj{k}{r}$.<br />
\end{named}<br />
As with van der Waerden's theorem, we may consider the density version of the Hales--Jewett theorem, where the density of $A \subseteq [k]^n$ is $\abs{A}/k^n$. The following theorem was first proved by Furstenberg and Katznelson.<br />
\begin{named}{Density Hales--Jewett theorem} \label{thm:dhj}<br />
For every positive integer $k$ and real $\delta>0$ there exists a positive integer $\dhj{k}{\delta}$ such that every subset of $[k]^n$ of density at least $\delta$ contains a combinatorial line, provided $n \geq \dhj{k}{\delta}$.<br />
\end{named}<br />
<br />
We sometimes write ``DHJ($k$)'' to mean the $k$ case of this theorem.\noteryan{A bit confusing, since $\dhj{k}{\delta}$ is a number in the above theorem. Hmm.} The first nontrivial case, DHJ($2$), is a weak version of Sperner's theorem~\cite{Spe28}; we discuss this further in Section~\ref{sec:sperner}. We also remark that the Hales--Jewett theorem immediately implies van der Waerden's theorem, and likewise for the density versions. To see this, temporarily interpret $[m]$ as $\{0, 1, \dotsc, m-1\}$ rather than $\{1, 2, \dotsc, m\}$, and identify integers in $[N]$ with their base-$k$ representations in $[k]^n$.\noteryan{(we may assume $N$ is a power of $k$)} It's then easy to see that a combinatorial line in $[k]^n$ corresponds to a length-$k$ arithmetic progression in $[N]$; specifically, if the line's template is $\lambda$, with $S = \{i : \lambda_i = \wild\}$, then the progression's common difference is $\sum_{i \in S} k^{n-i}$.<br />
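The line-to-progression correspondence can be illustrated concretely (a Python sketch of our own, under the $\{0,\dotsc,k-1\}$ convention; the wildcard is written `*`):

```python
def line_to_integers(template, k):
    """Map a combinatorial line template over {0, ..., k-1} (wildcard '*')
    to the k integers whose base-k digit strings are the line's points."""
    values = []
    for a in range(k):
        digits = [a if c == "*" else int(c) for c in template]
        value = 0
        for d in digits:
            value = value * k + d  # read digits most-significant first
        values.append(value)
    return values
```

For instance, with $k = 3$ and template `1*0*` (so $S = \{2, 4\}$, $n = 4$), the three integers form an arithmetic progression with common difference $3^{2} + 3^{0} = 10$.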
<br />
In this paper, we give a new, elementary proof of the density Hales--Jewett theorem, achieving quantitative bounds:<br />
\begin{theorem} \label{thm:our-dhj} In the density Hales--Jewett theorem, one may take $\dhj{3}{\delta} = 2 \upuparrows O(1/\delta^3)$. For $k \geq 4$, the bound $\dhj{k}{\delta}$ we achieve is of Ackermann type.\noteryan{cop-out}<br />
\end{theorem}<br />
\noindent Here we use the notation $x \uparrow y$ for $x^y$, $x \uparrow^{(\ell)} y$ for $x \uparrow x \uparrow \dotsm \uparrow x \uparrow y$ (with $\ell$ many $\uparrow$'s), and $x \upuparrows y$ for $x \uparrow^{(y)} x$.<br />
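The arrow notation can be transcribed directly (an illustrative Python sketch; these integers grow so fast that only tiny arguments are feasible):

```python
def up(x, y, arrows=1):
    """x up-arrow^(arrows) y: with one arrow this is x**y, and
    x up^(l) y = x ** (x up^(l-1) y), i.e. the tower associates to the right."""
    if arrows == 1:
        return x ** y
    return x ** up(x, y, arrows - 1)

def upup(x, y):
    """x double-up-arrow y = x up^(y) x."""
    return up(x, x, arrows=y)
```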
<br />
We add that it is not too hard to obtain the following extensions to this result: (i)~a \emph{multidimensional} version, in which one finds higher-dimensional combinatorial subspaces;\ignore{We remark here that it is also possible to deduce the multidimensional Szemer\'edi theorem from the density Hales--Jewett theorem. Previously there were two known approaches: the ergodic approach and approaches that use hypergraph regularity.<br />
} (ii)~a \emph{probabilistic} version, in which one shows that a randomly chosen combinatorial line (from a suitable distribution) is in the subset with positive probability depending only on $k$ and $\delta$;\noteryan{name-check Varnavides?} (iii)~the combined probabilistic multidimensional version. Indeed, to prove Theorem~\ref{thm:our-dhj} we found it necessary to obtain these extensions in passing. See Section~\ref{sec:concepts} for more detailed statements.<br />
<br />
<br />
\subsection{Some discussion}<br />
Why is it interesting to give a new proof of this result? There are two main reasons. The first is connected with the history of results in this area. One of the main benefits of Furstenberg's proof of Szemer\'edi's theorem was that it introduced a technique---ergodic methods---that could be developed in many directions, which did not seem to be the case with Szemer\'edi's proof. As a result, many far-reaching generalizations of Szemer\'edi's theorem were proved, and for a long time nobody could prove them in any other way than by using Furstenberg's methods. In the last few years that has changed, and a programme has developed to find new and finitary proofs of the results that were previously known only by infinitary ergodic methods. \noteryan{Citations would be good here; which are the far-reaching generalisations? Also, Tim's paper on Szemer\'edi is maybe 9 years old now\dots does that count as ``few years''?} Giving a non-ergodic proof of the density Hales--Jewett theorem was seen as a key goal for this programme: on the ergodic side it appeared to be significantly harder than Szemer\'edi's theorem, and this seemed to be reflected by certain significant combinatorial difficulties that we shall discuss later in this paper.\noteryan{Which difficulties does this refer to?} Having given a purely combinatorial proof, we are able to obtain explicit bounds for how large $n$ needs to be as a function of $\delta$ and $k$ in the density Hales--Jewett theorem (although admittedly the bounds for $k \geq 4$ are terrible and even the $k = 3$ bound is not especially good). Such bounds could not be obtained via the ergodic methods even in principle, since these proofs rely on the Axiom of Choice\noteryan{I just heard someone say this; is it true?}.\noteryan{maybe expand slightly on this, and/or cite the sister paper}<br />
<br />
A second reason is that the density Hales--Jewett theorem immediately implies Szemer\'edi's theorem, and finding a new proof of Szemer\'edi's theorem seems always to be illuminating---or at least this has been the case for the four main approaches discovered so far.\noteryan{Citations here.} In retrospect, one can add to this the fact that the proof we have discovered is, against all our expectations\noteryan{a strong phrase\dots}, \textit{simpler} than the previous approaches to Szemer\'edi's theorem; the most advanced notion needed is that of the total variation distance between discrete probability distributions. It seems that by looking at a more general problem we have removed some of the difficulty. Related to this is another surprise. We started out by trying to prove the first difficult case of the theorem, which is when $k=3$. The experience of all four of the earlier proofs of Szemer\'edi's theorem has been that interesting ideas are needed to prove results about progressions of length $3$, but significant extra difficulties arise when one tries to generalize an argument from the length-$3$ case to the general case. However, it turned out that once we had proved the case $k=3$ of the density Hales--Jewett theorem, it was straightforward to generalize the argument to the general case. We still do not have a convincing explanation of why our proof should differ from all the others in this respect.\noteryan{Perhaps we should think of one :)}<br />
<br />
\ignore{<br />
Here, briefly, is how to deduce Szemer\'edi's theorem from the density Hales--Jewett theorem. Given $k$ and $\delta$, let $n$ be such that the density Hales--Jewett theorem holds for $k$ and $\delta$ and let $N=k^n$. There is no harm in thinking of $[N]$ as the set $\{0,1,\dots,N-1\}$. Now given $A \subset [N]$ of density $\delta$, we can write the integers in $A$ in base $k$, thus thinking of it as a subset of density $\delta$ in~$[k]^n$ (where again, there is no harm in thinking of $[k]$ as $\{0, 1, \dots, k-1\}$). The density Hales--Jewett theorem gives a nondegenerate combinatorial line in $A$, and it is easy to see that this corresponds to a length-$k$ arithmetic progression: if the line's template is $\lambda$, with $S = \{i : \lambda_i = \star\}$, then the progression's common difference is $\sum_{i \in S} k^{n-i}$. \ignore{Thinking of $[k]$ as the set $\{0,1,\dots,k-1\}$ and $[N]$ as the set $\{0,1,\dots,N-1\}$, let $\phi:[k]^n\rightarrow[N]$ be the map that takes the sequence $(x_1,\dots,x_n)$ to the number $x_1+x_2k+x_3k^2+\dots+x_nk^{n-1}$, and note that $\phi$ is a bijection.\noteryan{Could we say here the catchier phrase, ``write the integers in $A$ in base $k$''?} Now let $A\subset[N]$ be a set of density $\delta$. Then $\phi^{-1}(A)$ is a subset of $[k]^n$ of density $\delta$, so by the density Hales--Jewett theorem it contains a combinatorial line. It is easy to see that the image of a combinatorial line under $\phi$ is an arithmetic progression of length $k$: indeed, if the combinatorial line is $x\oplus[k]W$,\noteryan{should convert to new notation} then the first term of the arithmetic progression is $\phi(x\oplus 0W)$ and the common difference is $\sum_{i\in W}k^i$. A very similar argument shows that the Hales--Jewett theorem implies van der Waerden's theorem.}}<br />
<br />
Before we start working towards the proof of the theorem, we would like briefly to mention that it was proved in a rather unusual ``open source'' way, which is why it is being published under a pseudonym. The work was done by several mathematicians, who wrote their thoughts, as they had them, in the form of blog comments.\noteryan{I would like to at least consider adding a link or statement that it was on Tim's blog. Otherwise, the only identifying part of the paper is the wiki's URL, which might make it seem like Michael Nielsen was the mastermind. Okay with you, Tim?} Anybody who wanted to could participate, and at all stages of the process the comments were fully open to anybody who was interested. This was in complete contrast to the usual way that results are proved in private and presented in a finished form. The blog comments are still available, so although this paper is a somewhat polished account of the argument, it is possible to read a record of the entire thought process that led to the proof. The participants also created a wiki, which contains sketches of the argument, links to the blog comments, and a great deal of related material. The wiki's URL is \url{http://michaelnielsen.org/polymath1/}.\\<br />
<br />
\notetim{Somewhere we should mention Tim Austin's work and how it relates to ours. Perhaps Terry is best placed to do this.}<br />
<br />
<br />
\subsection{Outline of the paper}<br />
XXX ``table of contents'' paragraph to be filled in XXX</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=Abstract.texAbstract.tex2009-06-25T04:19:51Z<p>67.186.58.92: </p>
<hr />
<div>\begin{abstract}<br />
The Hales--Jewett theorem asserts that for every $r$ and every $k$ there exists $n$ such that every $r$-colouring of the $n$-dimensional grid $\{1, \dotsc, k\}^n$ contains a combinatorial line. This result is a generalization of van der Waerden's theorem, and it is one of the fundamental results of Ramsey theory. The theorem of van der Waerden has a famous density version, conjectured by Erd\H os and Tur\'an in 1936, proved by Szemer\'edi in 1975 and given a different proof by Furstenberg in 1977. The Hales--Jewett theorem has a density version as well, proved by Furstenberg and Katznelson in 1991 by means of a significant extension of the ergodic techniques that had been pioneered by Furstenberg in his proof of Szemer\'edi's theorem. In this paper, we give the first elementary proof of the theorem of Furstenberg and Katznelson, and the first to provide a quantitative bound on how large $n$ needs to be. In particular, we show that a subset of $[3]^n$ of density $\delta$ contains a combinatorial line if $n \geq 2 \upuparrows O(1/\delta^3)$. Our proof is surprisingly\noteryan{``reasonably'', maybe} simple: indeed, it gives what is probably the simplest known proof of Szemer\'edi's theorem. <br />
\end{abstract}</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=Dhj.styDhj.sty2009-06-25T04:19:27Z<p>67.186.58.92: </p>
<hr />
<div>\usepackage{amsmath,amssymb,amsthm,amsfonts,latexsym,hyperref,xspace}<br />
\usepackage[pdftex]{graphicx}<br />
<br />
% layout<br />
\setlength{\textwidth}{6.5 in}<br />
\setlength{\textheight}{9in}<br />
\setlength{\oddsidemargin}{0in}<br />
\setlength{\topmargin}{0in}<br />
\addtolength{\voffset}{-.5in}<br />
<br />
% environments <br />
\newtheorem{theorem}{Theorem}[section]<br />
\newtheorem*{namedtheorem}{\theoremname}<br />
\newcommand{\theoremname}{testing}<br />
\newenvironment{named}[1]{ \renewcommand{\theoremname}{#1} \begin{namedtheorem}} {\end{namedtheorem}}<br />
\newtheorem{lemma}[theorem]{Lemma}<br />
\newtheorem{claim}[theorem]{Claim}<br />
\newtheorem{proposition}[theorem]{Proposition}<br />
\newtheorem{fact}[theorem]{Fact}<br />
\newtheorem{corollary}[theorem]{Corollary}<br />
\theoremstyle{definition}<br />
\newtheorem{definition}[theorem]{Definition}<br />
\theoremstyle{definition}<br />
\newtheorem{remark}[theorem]{Remark}<br />
\theoremstyle{definition}<br />
\newtheorem{notation}[theorem]{Notation}<br />
\theoremstyle{plain}<br />
<br />
<br />
%%%%%%%%%%%% Generic math macros %%%%%%%%%%%%<br />
<br />
% probability and other operators<br />
\renewcommand{\Pr}{\mathop{\bf Pr\/}} <br />
\newcommand{\Ex}{\mathop{\bf E\/}} <br />
\newcommand{\Varx}{\mathop{\bf Var\/}} <br />
\newcommand{\Covx}{\mathop{\bf Cov\/}} <br />
\newcommand{\littlesum}{\mathop{{\textstyle \sum}}}<br />
<br />
% correct typography<br />
\newcommand{\abs}[1]{\left\lvert #1 \right\rvert} <br />
<br />
% mathrm terms<br />
\newcommand{\poly}{\mathrm{poly}}<br />
<br />
% number systems<br />
\newcommand{\R}{\mathbb R}<br />
\newcommand{\N}{\mathbb N}<br />
\newcommand{\Z}{\mathbb Z}<br />
<br />
% short forms <br />
\newcommand{\eps}{\epsilon}<br />
\newcommand{\lam}{\lambda}<br />
\newcommand{\la}{\langle}<br />
\newcommand{\ra}{\rangle}<br />
\newcommand{\wt}[1]{\widetilde{#1}}<br />
\newcommand{\wh}[1]{\widehat{#1}}<br />
\newcommand{\ot}{\otimes}<br />
\newcommand{\half}{{\textstyle \frac12}}<br />
<br />
% calligraphic letters<br />
\newcommand{\calA}{{\mathcal{A}}}<br />
\newcommand{\calB}{{\mathcal{B}}}<br />
\newcommand{\calC}{{\mathcal{C}}} <br />
\newcommand{\calE}{\mathcal{E}}<br />
\newcommand{\calS}{\mathcal{S}}<br />
\newcommand{\calU}{\mathcal{U}}<br />
<br />
% bold<br />
\newcommand{\bs}[1]{\boldsymbol{#1}}<br />
<br />
% barred letters<br />
\newcommand{\barJ}{{\overline{J}}}<br />
\newcommand{\barT}{{\overline{T}}}<br />
<br />
<br />
%%%%%%%%%%%% document-writing macros %%%%%%%%%%%% <br />
<br />
\newcommand{\ignore}[1]{*\marginpar{\tiny \em ignored part}}<br />
%\newcommand{\ignore}[1]{}<br />
<br />
%\newcommand{\noteryan}[1]{*\marginpar{\tiny *Ryan: \bf #1}}<br />
\newcommand{\noteryan}[1]{{\tiny [[#1]]}}<br />
%\newcommand{\noteryan}[1]{}<br />
\newcommand{\notetim}[1]{*\marginpar{\tiny *Tim: \bf #1}}<br />
%\newcommand{\notetim}[1]{}<br />
<br />
<br />
%%%%%%%%%%%% DHJ-centric macros %%%%%%%%%%%% <br />
<br />
\newcommand{\unif}{\mu} % uniform distribution<br />
\newcommand{\spac}{\nu} % the uniform spacing distribution<br />
\newcommand{\eqs}[1]{\overline{\nu}_{#1}} % equal-slices<br />
\newcommand{\ens}[1]{\widetilde{\nu}_{#1}} % equal-nondegenerate-slices<br />
\newcommand{\prd}{\pi} % a generic product distribution<br />
\newcommand{\distra}{\gamma} % a generic distribution<br />
\newcommand{\distrb}{\xi} % another generic distribution<br />
<br />
\newcommand{\slice}{s}<br />
<br />
\newcommand{\dtv}[2]{d_{\mathrm{TV}}(#1,#2)} % total variation distance<br />
\newcommand{\dchi}[2]{d_{\chi^2}(#1,#2)} % chi^2 distance<br />
<br />
\newcommand{\chg}[3]{#1^{#2\rightarrow#3}} % a string with some symbols replaced<br />
<br />
\newcommand{\wild}{\mathord{\star}} % wildcard symbol<br />
<br />
\newcommand{\ppn}{p} % proportion of coordinates free in the measure-embedding argument<br />
<br />
\newcommand{\hj}[2]{\mathrm{HJ}_{#1}(#2)}<br />
\newcommand{\dhj}[2]{\mathrm{DHJ}_{#1}(#2)}<br />
\newcommand{\mdhj}[3]{\mathrm{MDHJ}_{#1}(#2,#3)}<br />
\newcommand{\edhj}[2]{\mathrm{EDHJ}_{#1}(#2)}<br />
\newcommand{\Pdhj}[2]{\mathrm{PDHJ}_{#1}(#2)}<br />
\newcommand{\pdhj}[2]{\epsilon_{#1}(#2)}<br />
\newcommand{\emdhj}[3]{\mathrm{EMDHJ}_{#1}(#2,#3)}<br />
\newcommand{\Pmdhj}[3]{\mathrm{PMDHJ}_{#1}(#2,#3)}<br />
\newcommand{\pmdhj}[3]{\epsilon_{#1}(#2,#3)}</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=Dhj.texDhj.tex2009-06-25T04:19:04Z<p>67.186.58.92: </p>
<hr />
<div>\documentclass[11pt]{article}<br />
\usepackage{dhj}<br />
<br />
<br />
\begin{document}<br />
<br />
\title{A new proof of the density Hales-Jewett theorem}<br />
\author{D. H. J. Polymath\thanks{\url{http://michaelnielsen.org/polymath1/}}}<br />
\maketitle<br />
<br />
\input{abstract}<br />
\input{intro}<br />
\input{concepts}<br />
\input{sperner}<br />
\input{outline}<br />
\input{equal-slices}<br />
\input{measures}<br />
\input{easy}<br />
\input{correlation}<br />
\input{partitioning}<br />
\input{theproof}<br />
\input{outro}<br />
<br />
\bibliographystyle{alpha}<br />
\bibliography{dhj}<br />
<br />
\end{document}</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=TeX_files_for_first_paperTeX files for first paper2009-06-25T04:17:58Z<p>67.186.58.92: </p>
<hr />
<div>The token is available.<br />
<br />
[[dhj.tex]]<br />
<br />
[[dhj.sty]]<br />
<br />
[[abstract.tex]]<br />
<br />
[[intro.tex]]<br />
<br />
[[concepts.tex]]<br />
<br />
[[sperner.tex]]<br />
<br />
[[outline.tex]]<br />
<br />
[[equal-slices.tex]]<br />
<br />
[[measures.tex]]<br />
<br />
[[easy.tex]]<br />
<br />
[[correlation.tex]]<br />
<br />
[[partitioning.tex]]<br />
<br />
[[theproof.tex]]<br />
<br />
[[outro.tex]]<br />
<br />
[[dhj.bib]]<br />
<br />
[http://www.cs.cmu.edu/~odonnell/ens-figure.png .png file for the figure illustrating the equal-nondegenerate-slices draw]<br />
<br />
[http://www.cs.cmu.edu/~odonnell/ens-figure.ppt .ppt original for the figure illustrating the equal-nondegenerate-slices draw]<br />
<br />
----<br />
<br />
[http://www.cs.cmu.edu/~odonnell/dhj.pdf Draft as of June 25, 0h30m EDT]</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=Correlation.texCorrelation.tex2009-05-15T02:40:48Z<p>67.186.58.92: New page: \section{Line-free sets correlate with insensitive set intersections}</p>
<hr />
<div>\section{Line-free sets correlate with insensitive set intersections}</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=TeX_files_for_first_paperTeX files for first paper2009-05-15T02:40:28Z<p>67.186.58.92: </p>
<hr />
<div>[[dhj.tex]]<br />
<br />
[[dhj.sty]]<br />
<br />
[[abstract.tex]]<br />
<br />
[[intro.tex]]<br />
<br />
[[measures.tex]]<br />
<br />
[[insensitive.tex]]<br />
<br />
[[correlation.tex]]<br />
<br />
[[partitioning.tex]]<br />
<br />
[[biblio.tex]]<br />
<br />
[[dhj.bib]]</div>67.186.58.92http://michaelnielsen.org/polymath1/index.php?title=Sperner%27s_theoremSperner's theorem2009-03-12T17:08:51Z<p>67.186.58.92: </p>
<hr />
<div>==Statement of the theorem==<br />
<br />
Sperner's theorem as originally stated is a result about set systems. Suppose that you want to find the largest collection <math>\mathcal{A}</math> of subsets of <math>[n]</math> such that no set in <math>\mathcal{A}</math> is a proper subset of any other. Then the best you can do is to choose all the sets of some fixed size---and of course the best size to pick is <math>\lfloor n/2\rfloor</math>, since the binomial coefficient <math>\binom nm</math> is maximized when <math>m=\lfloor n/2\rfloor.</math><br />
<br />
Sperner's theorem is closely related to the density Hales-Jewett theorem: in fact, it is nothing other than DHJ(2) with the best possible bound. To see this, we associate each set <math>A\subset[n]</math> with its characteristic function (that is, the sequence that is 0 outside A and 1 in A). If we have a pair of sets <math>A\subset B,</math> then the two sequences form a [[combinatorial line]] in <math>[2]^n.</math> For example, if n=6 and A and B are the sets <math>\{2,3\}</math> and <math>\{2,3,4,6\}</math>, then we get the combinatorial line that consists of the two points 011000 and 011101, which we can denote by 011*0* (so the wildcard set is <math>\{4,6\}</math>).<br />
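<br />
The correspondence between set pairs and combinatorial lines can be sketched concretely (an illustrative snippet, not from the article; the helper names are ours):<br />
<br />
```python
# Illustrative sketch (not from the article): encode subsets of [n] as 0/1
# strings and form the combinatorial line through a pair A ⊂ B, whose wildcard
# set is B \ A.  Reproduces the example n = 6, A = {2,3}, B = {2,3,4,6}.

def characteristic(S, n):
    """0/1 string that is 1 exactly on the (1-indexed) coordinates in S."""
    return ''.join('1' if i in S else '0' for i in range(1, n + 1))

def line_template(A, B, n):
    """Template of the combinatorial line through A ⊂ B: '*' on B \\ A."""
    assert A < B, "A must be a proper subset of B"
    return ''.join('*' if i in B - A else ('1' if i in A else '0')
                   for i in range(1, n + 1))

n = 6
A, B = {2, 3}, {2, 3, 4, 6}
print(characteristic(A, n))    # 011000
print(characteristic(B, n))    # 011101
print(line_template(A, B, n))  # 011*0*
```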
<br />
==Proof of the theorem==<br />
<br />
There are several proofs, but perhaps the most enlightening is a very simple averaging argument that proves a stronger result. Let <math>\mathcal{A}</math> be a collection of subsets of [n]. For each k, let <math>\delta_k</math> denote the density of <math>\mathcal{A}</math> in the kth [[slice|layer]] of the cube: that is, it is the number of sets in <math>\mathcal{A}</math> of size k, divided by <math>\binom nk.</math> The [[equal-slices measure]] of <math>\mathcal{A}</math> is defined to be <math>\delta_0+\dots+\delta_n.</math> <br />
<br />
Now the equal-slices measure of <math>\mathcal{A}</math> is easily seen to be equal to the following quantity. Let <math>\pi</math> be a random permutation of [n], let <math>U_0,U_1,U_2,\dots,U_n</math> be the sets <math>\emptyset, \{\pi(1)\},\{\pi(1),\pi(2)\},\dots,[n],</math> and let <math>\mu(\mathcal{A})</math> be the expected number of the sets <math>U_i</math> that belong to <math>\mathcal{A}.</math> These two quantities are equal by linearity of expectation, together with the fact that the probability that <math>U_k</math> belongs to <math>\mathcal{A}</math> is <math>\delta_k.</math><br />
<br />
Therefore, if the equal-slices measure of <math>\mathcal{A}</math> is greater than 1, then the expected number of sets <math>U_k</math> in <math>\mathcal{A}</math> is greater than 1, so there must exist a permutation for which it is at least 2, and that gives us a pair of sets with one contained in the other.<br />
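<br />
This identity, and the averaging conclusion, can be checked by brute force for a small family (an illustrative computation, not from the article; the example family is ours):<br />
<br />
```python
# Illustrative numerical check (not from the article): for n = 4 and a small
# family, the equal-slices measure sum_k delta_k equals the expected number of
# chain sets U_0 ⊂ U_1 ⊂ ... ⊂ U_n that land in the family, averaged over all
# n! permutations; and since it exceeds 1, some chain meets the family twice.

from itertools import combinations, permutations
from math import comb, factorial

n = 4
# Example family: all 2-element subsets of [4], plus the 3-element set {1,2,3}.
family = {frozenset(c) for c in combinations(range(1, n + 1), 2)}
family.add(frozenset({1, 2, 3}))

# Equal-slices measure: sum over k of (# sets of size k) / C(n, k).
equal_slices = sum(sum(1 for S in family if len(S) == k) / comb(n, k)
                   for k in range(n + 1))

# Expectation over all permutations of the number of chain sets in the family.
total = best = 0
for pi in permutations(range(1, n + 1)):
    hits = sum(1 for k in range(n + 1) if frozenset(pi[:k]) in family)
    total += hits
    best = max(best, hits)
expected_hits = total / factorial(n)

print(equal_slices, expected_hits, best)  # 1.25 1.25 2
```
Since the measure is 1.25 > 1, some permutation's chain contains two comparable sets of the family, exactly as the argument above predicts.<br />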
<br />
To see that this implies Sperner's theorem, one just has to make the simple observation that a family with equal-slices measure at most 1 must have cardinality at most <math>\binom n{\lfloor n/2\rfloor}.</math> (If n is odd, so that there are two middle layers, then it is not quite so obvious that to have an extremal family you must pick one layer or the other, but this is the case.) This stronger version of the statement is called the <em>LYM inequality</em>.<br />
<br />
==Multidimensional version==<br />
<br />
The following proof is a variant of the [http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6WHS-45GMWMS-9&_user=10&_rdoc=1&_fmt=&_orig=search&_sort=d&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=d2d74495dece42adf2ad10297bcb49ee Gunderson-Rodl-Sidorenko] result. Its parameters are a little worse, but the proof is a little simpler.<br />
<br />
'''Proposition 1:''' Let <math>A \subseteq \{0,1\}^n</math> have density <math>\delta</math>. Let <math>Y_1, \dots, Y_d</math> be a partition of <math>[n]</math> with <math>|Y_i| \geq r</math> for each <math>i</math>. If<br />
<br />
<math>\delta^{2^d} - \frac{d}{\sqrt{\pi r}} > 0, </math> (1)<br />
<br />
then <math>A</math> contains a nondegenerate combinatorial subspace of dimension <math>d</math>, with its <math>i</math>th wildcard set a subset of <math>Y_i</math>.<br />
<br />
'''Proof:''' Let <math>C_i</math> denote a random chain from <math>0^{|Y_i|}</math> up to <math>1^{|Y_i|}</math>, thought of as residing in the coordinates <math>Y_i</math>, with the <math>d</math> chains chosen independently. Also, let <math>s_i, t_i</math> denote independent Binomial<math>(|Y_i|, 1/2)</math> random variables, <math>i \in [d]</math>. Note that <math>C_i(s_i)</math> and <math>C_i(t_i)</math> are (dependent) uniform random strings in <math>\{0,1\}^{Y_i}</math>. We write, say,<br />
<br />
<math>(C_1(s_1), C_2(t_2), C_3(t_3), \dots, C_d(s_d)) </math> (2)<br />
<br />
for the string in <math>\{0,1\}^n</math> formed by putting <math>C_1(s_1)</math> into the <math>Y_1</math> coordinates, <math>C_2(t_2)</math> into the <math>Y_2</math> coordinates, etc. Note that each string of this form is also uniformly random, since the chains are independent.<br />
<br />
If all <math>2^d</math> strings of the form in (2) are simultaneously in <math>A</math> then we have a <math>d</math>-dimensional subspace inside <math>A</math> with wildcard sets that are ''subsets'' of <math>Y_1, \dots, Y_d</math>. All <math>d</math> dimensions are nondegenerate iff <math>s_i \neq t_i</math> for all <math>i</math>. Since <math>s_i</math> and <math>t_i</math> are independent Binomial<math>(|Y_i|, 1/2)</math>'s with <math>|Y_i| \geq r</math>, we have<br />
<br />
<math>\Pr[s_i = t_i] \leq \frac{1}{\sqrt{\pi r}}.</math><br />
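<br />
The collision bound can be verified exactly (an illustrative check, not from the article): the collision probability of two independent Binomial<math>(r,1/2)</math> variables is <math>\binom{2r}{r}/4^r</math>, which is at most <math>1/\sqrt{\pi r}</math> by the standard central binomial coefficient bound.<br />
<br />
```python
# Illustrative check (not from the article): for two independent Binomial(r,1/2)
# variables, Pr[s = t] = sum_k (C(r,k)/2^r)^2 = C(2r,r)/4^r, and this is at
# most 1/sqrt(pi*r) by the standard central binomial coefficient bound.

from math import comb, pi, sqrt

for r in [1, 2, 5, 10, 50, 200]:
    p_collide = sum(comb(r, k) ** 2 for k in range(r + 1)) / 4 ** r
    assert abs(p_collide - comb(2 * r, r) / 4 ** r) < 1e-12  # Vandermonde
    assert p_collide <= 1 / sqrt(pi * r)
    print(r, p_collide, 1 / sqrt(pi * r))
```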
<br />
Thus to complete the proof, it suffices to show that with probability at least <math>\delta^{2^d}</math>, all <math>2^d</math> strings of the form in (2) are in <math>A</math>. <br />
<br />
This is easy: writing <math>f</math> for the indicator of <math>A</math>, the probability is<br />
<br />
<math>\mathbf{E}_{C_1, \dots, C_d} \left[\mathbf{E}_{s_1, \dots, t_d}[f(C_1(s_1), \dots, C_d(s_d)) \cdots f(C_1(t_1), \dots, C_d(t_d))]\right].</math><br />
<br />
For fixed <math>C_1, \dots, C_d</math>, write <math>g(x_1, \dots, x_d) = f(C_1(x_1), \dots, C_d(x_d))</math>, so that the inner expectation is <math>\mathbf{E}[\prod_{\epsilon \in \{0,1\}^d} g(x_1^{\epsilon_1}, \dots, x_d^{\epsilon_d})]</math>, where <math>x_i^0 = s_i</math> and <math>x_i^1 = t_i</math>. One cannot simply factor this into a product of expectations, since distinct factors share variables. Instead, condition on everything except <math>s_d</math> and <math>t_d</math>: the factors with <math>\epsilon_d = 0</math> depend only on <math>s_d</math>, those with <math>\epsilon_d = 1</math> only on <math>t_d</math>, and since <math>s_d, t_d</math> are independent with the same distribution, the conditional expectation is the square of a conditional mean. Repeating this for each coordinate in turn, and applying Jensen's inequality at each step, shows that the above is at least<br />
<br />
<math>\mathbf{E}_{C_1, \dots, C_d} \left[\mathbf{E}_{s_1, \dots, s_d}[f(C_1(s_1), \dots, C_d(s_d))]^{2^d}\right].</math><br />
<br />
By Jensen (or repeated Cauchy-Schwarz), this is at least<br />
<br />
<math>\left(\mathbf{E}_{C_1, \dots, C_d} \mathbf{E}_{s_1, \dots, s_d}[f(C_1(s_1), \dots, C_d(s_d))]\right)^{2^d}.</math><br />
<br />
But this is just <math>\delta^{2^d}</math>, since <math>(C_1(s_1), \dots, C_d(s_d))</math> is uniformly distributed. []<br />
<br />
<br />
As an aside:<br />
'''Corollary 2:''' If <math>A \subseteq \{0,1\}^n</math> has density <math>\Omega(1)</math>, then <math>A</math> contains a nondegenerate combinatorial subspace of dimension at least <math>\log_2 \log n - O(1)</math>.<br />
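<br />
As a quick sanity check on the parameters (an illustrative computation, not from the article; the partition into equal parts is our choice), one can solve condition (1) numerically and watch the feasible dimension grow like <math>\log_2 \log n</math>:<br />
<br />
```python
# Illustrative computation (not from the article): partition [n] into d parts of
# size r = n // d; condition (1) asks delta**(2**d) > d / sqrt(pi * r).  The
# largest feasible d grows roughly like log2(log2(n)), matching Corollary 2.

from math import log2, pi, sqrt

def max_dimension(n, delta):
    """Largest d for which condition (1) holds with equal parts r = n // d."""
    d = 0
    while True:
        r = n // (d + 1)
        if r < 1 or delta ** (2 ** (d + 1)) <= (d + 1) / sqrt(pi * r):
            return d
        d += 1

for n in (10**4, 10**6, 10**9):
    print(n, max_dimension(n, 0.5), round(log2(log2(n)), 2))
```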
<br />
<br />
If we are willing to sacrifice significantly more probability, we can find a <math>d</math>-dimensional subspace randomly. <br />
<br />
'''Corollary 3:''' In the setting of Proposition 1, assume <math>\delta < 2/3</math> and <br />
<br />
<math> r \geq \exp(4 \ln(1/\delta) 2^d). </math> (3)<br />
<br />
Suppose we choose a random nondegenerate <math>d</math>-dimensional subspace of <math>[n]</math> with wildcard sets <math>Z_i \subseteq Y_i</math>. By this we mean choosing, independently for each <math>i</math>, a random nondegenerate combinatorial line within <math>\{0,1\}^{Y_i}</math>, uniformly from the <math>3^r - 2^r</math> possibilities (taking <math>|Y_i| = r</math> for concreteness). Then this subspace is entirely contained within <math>A</math> with probability at least <math>3^{-dr}</math>.<br />
<br />
<br />
This follows immediately from Proposition 1: having <math>r</math> as in (3) achieves (1), hence the desired nondegenerate combinatorial subspace exists, and we pick it with probability <math>1/(3^r - 2^r)^d \geq 3^{-dr}</math>.<br />
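<br />
The line count used here can be confirmed by enumeration (an illustrative check, not from the article): a line template in <math>\{0,1\}^r</math> is a string over <math>\{0,1,*\}</math>, and it is nondegenerate precisely when it contains at least one wildcard, giving <math>3^r - 2^r</math> lines.<br />
<br />
```python
# Illustrative count (not from the article): a combinatorial line template in
# {0,1}^r is a string over {0,1,*}; it is nondegenerate iff it contains at
# least one wildcard.  Brute force confirms the count 3**r - 2**r.

from itertools import product

def count_nondegenerate_lines(r):
    return sum('*' in template for template in product('01*', repeat=r))

for r in [1, 2, 3, 6]:
    assert count_nondegenerate_lines(r) == 3 ** r - 2 ** r
    print(r, count_nondegenerate_lines(r))
```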
<br />
<br />
We can further conclude:<br />
'''Corollary 4''': Let <math>A \subseteq \{0,1\}^n</math> have density <math>\delta < 2/3</math> and let <math>Y_1, \dots, Y_d</math> be disjoint subsets of <math>[n]</math> with each <math>|Y_i| \geq r</math>, <br />
<br />
<math> r \geq \exp(4 \ln(1/\delta) 2^d). </math><br />
<br />
Choose a nondegenerate combinatorial subspace at random by picking uniformly nondegenerate combinatorial lines in each of <math>Y_1, \dots, Y_d</math>, and filling in the remaining coordinates outside of the <math>Y_i</math>'s uniformly at random. Then with probability at least <math>\exp(-r^{O(1)})</math>, this combinatorial subspace is entirely contained within <math>A</math>.<br />
<br />
<br />
This follows because for a random choice of the coordinates outside the <math>Y_i</math>'s, there is a <math>\delta/2</math> chance that <math>A</math> has density at least <math>\delta/2</math> over the <math>Y</math> coordinates. We then apply the previous corollary, noting that <math>\exp(-r^{O(1)}) \ll (\delta/2)3^{-dr}</math>, even with <math>\delta</math> replaced by <math>\delta/2</math> in the lower bound demanded of <math>r</math>. <br />
<br />
<br />
<br />
===Strong version===<br />
<br />
An alternative argument deduces the multidimensional Sperner theorem from the density Hales-Jewett theorem. We can think of <math>[2]^n</math> as <math>[2^k]^{n/k}.</math> If we do so, apply DHJ(<math>2^k</math>), and translate back to <math>[2]^n,</math> then we find that we have produced a k-dimensional combinatorial subspace. This is obviously a much more sophisticated proof, since DHJ(<math>2^k</math>) is a very hard result, but it gives more information, since the wildcard sets turn out to have the same size. A sign that this strong version is genuinely strong is that it implies Szemer&eacute;di's theorem. For instance, suppose you take as your set <math>\mathcal{A}</math> the set of all sequences such that the number of 0s, plus twice the number of 1s in even places, plus three times the number of 1s in odd places, belongs to some dense set in <math>[3n].</math> Then if you have a 2D subspace with both wildcard sets of size d, one wildcard set consisting of odd numbers and the other of even numbers (which this proof gives), then flipping the even wildcard set increases this count by d and flipping the odd wildcard set increases it by 2d, so in your dense set of integers you can find four integers of the form a, a+d, a+2d, a+3d, which is an arithmetic progression of length 4.<br />
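<br />
The arithmetic behind the 4-AP can be checked directly (an illustrative sketch; the weights 1 per 0, 2 per 1 in an even place, and 3 per 1 in an odd place are an assumed choice that makes the increments come out to d and 2d):<br />
<br />
```python
# Illustrative sketch: the weights here (1 per 0, 2 per 1 in an even place,
# 3 per 1 in an odd place) are an assumed choice that makes the arithmetic
# work: flipping d even places adds d to the count, flipping d odd places
# adds 2d, so the four points of a 2D subspace give a, a+d, a+2d, a+3d.

def weight(x):
    """Weighted count for a 0/1 string x; places are 1-indexed."""
    total = 0
    for i, bit in enumerate(x, start=1):
        if bit == '0':
            total += 1
        elif i % 2 == 0:
            total += 2  # a 1 in an even place
        else:
            total += 3  # a 1 in an odd place
    return total

def flip(x, places):
    """Set the given (1-indexed) places of x to 1."""
    return ''.join('1' if i + 1 in places else c for i, c in enumerate(x))

base = '0' * 8
evens, odds = {2, 4}, {1, 5}  # wildcard sets, both of size d = 2
points = [weight(base), weight(flip(base, evens)),
          weight(flip(base, odds)), weight(flip(flip(base, evens), odds))]
print(points)  # [8, 10, 12, 14]: a, a+d, a+2d, a+3d with a = 8, d = 2
```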
<br />
One can also prove the above strong form of Sperner's theorem by using the multidimensional Szemer&eacute;di theorem, which has combinatorial proofs. (reference!) It states that large dense high-dimensional grids contain [[corners]]. Let <math>\mathcal{A}</math> be a dense subset of <math>[2]^n.</math> We may suppose that the elements of <math>\mathcal{A}</math> have size about <math>\frac{n}{2}\pm C\sqrt{n}.</math> Take a random permutation of [n]. An element of <math>\mathcal{A}</math> is &ldquo;<math>d</math>-nice&rdquo; after the permutation if it consists of <math>d</math> intervals, each of length between <math>\frac{n}{2d}\pm C\sqrt{n}/2,</math> and each interval begins at a position <math>id</math> for some <math>0\leq i< \frac{n}{d}</math> (suppose that <math>d</math> divides <math>n</math>). Any <math>d</math>-nice set can be represented as a point in a <math>d</math>-dimensional cube <math>[C\sqrt{n}]^d.</math> The sets represented by the vertices of an axis-parallel <math>d</math>-dimensional cube in <math>[C\sqrt{n}]^d</math> form a subspace with equal-sized wildcard sets. Finding a cube is clearly more difficult than finding a corner, but its existence in dense sets also follows from the multidimensional Szemer&eacute;di theorem. All we need is to show that the expected number of <math>d</math>-nice elements is <math>c\sqrt{n}^d,</math> where c depends only on the density of <math>\mathcal{A}</math>. For a typical <math>m</math>-element set of <math>\mathcal{A},</math> the probability that it is <math>d</math>-nice after the permutation is about <math>\binom{n}{m}^{-1}\sqrt{n}^{d-1}.</math> Summing over the elements of <math>\mathcal{A}</math> with size between <math>\frac{n}{2}\pm C\sqrt{n}</math> gives that the expected number of <math>d</math>-nice elements is <math>c\sqrt{n}^d,</math> so there is a cube if n is large enough.<br />
<br />
==Further remarks==<br />
<br />
The k=3 generalisation of the LYM inequality is the [[hyper-optimistic conjecture]].<br />
<br />
Sperner's theorem is also related to the [[Kruskal-Katona theorem]].<br />
<br />
* [http://en.wikipedia.org/wiki/Sperner%27s_theorem The Wikipedia entry for Sperner's theorem]<br />
* [http://en.wikipedia.org/wiki/Lubell-Yamamoto-Meshalkin_inequality The Wikipedia entry for the LYM inequality]</div>67.186.58.92