Proof of DHJ(3) via density-increment

From Polymath1Wiki
Jump to: navigation, search

Introduction

This article is intended to contain a proof outline of DHJ([math]3[/math]) via the density-increment method. The proofs of each step in the outline will appear in separate articles. The goal is to rigorously codify the "second outline" proof. A secondary goal is to write the proof in such a way that it becomes "easy" to see the generalization to DHJ([math]k[/math]). Each step of the outline has a "Proofread By" section to which people can add their names.


Notation and definitions

We'll call the main probability distribution we work with the Pólya Urn distribution. As a distribution on [math][k]^n[/math], we'll write it as [math]\nu_k^n[/math], or as [math]\nu_k[/math] or [math]\nu[/math] if it's clear from context. (Note: even if one only wants to prove DHJ([math]3[/math]), one needs to define [math]\nu_k^n[/math] for general [math]k[/math].) One way to define it is as follows: Order the coordinates [math][n][/math] in a list according to a random permutation. Now choose [math]k-1[/math] distinct "partition points" uniformly at random from the [math]\binom{n+1}{k-1}[/math] possibilities. This partitions the list of coordinates into [math]k[/math] sublists; take the [math]1[/math]st sublist of coordinates to be [math]1[/math]'s, the [math]2[/math]nd sublist to be [math]2[/math]'s, etc. Note that any string drawn from the Pólya Urn distribution is nondegenerate, meaning it contains at least one of each character in [math][3][/math].

A closely related distribution is called the Equal Slices distribution. We will denote this distribution by [math]\overline{\nu}_k^n[/math]. It has the same definition as the Pólya Urn definition just given, except that the [math]k-1[/math] partition points are "allowed to be equal"; more precisely, we choose the [math]1[/math]st partition point uniformly from the [math]n+1[/math] possibilities, then we choose the [math]2[/math]nd partition point uniformly from the [math]n+2[/math] possibilities, etc. It should be clear from these definitions that the total variation distance between [math]\nu_k^n[/math] and [math]\overline{\nu}_k^n[/math] is small. Here is a proof that it is [math]O(k^2/n)[/math].

We identify combinatorial lines in [math][3]^n[/math] with strings in [math][4]^n[/math], where the [math]4[/math] character is treated as identifying the "wildcard" set. Similarly, we identify [math]2[/math]-dimensional combinatorial subspaces in [math][3]^n[/math] with strings in [math][5]^n[/math], where the characters [math]4[/math] and [math]5[/math] identify the [math]2[/math] wildcard sets. Etc. Note that a combinatorial line in [math][3]^n[/math] drawn from [math]\nu_4^n[/math] is automatically nondegenerate. Henceforth, whenever we say "combinatorial line" or "combinatorial subspace" without qualification, the object is assumed to be nondegenerate.

Although we will prove DHJ([math]3[/math]) for sets that are dense under [math]\nu_3[/math], this also proves it for sets that are dense under the usual uniform distribution; this should be explicated in the article on passing between measures.


Proof Outline

Here is the sequence of statements we prove:

1. Dense sets in [math][2]^n[/math] contain lines (i.e., DHJ([math]2[/math])).

2. Dense sets in [math][2]^n[/math] contain large subspaces.

3. Dense ij-insensitive sets in [math][3]^n[/math] contain large subspaces.

4. Dense [math]ij[/math]-insensitive sets in [math][3]^n[/math] are large-subspace rich.

5. Dense [math]ij[/math]-insensitive sets in [math][3]^n[/math] are large-subspace partitionable.

6. Dense intersections of [math]ij[/math]-insensitive sets in [math][3]^n[/math] are large-subspace partitionable.

7. Line-free sets in [math][3]^n[/math] have density increments on dense intersections of [math]ij[/math]-insensitive sets.

8. Line-free sets in [math][3]^n[/math] have density increments on large subspaces.

9. Dense sets in [math][3]^n[/math] contain lines (i.e., DHJ([math]3[/math])).


1. Sets in [math][2]^n[/math] have lines

Proofread By: no one yet

This is simply Sperner's Theorem. Our version, with the Pólya Urn distribution, is:

Theorem 1: Let [math]A \subseteq [2]^n[/math] have [math]\nu_2[/math]-density at least [math]\delta[/math]. Assume that [math]n \geq n_1(\delta)[/math], where [math]n_1(\delta) = O(1/\delta^2)[/math]. Then [math]A[/math] contains a combinatorial ([math]2[/math]-)line.
Proof of Sperner's Theorem. For now, the proof more or less appears in the article on Sperner's Theorem; we might want to use the writeup ideas in the current article on Equal Slices. The easiest proof involves first passing from Pólya to Equal Slices. I think you only need [math]n_1(\delta) \gt 1/\delta^2 + 1[/math] or something.


2. Sets in [math][2]^n[/math] have subspaces

Proofread By: no one yet

The next step is to extend DHJ([math]2[/math]) from lines to subspaces of arbitrary dimension.

Lemma 2: Let [math]A \subseteq [2]^n[/math] have [math]\nu_2[/math]-density at least [math]\delta[/math], and let [math]d \geq 1[/math]. Assume that [math]n \geq n_2(\delta,d)[/math]. Then [math]A[/math] contains a [math]d[/math]-dimensional combinatorial subspace. Here [math]n_2(\delta,d)[/math] probably only needs to be [math]O(1/\delta)^{2^{d+1}}[/math]

This sort of result has been shown for the uniform distribution case by [[1]], from which one can easily conclude the Pólya distribution case by passing between measures. Alternatively, one can prove the result with a (much) worse bound on [math]n_2[/math] using an argument which generalizes to [math][k]^n[/math]. A few technical bits need to be filled into this argument; namely, showing that if one does Pólya on some small random subset [math]S \subseteq [n][/math], and then combines this with an independent Pólya draw on [math][n] \setminus S[/math], the overall draw is very close in total variation distance to a normal Pólya draw. This is proved rather explicitly in the passing between measures article. (Presumably there is also literature from the Pólya Urn studiers on this, too.)


3. [math]ij[/math]-insensitive sets in [math][3]^n[/math] have subspaces

Proofread By: no one yet

Let [math]i,j \in [k][/math] be distinct characters. We define [math]A \subseteq [k]^n[/math] to be an "[math]ij[/math]-insensitive set" if for all strings [math]x[/math], changing [math]i[/math]'s to [math]j[/math]'s in [math]x[/math] or vice versa does not change whether or not [math]x \in A[/math].

An [math]23[/math]-insensitive in [math][3]^n[/math] is elsewhere on the blog called a "[math]1[/math]-set", since it depends only on the set of coordinates where the value 1 is taken. It is a special case of a special set of complexity [math]1[/math]. See also this article discussing the complexity of sets.

The statement we prove here is:

Lemma 3: Let [math]A \subseteq [3]^n[/math] be an [math]ij[/math]-insensitive set with [math]\nu_3[/math]-density at least [math]\delta[/math], and let [math]d \geq 1[/math]. Assume that [math]n \geq n_3(\delta,d) = \sqrt{2/\delta} \cdot n_2(\delta/2, d)[/math]. Then [math]A[/math] contains a [math]d[/math]-dimensional combinatorial subspace.

We sketch the proof here, for now.

Proof: (Sketch.) Assume without loss of generality that [math]i = 2[/math], [math]j = 3[/math]. We can think of a draw [math]x \sim \nu_3^n[/math] by first conditioning on the set of coordinates [math]S[/math] where [math]x[/math] has its [math]3[/math]'s, and then drawing the remainder of the string as [math]y \sim \nu_2^{|\overline{S}|}[/math], where [math]\overline{S} = [n] \setminus S[/math]. (This requires a quick justification.) Inventing some notation, we have [math]\mathbf{E}_S[\nu_2(A_S)] = \delta[/math]. Further, it should be easy to check that [math]\mathbf{Pr}[|\overline{S}| \lt\gamma n] \leq \gamma^2[/math] (or perhaps only a tiny bit higher). Hence [math]\mathbf{E}_S[\nu_2(A_S) \mid |\overline{S}| \geq \gamma n] \geq \delta - \gamma^2.[/math] We set [math]\gamma = \sqrt{\delta/2}[/math] and conclude that there exists a particular set of [math]3[/math]-coordinates [math]S_0[/math] such that [math]\nu_2(A_{S_0}) \geq \delta/2[/math] and [math]|\overline{S_0}| \geq \sqrt{\delta/2} n[/math]. By choice of [math]n_2[/math], this lets us apply Lemma 2 to conclude that [math]A_{S_0} \subseteq [2]^{S_0}[/math] contains a [math]d[/math]-dimensional combinatorial subspace; call it [math]\sigma[/math].
Pulling [math]\sigma[/math] back into [math][3]^n[/math] by putting [math]3[/math]'s into the [math]S[/math] coordinates, we get a [math]d[/math]-dimensional "[math]2[/math]-subspace" in [math]A \subseteq [3]^n[/math]. But in fact, all of the strings necessary to fill this out into a complete [math]d[/math]-dimensional subspace in [math][3]^n[/math] must also be present in [math]A[/math], because [math]A[/math] is [math]23[/math]-insensitive. [math]\Box[/math] (somewhat poorly explained)


4. [math]ij[/math]-insensitive sets are subspace rich

Proofread By: no one yet

Lemma 4: Let [math]A \subseteq [3]^n[/math] be an [math]ij[/math]-insensitive set with [math]\nu[/math]-density at least [math]\delta[/math], and let [math]d \geq 1[/math]. Assume that [math]n \geq n_4(\delta,d) = n_3(\delta/2,d)[/math]. Then a random [math]d[/math]-dimensional subspace drawn from [math]\nu^n_{3+d}[/math] is in [math]A[/math] with probability at least [math]\eta_4(\delta, d) = \delta/(O(d))^{n_4(\delta,d)}[/math].

This is sketched originally here. Here is a more complete argument:

Proof: (Sketch.) For any [math]3 \leq m \leq n[/math] one can draw a string [math]h \sim \nu_3^n[/math] as follows: First, draw [math]f \sim \nu_m^n[/math]; then independently draw [math]g \sim \nu_3^m[/math]; finally, set [math]h = g \circ f[/math]. This notation means that [math]h_i = g_{f_i}[/math]. The fact that this indeed gives the distribution [math]\nu_3^n[/math] is justified in [document], but should be simplified and ported to the wiki.
By assumption, [math]\mathbf{Pr}[h \in A] \geq \delta[/math]; hence with probability at least [math]\delta/2[/math] over the choice of [math]f[/math] we get that [math]A_f \subseteq [3]^m[/math] has [math]\nu_3[/math]-density at least [math]\delta/2[/math]. Here [math]A_f[/math] of course denotes [math]\{g \in [3]^m : g \circ f \in A[/math]. Crucially, [math]A_f[/math] is an [math]ij[/math]-insensitive set because [math]A[/math]. (This takes a slight bit of reflection but is, I think, easily confirmed.) Call an [math]f[/math] "good" if indeed [math]\nu_3(A_f) \geq \delta/2[/math].
We therefore select [math]m = n_3(\delta/2,d)[/math] (which is allowable as [math]n \geq n_3(\delta/2,d)[/math]) and use Lemma 3 to conclude that for each good [math]f[/math], the set [math]A_f[/math] contains some [math]d[/math]-dimensional subspace. In these cases, a randomly chosen subspace [math]\sigma \sim [3+d]^m[/math] is in [math]A_f[/math] with some positive probability. At this point, we should calculate the least probability of any outcome under [math]\nu_k^n[/math]; I believe it is [math]\frac{\Theta(1/k)^{k/2}}{n^{(k-1)/2} \cdot k^n}[/math]. Lower-bounding this crudely, we have that [math]\sigma \sim [3+d]^m[/math] is in [math]A_f[/math] with probability at least [math](O(d))^{-m}[/math].
So overall we have that for [math]f \sim \nu_3^m[/math] and [math]\sigma \sim [3+d]^m[/math], the probability of [math]\sigma \circ f[/math] being entirely in [math]A[/math] is at least [math]\delta/(O(d))^m = \delta/(O(d))^{n_3(\delta/2,d)}[/math]. But [math]\sigma \circ f[/math] is distributed as [math]\nu_{3+d}^n[/math], completing the proof. [math]\Box[/math]


5. [math]ij[/math]-insensitive sets are subspace partitionable

Proofread By: no one yet

Definition: A set [math]A \subseteq [3]^n[/math] is "[math](\gamma,d)[/math]-subspace partitionable" if it can be written as a disjoint union of [math]d[/math]-dimensional combinatorial subspaces, along with a single "error" set [math]E[/math] satisfying [math]\nu(E)/\nu(A) \leq \gamma[/math].
Lemma 5: Let [math]A \subseteq [3]^n[/math] be an [math]ij[/math]-insensitive set with [math]\nu[/math]-density at least [math]\delta[/math], let [math]d \geq 1[/math], and let [math]\gamma \in (0,1)[/math] be a parameter. Assume that [math]n \geq n_5(\delta,\gamma,d) = XXX[/math]. Then [math]A[/math] is [math](\gamma,d)[/math]-subspace partitionable.

This is currently written fairly thoroughly here, although the parameters need to be worked out to fill in the "XXX" in the statement.

6. Intersections of [math]ij[/math]-insensitive sets are subspace partitionable

Proofread By: no one yet

Currently written here.


7. Line-free sets increment on intersections of [math]ij[/math]-insensitive sets

Proofread By: no one yet

Written mostly completely here, although some of the "passing between measures" tricks that were done for the uniform distribution should be done for the Pólya distribution instead.


8. Line-free sets increment on subspaces

Proofread By: no one yet

Sketch here.


9. Sets in [math][3]^n[/math] have lines

Proofread By: no one yet

The basic idea is described in the article on the density increment method.