Proof of DHJ(3) via density-increment
Introduction
This article is intended to contain a proof outline of DHJ([math]\displaystyle{ 3 }[/math]) via the density-increment method. The proofs of each step in the outline will appear in separate articles. The goal is to rigorously codify the "second outline" proof. A secondary goal is to write the proof in such a way that it becomes "easy" to see the generalization to DHJ([math]\displaystyle{ k }[/math]). Each step of the outline has a "Proofread By" section to which people can add their names.
Notation and definitions
We'll call the main probability distribution we work with the Pólya Urn distribution. As a distribution on [math]\displaystyle{ [k]^n }[/math], we'll write it as [math]\displaystyle{ \nu_k^n }[/math], or as [math]\displaystyle{ \nu_k }[/math] or [math]\displaystyle{ \nu }[/math] if it's clear from context. (Note: even if one only wants to prove DHJ([math]\displaystyle{ 3 }[/math]), one needs to define [math]\displaystyle{ \nu_k^n }[/math] for general [math]\displaystyle{ k }[/math].) One way to define it is as follows: Order the coordinates [math]\displaystyle{ [n] }[/math] in a list according to a random permutation. Now choose [math]\displaystyle{ k-1 }[/math] distinct "partition points" uniformly at random from the [math]\displaystyle{ \binom{n+1}{k-1} }[/math] possibilities. This partitions the list of coordinates into [math]\displaystyle{ k }[/math] sublists; take the [math]\displaystyle{ 1 }[/math]st sublist of coordinates to be [math]\displaystyle{ 1 }[/math]'s, the [math]\displaystyle{ 2 }[/math]nd sublist to be [math]\displaystyle{ 2 }[/math]'s, etc. Note that any string drawn from the Pólya Urn distribution is nondegenerate, meaning it contains at least one of each character in [math]\displaystyle{ [3] }[/math].
A closely related distribution is called the Equal Slices distribution. We will denote this distribution by [math]\displaystyle{ \overline{\nu}_k^n }[/math]. It has the same definition as the Pólya Urn definition just given, except that the [math]\displaystyle{ k-1 }[/math] partition points are "allowed to be equal"; more precisely, we choose the [math]\displaystyle{ 1 }[/math]st partition point uniformly from the [math]\displaystyle{ n+1 }[/math] possibilities, then we choose the [math]\displaystyle{ 2 }[/math]nd partition point uniformly from the [math]\displaystyle{ n+2 }[/math] possibilities, etc. It should be clear from these definitions that the total variation distance between [math]\displaystyle{ \nu_k^n }[/math] and [math]\displaystyle{ \overline{\nu}_k^n }[/math] is small. Here is a proof that it is [math]\displaystyle{ O(k^2/n) }[/math].
We identify combinatorial lines in [math]\displaystyle{ [3]^n }[/math] with strings in [math]\displaystyle{ [4]^n }[/math], where the [math]\displaystyle{ 4 }[/math] character is treated as identifying the "wildcard" set. Similarly, we identify [math]\displaystyle{ 2 }[/math]-dimensional combinatorial subspaces in [math]\displaystyle{ [3]^n }[/math] with strings in [math]\displaystyle{ [5]^n }[/math], where the characters [math]\displaystyle{ 4 }[/math] and [math]\displaystyle{ 5 }[/math] identify the [math]\displaystyle{ 2 }[/math] wildcard sets. Etc. Note that a combinatorial line in [math]\displaystyle{ [3]^n }[/math] drawn from [math]\displaystyle{ \nu_4^n }[/math] is automatically nondegenerate. Henceforth, whenever we say "combinatorial line" or "combinatorial subspace" without qualification, the object is assumed to be nondegenerate.
Although we will prove DHJ([math]\displaystyle{ 3 }[/math]) for sets that are dense under [math]\displaystyle{ \nu_3 }[/math], this also proves it for sets that are dense under the usual uniform distribution; this should be explicated in the article on passing between measures.
Proof Outline
Here is the sequence of statements we prove:
1. Dense sets in [math]\displaystyle{ [2]^n }[/math] contain lines (i.e., DHJ([math]\displaystyle{ 2 }[/math])).
2. Dense sets in [math]\displaystyle{ [2]^n }[/math] contain large subspaces.
3. Dense ij-insensitive sets in [math]\displaystyle{ [3]^n }[/math] contain large subspaces.
4. Dense [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] are large-subspace rich.
5. Dense [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] are large-subspace partitionable.
6. Dense intersections of [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] are large-subspace partitionable.
7. Line-free sets in [math]\displaystyle{ [3]^n }[/math] have density increments on dense intersections of [math]\displaystyle{ ij }[/math]-insensitive sets.
8. Line-free sets in [math]\displaystyle{ [3]^n }[/math] have density increments on large subspaces.
9. Dense sets in [math]\displaystyle{ [3]^n }[/math] contain lines (i.e., DHJ([math]\displaystyle{ 3 }[/math])).
1. Sets in [math]\displaystyle{ [2]^n }[/math] have lines
Proofread By: no one yet
This is simply Sperner's Theorem. Our version, with the Pólya Urn distribution, is:
- Theorem 1: Let [math]\displaystyle{ A \subseteq [2]^n }[/math] have [math]\displaystyle{ \nu_2 }[/math]-density at least [math]\displaystyle{ \delta }[/math]. Assume that [math]\displaystyle{ n \geq n_1(\delta) }[/math], where [math]\displaystyle{ n_1(\delta) = O(1/\delta^2) }[/math]. Then [math]\displaystyle{ A }[/math] contains a combinatorial ([math]\displaystyle{ 2 }[/math]-)line.
- Proof of Sperner's Theorem. For now, the proof more or less appears in the article on Sperner's Theorem; we might want to use the writeup ideas in the current article on Equal Slices. The easiest proof involves first passing from Pólya to Equal Slices. I think you only need [math]\displaystyle{ n_1(\delta) \gt 1/\delta^2 + 1 }[/math] or something.
2. Sets in [math]\displaystyle{ [2]^n }[/math] have subspaces
Proofread By: no one yet
The next step is to extend DHJ([math]\displaystyle{ 2 }[/math]) from lines to subspaces of arbitrary dimension.
- Lemma 2: Let [math]\displaystyle{ A \subseteq [2]^n }[/math] have [math]\displaystyle{ \nu_2 }[/math]-density at least [math]\displaystyle{ \delta }[/math], and let [math]\displaystyle{ d \geq 1 }[/math]. Assume that [math]\displaystyle{ n \geq n_2(\delta,d) }[/math]. Then [math]\displaystyle{ A }[/math] contains a [math]\displaystyle{ d }[/math]-dimensional combinatorial subspace. Here [math]\displaystyle{ n_2(\delta,d) }[/math] probably only needs to be [math]\displaystyle{ O(1/\delta)^{2^{d+1}} }[/math]
This sort of result has been shown for the uniform distribution case by [[1]], from which one can easily conclude the Pólya distribution case by passing between measures. Alternatively, one can prove the result with a (much) worse bound on [math]\displaystyle{ n_2 }[/math] using an argument which generalizes to [math]\displaystyle{ [k]^n }[/math]. A few technical bits need to be filled into this argument; namely, showing that if one does Pólya on some small random subset [math]\displaystyle{ S \subseteq [n] }[/math], and then combines this with an independent Pólya draw on [math]\displaystyle{ [n] \setminus S }[/math], the overall draw is very close in total variation distance to a normal Pólya draw. This is proved rather explicitly in the passing between measures article. (Presumably there is also literature from the Pólya Urn studiers on this, too.)
3. [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] have subspaces
Proofread By: no one yet
Let [math]\displaystyle{ i,j \in [k] }[/math] be distinct characters. We define [math]\displaystyle{ A \subseteq [k]^n }[/math] to be an "[math]\displaystyle{ ij }[/math]-insensitive set" if for all strings [math]\displaystyle{ x }[/math], changing [math]\displaystyle{ i }[/math]'s to [math]\displaystyle{ j }[/math]'s in [math]\displaystyle{ x }[/math] or vice versa does not change whether or not [math]\displaystyle{ x \in A }[/math].
An [math]\displaystyle{ 23 }[/math]-insensitive in [math]\displaystyle{ [3]^n }[/math] is elsewhere on the blog called a "[math]\displaystyle{ 1 }[/math]-set", since it depends only on the set of coordinates where the value 1 is taken. It is a special case of a special set of complexity [math]\displaystyle{ 1 }[/math]. See also this article discussing the complexity of sets.
The statement we prove here is:
- Lemma 3: Let [math]\displaystyle{ A \subseteq [3]^n }[/math] be an [math]\displaystyle{ ij }[/math]-insensitive set with [math]\displaystyle{ \nu_3 }[/math]-density at least [math]\displaystyle{ \delta }[/math], and let [math]\displaystyle{ d \geq 1 }[/math]. Assume that [math]\displaystyle{ n \geq n_3(\delta,d) = \sqrt{2/\delta} \cdot n_2(\delta/2, d) }[/math]. Then [math]\displaystyle{ A }[/math] contains a [math]\displaystyle{ d }[/math]-dimensional combinatorial subspace.
We sketch the proof here, for now.
- Proof: (Sketch.) Assume without loss of generality that [math]\displaystyle{ i = 2 }[/math], [math]\displaystyle{ j = 3 }[/math]. We can think of a draw [math]\displaystyle{ x \sim \nu_3^n }[/math] by first conditioning on the set of coordinates [math]\displaystyle{ S }[/math] where [math]\displaystyle{ x }[/math] has its [math]\displaystyle{ 3 }[/math]'s, and then drawing the remainder of the string as [math]\displaystyle{ y \sim \nu_2^{|\overline{S}|} }[/math], where [math]\displaystyle{ \overline{S} = [n] \setminus S }[/math]. (This requires a quick justification.) Inventing some notation, we have [math]\displaystyle{ \mathbf{E}_S[\nu_2(A_S)] = \delta }[/math]. Further, it should be easy to check that [math]\displaystyle{ \mathbf{Pr}[|\overline{S}| \lt \gamma n] \leq \gamma^2 }[/math] (or perhaps only a tiny bit higher). Hence [math]\displaystyle{ \mathbf{E}_S[\nu_2(A_S) \mid |\overline{S}| \geq \gamma n] \geq \delta - \gamma^2. }[/math] We set [math]\displaystyle{ \gamma = \sqrt{\delta/2} }[/math] and conclude that there exists a particular set of [math]\displaystyle{ 3 }[/math]-coordinates [math]\displaystyle{ S_0 }[/math] such that [math]\displaystyle{ \nu_2(A_{S_0}) \geq \delta/2 }[/math] and [math]\displaystyle{ |\overline{S_0}| \geq \sqrt{\delta/2} n }[/math]. By choice of [math]\displaystyle{ n_2 }[/math], this lets us apply Lemma 2 to conclude that [math]\displaystyle{ A_{S_0} \subseteq [2]^{S_0} }[/math] contains a [math]\displaystyle{ d }[/math]-dimensional combinatorial subspace; call it [math]\displaystyle{ \sigma }[/math].
- Pulling [math]\displaystyle{ \sigma }[/math] back into [math]\displaystyle{ [3]^n }[/math] by putting [math]\displaystyle{ 3 }[/math]'s into the [math]\displaystyle{ S }[/math] coordinates, we get a [math]\displaystyle{ d }[/math]-dimensional "[math]\displaystyle{ 2 }[/math]-subspace" in [math]\displaystyle{ A \subseteq [3]^n }[/math]. But in fact, all of the strings necessary to fill this out into a complete [math]\displaystyle{ d }[/math]-dimensional subspace in [math]\displaystyle{ [3]^n }[/math] must also be present in [math]\displaystyle{ A }[/math], because [math]\displaystyle{ A }[/math] is [math]\displaystyle{ 23 }[/math]-insensitive. [math]\displaystyle{ \Box }[/math] (somewhat poorly explained)
4. [math]\displaystyle{ ij }[/math]-insensitive sets are subspace rich
Proofread By: no one yet
- Lemma 4: Let [math]\displaystyle{ A \subseteq [3]^n }[/math] be an [math]\displaystyle{ ij }[/math]-insensitive set with [math]\displaystyle{ \nu }[/math]-density at least [math]\displaystyle{ \delta }[/math], and let [math]\displaystyle{ d \geq 1 }[/math]. Assume that [math]\displaystyle{ n \geq n_4(\delta,d) = n_3(\delta/2,d) }[/math]. Then a random [math]\displaystyle{ d }[/math]-dimensional subspace drawn from [math]\displaystyle{ \nu^n_{3+d} }[/math] is in [math]\displaystyle{ A }[/math] with probability at least [math]\displaystyle{ \eta_4(\delta, d) = \delta/(O(d))^{n_4(\delta,d)} }[/math].
This is sketched originally here. Here is a more complete argument:
- Proof: (Sketch.) For any [math]\displaystyle{ 3 \leq m \leq n }[/math] one can draw a string [math]\displaystyle{ h \sim \nu_3^n }[/math] as follows: First, draw [math]\displaystyle{ f \sim \nu_m^n }[/math]; then independently draw [math]\displaystyle{ g \sim \nu_3^m }[/math]; finally, set [math]\displaystyle{ h = g \circ f }[/math]. This notation means that [math]\displaystyle{ h_i = g_{f_i} }[/math]. The fact that this indeed gives the distribution [math]\displaystyle{ \nu_3^n }[/math] is justified in [document], but should be simplified and ported to the wiki.
- By assumption, [math]\displaystyle{ \mathbf{Pr}[h \in A] \geq \delta }[/math]; hence with probability at least [math]\displaystyle{ \delta/2 }[/math] over the choice of [math]\displaystyle{ f }[/math] we get that [math]\displaystyle{ A_f \subseteq [3]^m }[/math] has [math]\displaystyle{ \nu_3 }[/math]-density at least [math]\displaystyle{ \delta/2 }[/math]. Here [math]\displaystyle{ A_f }[/math] of course denotes [math]\displaystyle{ \{g \in [3]^m : g \circ f \in A }[/math]. Crucially, [math]\displaystyle{ A_f }[/math] is an [math]\displaystyle{ ij }[/math]-insensitive set because [math]\displaystyle{ A }[/math]. (This takes a slight bit of reflection but is, I think, easily confirmed.) Call an [math]\displaystyle{ f }[/math] "good" if indeed [math]\displaystyle{ \nu_3(A_f) \geq \delta/2 }[/math].
- We therefore select [math]\displaystyle{ m = n_3(\delta/2,d) }[/math] (which is allowable as [math]\displaystyle{ n \geq n_3(\delta/2,d) }[/math]) and use Lemma 3 to conclude that for each good [math]\displaystyle{ f }[/math], the set [math]\displaystyle{ A_f }[/math] contains some [math]\displaystyle{ d }[/math]-dimensional subspace. In these cases, a randomly chosen subspace [math]\displaystyle{ \sigma \sim [3+d]^m }[/math] is in [math]\displaystyle{ A_f }[/math] with some positive probability. At this point, we should calculate the least probability of any outcome under [math]\displaystyle{ \nu_k^n }[/math]; I believe it is [math]\displaystyle{ \frac{\Theta(1/k)^{k/2}}{n^{(k-1)/2} \cdot k^n} }[/math]. Lower-bounding this crudely, we have that [math]\displaystyle{ \sigma \sim [3+d]^m }[/math] is in [math]\displaystyle{ A_f }[/math] with probability at least [math]\displaystyle{ (O(d))^{-m} }[/math].
- So overall we have that for [math]\displaystyle{ f \sim \nu_3^m }[/math] and [math]\displaystyle{ \sigma \sim [3+d]^m }[/math], the probability of [math]\displaystyle{ \sigma \circ f }[/math] being entirely in [math]\displaystyle{ A }[/math] is at least [math]\displaystyle{ \delta/(O(d))^m = \delta/(O(d))^{n_3(\delta/2,d)} }[/math]. But [math]\displaystyle{ \sigma \circ f }[/math] is distributed as [math]\displaystyle{ \nu_{3+d}^n }[/math], completing the proof. [math]\displaystyle{ \Box }[/math]
5. [math]\displaystyle{ ij }[/math]-insensitive sets are subspace partitionable
Proofread By: no one yet
- Definition: A set [math]\displaystyle{ A \subseteq [3]^n }[/math] is "[math]\displaystyle{ (\gamma,d) }[/math]-subspace partitionable" if it can be written as a disjoint union of [math]\displaystyle{ d }[/math]-dimensional combinatorial subspaces, along with a single "error" set [math]\displaystyle{ E }[/math] satisfying [math]\displaystyle{ \nu(E)/\nu(A) \leq \gamma }[/math].
- Lemma 5: Let A \subseteq [3]^n be an [math]\displaystyle{ ij }[/math]-insensitive set with [math]\displaystyle{ \nu }[/math]-density at least [math]\displaystyle{ \delta }[/math], let [math]\displaystyle{ d \geq 1 }[/math], and let [math]\displaystyle{ \gamma \in (0,1) }[/math] be a parameter. Assume that [math]\displaystyle{ n \geq n_5(\delta,\gamma,d) = XXX }[/math]. Then [math]\displaystyle{ A }[/math] is [math]\displaystyle{ (\gamma,d) }[/math]-subspace partitionable.
This is currently written fairly thoroughly here, although the parameters need to be worked out to fill in the "XXX" in the statement.
6. Intersections of [math]\displaystyle{ ij }[/math]-insensitive sets are subspace partitionable
Proofread By: no one yet
Currently written here.
7. Line-free sets increment on intersections of [math]\displaystyle{ ij }[/math]-insensitive sets
Proofread By: no one yet
Written mostly completely here.
8. Line-free sets increment on subspaces
Proofread By: no one yet
Sketch here.
9. Sets in [math]\displaystyle{ [3]^n }[/math] have lines
Proofread By: no one yet
The basic idea is described in the article on the density increment method.