Proof of DHJ(3) via density-increment: Difference between revisions
No edit summary |
Parts 1 and 2 filled in |
||
Line 10: | Line 10: | ||
A closely related distribution is called the [[Equal-slices_distribution_for_DHJ(k)|Equal Slices distribution]]. We will denote this distribution by <math>\overline{\nu}_k^n</math>. It has the same definition as the Pólya Urn definition just given, except that the <math>k-1</math> partition points are "allowed to be equal"; more precisely, we choose the <math>1</math>st partition point uniformly from the <math>n+1</math> possibilities, then we choose the <math>2</math>nd partition point uniformly from the <math>n+2</math> possibilities, etc. It should be clear from these definitions that the [[total variation distance]] between <math>\nu_k^n</math> and <math>\overline{\nu}_k^n</math> is small. Here is a [[to_be_filled|proof that it is <math>O(k^2/n)</math>]]. | A closely related distribution is called the [[Equal-slices_distribution_for_DHJ(k)|Equal Slices distribution]]. We will denote this distribution by <math>\overline{\nu}_k^n</math>. It has the same definition as the Pólya Urn definition just given, except that the <math>k-1</math> partition points are "allowed to be equal"; more precisely, we choose the <math>1</math>st partition point uniformly from the <math>n+1</math> possibilities, then we choose the <math>2</math>nd partition point uniformly from the <math>n+2</math> possibilities, etc. It should be clear from these definitions that the [[total variation distance]] between <math>\nu_k^n</math> and <math>\overline{\nu}_k^n</math> is small. Here is a [[to_be_filled|proof that it is <math>O(k^2/n)</math>]]. | ||
We identify [[Combinatorial_line|combinatorial lines]] in <math>[3]^n</math> with strings in <math>[4]^n</math>, where the <math>4</math> character is treated as identifying the "wildcard" set. Similarly, we identify <math>2</math>-dimensional [[Combinatorial_subspace|combinatorial subspaces]] in <math>[3]^n</math> with strings in <math>[5]^n</math>, where the characters <math>4</math> and <math>5</math> identify the <math>2</math> wildcard sets. Etc. Note that a combinatorial line in <math>[3]^n</math> drawn from <math>\nu_4^n</math> is automatically nondegenerate. | We identify [[Combinatorial_line|combinatorial lines]] in <math>[3]^n</math> with strings in <math>[4]^n</math>, where the <math>4</math> character is treated as identifying the "wildcard" set. Similarly, we identify <math>2</math>-dimensional [[Combinatorial_subspace|combinatorial subspaces]] in <math>[3]^n</math> with strings in <math>[5]^n</math>, where the characters <math>4</math> and <math>5</math> identify the <math>2</math> wildcard sets. Etc. Note that a combinatorial line in <math>[3]^n</math> drawn from <math>\nu_4^n</math> is automatically nondegenerate. Henceforth, whenever we say "combinatorial line" or "combinatorial subspace" without qualification, the object is assumed to be nondegenerate. | ||
Although we will prove DHJ(<math>3</math>) for sets that are dense under <math>\nu_3</math>, this also proves it for sets that are dense under the usual uniform distribution; this should be explicated in the article on [[passing between measures]]. | Although we will prove DHJ(<math>3</math>) for sets that are dense under <math>\nu_3</math>, this also proves it for sets that are dense under the usual uniform distribution; this should be explicated in the article on [[passing between measures]]. | ||
Line 38: | Line 38: | ||
===Sets in <math>[2]^n</math> have lines=== | ===Sets in <math>[2]^n</math> have lines=== | ||
This is simply Sperner's | This is simply Sperner's Theorem. Our version, with the Pólya Urn distribution, is: | ||
'''Theorem:''' | : '''Theorem 1:''' Let <math>A \subseteq [2]^n</math> have <math>\nu_2</math>-density at least <math>\delta</math>. Assume that <math>n \geq n_1(\delta)</math>, where <math>n_1(\delta) = O(1/\delta^2)</math>. Then <math>A</math> contains a combinatorial (<math>2</math>-)line. | ||
: '''[[Proof of Sperner's Theorem]]'''. For now, the proof more or less appears in the article on [[Sperner%27s_theorem#Proof_of_the_theorem|Sperner's Theorem]]; we might want to use the writeup ideas in the current article on [[Equal-slices_distribution_for_DHJ(k)|Equal Slices]]. The easiest proof involves first passing from Pólya to Equal Slices. I think you only need <math>n_1(\delta) > 1/\delta^2 + 1</math> or something. | |||
===<math>ij</math>-insensitive sets in <math>[3]^n</math> have lines=== | ===<math>ij</math>-insensitive sets in <math>[3]^n</math> have lines=== | ||
Let <math>i,j \in [k]</math> be distinct characters. We define <math>A \subseteq [k]^n</math> to be an "<math>ij</math>-insensitive set" if for all strings <math>x</math>, changing <math>i</math>'s to <math>j</math>'s in <math>x</math> or vice versa does not change whether or not <math>x \in A</math>. | |||
An <math>ij</math>-insensitive in <math>[k]^n</math> is elsewhere on the blog called a "<math>1</math>-set", being a special case of a [[Complexity_of_a_set|special set of complexity <math>1</math>]]. See also this article discussing [[Line_free_sets_correlate_locally_with_dense_sets_of_complexity_k-2|the complexity of sets]]. | |||
The statement we prove here is: | |||
: '''Lemma 2:''' Let <math>A \subseteq [3]^n</math> be an <math>ij</math>-insensitive set with <math>\nu_2</math>-density at least <math>\delta</math>. Assume that <math>n \geq n_2(\delta) = \sqrt{2/\delta}n_1(\delta/2)</math>. Then <math>A</math> contains a combinatorial (<math>3</math>-)line. | |||
We sketch the proof here, for now. | |||
: '''Proof:''' (Sketch.) Assume without loss of generality that <math>i = 2</math>, <math>j = 3</math>. We can think of a draw <math>x \sim \nu_3^n</math> by first conditioning on the set of coordinates <math>S</math> where <math>x</math> has its <math>3</math>'s, and then drawing the remainder of the string as <math>y \sim \nu_2^{|\overline{S}|}</math>, where <math>\overline{S} = [n] \setminus S</math>. (This requires a quick justification.) Inventing some notation, we have <math>\mathbf{E}_S[\nu_2(A_S)] = \delta</math>. Further, it should be easy to check that <math>\mathbf{Pr}[|\overline{S}| <\gamma n] \leq \gamma^2</math> (or perhaps only a tiny bit higher). Hence <math>\mathbf{E}_S[\nu_2(A_S) \mid |\overline{S}| \geq \gamma n] \geq \delta - \gamma^2.</math> We set <math>\gamma = \sqrt{\delta/2}</math> and conclude that there exists a particular set of <math>3</math>-coordinates <math>S_0</math> such that <math>\nu_2(A_{S_0}) \geq \delta/2</math> and <math>|\overline{S_0}| \geq \sqrt{\delta/2} n</math>. By choice of <math>n_2</math>, this lets us apply Theorem 1 to conclude that <math>A_{S_0}</math> contains a combinatorial <math>2</math>-line. This pulls back to a <math>2</math>-line <math>(u,v)</math> in <math>A \subseteq [3]^n</math>. Assume <math>T</math> is the wildcard set for this line, and <math>v</math> is the point which has <math>2</math>'s on <math>T</math>. But <math>A</math> is a <math>23</math>-insensitive set, so the string <math>w</math> gotten by changing <math>v</math>'s coordinates from <math>2</math> to <math>3</math> on <math>T</math> must also be in <math>A</math>. Hence <math>(u,v,w)</math> is a combinatorial <math>3</math>-line in <math>A</math>. <math>\Box</math> | |||
===<math>ij</math>-insensitive sets have subspaces=== | ===<math>ij</math>-insensitive sets have subspaces=== |
Revision as of 17:41, 19 March 2009
Introduction
This article is intended to contain a proof outline of DHJ([math]\displaystyle{ 3 }[/math]) via the density-increment method. The proofs of each step in the outline will appear in separate articles. The goal is to rigorously codify the "second outline" proof. A secondary goal is to write the proof in such a way that it becomes "easy" to see the generalization to DHJ([math]\displaystyle{ k }[/math]).
Notation and definitions
We'll call the main probability distribution we work with the Pólya Urn distribution. As a distribution on [math]\displaystyle{ [k]^n }[/math], we'll write it as [math]\displaystyle{ \nu_k^n }[/math], or as [math]\displaystyle{ \nu_k }[/math] or [math]\displaystyle{ \nu }[/math] if it's clear from context. (Note: even if one only wants to prove DHJ([math]\displaystyle{ 3 }[/math]), one needs to define [math]\displaystyle{ \nu_k^n }[/math] for general [math]\displaystyle{ k }[/math].) One way to define it is as follows: Order the coordinates [math]\displaystyle{ [n] }[/math] in a list according to a random permutation. Now choose [math]\displaystyle{ k-1 }[/math] distinct "partition points" uniformly at random from the [math]\displaystyle{ \binom{n+1}{k-1} }[/math] possibilities. This partitions the list of coordinates into [math]\displaystyle{ k }[/math] sublists; take the [math]\displaystyle{ 1 }[/math]st sublist of coordinates to be [math]\displaystyle{ 1 }[/math]'s, the [math]\displaystyle{ 2 }[/math]nd sublist to be [math]\displaystyle{ 2 }[/math]'s, etc. Note that any string drawn from the Pólya Urn distribution is nondegenerate, meaning it contains at least one of each character in [math]\displaystyle{ [3] }[/math].
A closely related distribution is called the Equal Slices distribution. We will denote this distribution by [math]\displaystyle{ \overline{\nu}_k^n }[/math]. It has the same definition as the Pólya Urn definition just given, except that the [math]\displaystyle{ k-1 }[/math] partition points are "allowed to be equal"; more precisely, we choose the [math]\displaystyle{ 1 }[/math]st partition point uniformly from the [math]\displaystyle{ n+1 }[/math] possibilities, then we choose the [math]\displaystyle{ 2 }[/math]nd partition point uniformly from the [math]\displaystyle{ n+2 }[/math] possibilities, etc. It should be clear from these definitions that the total variation distance between [math]\displaystyle{ \nu_k^n }[/math] and [math]\displaystyle{ \overline{\nu}_k^n }[/math] is small. Here is a proof that it is [math]\displaystyle{ O(k^2/n) }[/math].
We identify combinatorial lines in [math]\displaystyle{ [3]^n }[/math] with strings in [math]\displaystyle{ [4]^n }[/math], where the [math]\displaystyle{ 4 }[/math] character is treated as identifying the "wildcard" set. Similarly, we identify [math]\displaystyle{ 2 }[/math]-dimensional combinatorial subspaces in [math]\displaystyle{ [3]^n }[/math] with strings in [math]\displaystyle{ [5]^n }[/math], where the characters [math]\displaystyle{ 4 }[/math] and [math]\displaystyle{ 5 }[/math] identify the [math]\displaystyle{ 2 }[/math] wildcard sets. Etc. Note that a combinatorial line in [math]\displaystyle{ [3]^n }[/math] drawn from [math]\displaystyle{ \nu_4^n }[/math] is automatically nondegenerate. Henceforth, whenever we say "combinatorial line" or "combinatorial subspace" without qualification, the object is assumed to be nondegenerate.
Although we will prove DHJ([math]\displaystyle{ 3 }[/math]) for sets that are dense under [math]\displaystyle{ \nu_3 }[/math], this also proves it for sets that are dense under the usual uniform distribution; this should be explicated in the article on passing between measures.
Proof Outline
Here is the sequence of statements we prove:
1. Dense sets in [math]\displaystyle{ [2]^n }[/math] contain lines (i.e., DHJ([math]\displaystyle{ 2 }[/math])).
2. Dense ij-insensitive sets in [math]\displaystyle{ [3]^n }[/math] contain lines.
3. Dense [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] contain large subspaces.
4. Dense [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] are large-subspace rich.
5. Dense [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] are large-subspace partitionable.
6. Dense intersections of [math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] are large-subspace partitionable.
7. Line-free sets in [math]\displaystyle{ [3]^n }[/math] have density increments on dense intersections of [math]\displaystyle{ ij }[/math]-insensitive sets.
8. Line-free sets in [math]\displaystyle{ [3]^n }[/math] have density increments on large subspaces.
9. Dense sets in [math]\displaystyle{ [3]^n }[/math] contain lines (i.e., DHJ([math]\displaystyle{ 3 }[/math])).
Sets in [math]\displaystyle{ [2]^n }[/math] have lines
This is simply Sperner's Theorem. Our version, with the Pólya Urn distribution, is:
- Theorem 1: Let [math]\displaystyle{ A \subseteq [2]^n }[/math] have [math]\displaystyle{ \nu_2 }[/math]-density at least [math]\displaystyle{ \delta }[/math]. Assume that [math]\displaystyle{ n \geq n_1(\delta) }[/math], where [math]\displaystyle{ n_1(\delta) = O(1/\delta^2) }[/math]. Then [math]\displaystyle{ A }[/math] contains a combinatorial ([math]\displaystyle{ 2 }[/math]-)line.
- Proof of Sperner's Theorem. For now, the proof more or less appears in the article on Sperner's Theorem; we might want to use the writeup ideas in the current article on Equal Slices. The easiest proof involves first passing from Pólya to Equal Slices. I think you only need [math]\displaystyle{ n_1(\delta) \gt 1/\delta^2 + 1 }[/math] or something.
[math]\displaystyle{ ij }[/math]-insensitive sets in [math]\displaystyle{ [3]^n }[/math] have lines
Let [math]\displaystyle{ i,j \in [k] }[/math] be distinct characters. We define [math]\displaystyle{ A \subseteq [k]^n }[/math] to be an "[math]\displaystyle{ ij }[/math]-insensitive set" if for all strings [math]\displaystyle{ x }[/math], changing [math]\displaystyle{ i }[/math]'s to [math]\displaystyle{ j }[/math]'s in [math]\displaystyle{ x }[/math] or vice versa does not change whether or not [math]\displaystyle{ x \in A }[/math].
An [math]\displaystyle{ ij }[/math]-insensitive in [math]\displaystyle{ [k]^n }[/math] is elsewhere on the blog called a "[math]\displaystyle{ 1 }[/math]-set", being a special case of a special set of complexity [math]\displaystyle{ 1 }[/math]. See also this article discussing the complexity of sets.
The statement we prove here is:
- Lemma 2: Let [math]\displaystyle{ A \subseteq [3]^n }[/math] be an [math]\displaystyle{ ij }[/math]-insensitive set with [math]\displaystyle{ \nu_2 }[/math]-density at least [math]\displaystyle{ \delta }[/math]. Assume that [math]\displaystyle{ n \geq n_2(\delta) = \sqrt{2/\delta}n_1(\delta/2) }[/math]. Then [math]\displaystyle{ A }[/math] contains a combinatorial ([math]\displaystyle{ 3 }[/math]-)line.
We sketch the proof here, for now.
- Proof: (Sketch.) Assume without loss of generality that [math]\displaystyle{ i = 2 }[/math], [math]\displaystyle{ j = 3 }[/math]. We can think of a draw [math]\displaystyle{ x \sim \nu_3^n }[/math] by first conditioning on the set of coordinates [math]\displaystyle{ S }[/math] where [math]\displaystyle{ x }[/math] has its [math]\displaystyle{ 3 }[/math]'s, and then drawing the remainder of the string as [math]\displaystyle{ y \sim \nu_2^{|\overline{S}|} }[/math], where [math]\displaystyle{ \overline{S} = [n] \setminus S }[/math]. (This requires a quick justification.) Inventing some notation, we have [math]\displaystyle{ \mathbf{E}_S[\nu_2(A_S)] = \delta }[/math]. Further, it should be easy to check that [math]\displaystyle{ \mathbf{Pr}[|\overline{S}| \lt \gamma n] \leq \gamma^2 }[/math] (or perhaps only a tiny bit higher). Hence [math]\displaystyle{ \mathbf{E}_S[\nu_2(A_S) \mid |\overline{S}| \geq \gamma n] \geq \delta - \gamma^2. }[/math] We set [math]\displaystyle{ \gamma = \sqrt{\delta/2} }[/math] and conclude that there exists a particular set of [math]\displaystyle{ 3 }[/math]-coordinates [math]\displaystyle{ S_0 }[/math] such that [math]\displaystyle{ \nu_2(A_{S_0}) \geq \delta/2 }[/math] and [math]\displaystyle{ |\overline{S_0}| \geq \sqrt{\delta/2} n }[/math]. By choice of [math]\displaystyle{ n_2 }[/math], this lets us apply Theorem 1 to conclude that [math]\displaystyle{ A_{S_0} }[/math] contains a combinatorial [math]\displaystyle{ 2 }[/math]-line. This pulls back to a [math]\displaystyle{ 2 }[/math]-line [math]\displaystyle{ (u,v) }[/math] in [math]\displaystyle{ A \subseteq [3]^n }[/math]. Assume [math]\displaystyle{ T }[/math] is the wildcard set for this line, and [math]\displaystyle{ v }[/math] is the point which has [math]\displaystyle{ 2 }[/math]'s on [math]\displaystyle{ T }[/math]. But [math]\displaystyle{ A }[/math] is a [math]\displaystyle{ 23 }[/math]-insensitive set, so the string [math]\displaystyle{ w }[/math] gotten by changing [math]\displaystyle{ v }[/math]'s coordinates from [math]\displaystyle{ 2 }[/math] to [math]\displaystyle{ 3 }[/math] on [math]\displaystyle{ T }[/math] must also be in [math]\displaystyle{ A }[/math]. Hence [math]\displaystyle{ (u,v,w) }[/math] is a combinatorial [math]\displaystyle{ 3 }[/math]-line in [math]\displaystyle{ A }[/math]. [math]\displaystyle{ \Box }[/math]