Equal-slices measure: Difference between revisions
→The distribution of a random point in a random combinatorial line: changed "measure" to "density" |
|||
Line 39: | Line 39: | ||
==== Analogies with Sperner's theorem ==== | ==== Analogies with Sperner's theorem ==== | ||
One indication that the equal-slices measure is a natural measure to consider is that it plays an important role (implicitly or explicitly) in the standard proofs of Sperner's theorem. Recall that Sperner's theorem is the statement that the largest size of any subset <math>\mathcal{A}</math> of <math>[2]^n</math> that does not contain two sets A and B with <math>A\subset B</math> is <math>\binom n{\lfloor n/2\rfloor}</math>, the size of the middle layer (or one of the two middle layers if n is odd). Now the equal-slices density on <math>[2]^n</math> can be interpreted as follows: choose a random integer m between 0 and n (chosen uniformly) and a random permutation <math>\pi</math> of <math>[n]</math>, and then take the set <math>\{\pi(1),\dots,\pi(m)\}.</math> The probability that this set belongs to <math>\mathcal{A}</math> is the equal-slices density of <math>\mathcal{A}.</math> | One indication that the equal-slices measure is a natural measure to consider is that it plays an important role (implicitly or explicitly) in the standard proofs of [[Sperner's theorem]]. Recall that Sperner's theorem is the statement that the largest size of any subset <math>\mathcal{A}</math> of <math>[2]^n</math> that does not contain two sets A and B with <math>A\subset B</math> is <math>\binom n{\lfloor n/2\rfloor}</math>, the size of the middle layer (or one of the two middle layers if n is odd). Now the equal-slices density on <math>[2]^n</math> can be interpreted as follows: choose a random integer m between 0 and n (chosen uniformly) and a random permutation <math>\pi</math> of <math>[n]</math>, and then take the set <math>\{\pi(1),\dots,\pi(m)\}.</math> The probability that this set belongs to <math>\mathcal{A}</math> is the equal-slices density of <math>\mathcal{A}.</math> | ||
Now we claim that any set <math>\mathcal{A}</math> of equal-slices density greater than <math>1/(n+1)</math> contains a pair (A,B) with <math>A\subset B.</math> This is a considerable strengthening of Sperner's theorem, since the largest set with equal-slices measure 1 trivially has size equal to the size of the largest slice. And the proof follows almost instantly from the definition of equal-slices density, since if the probability that a random initial segment of a random permutation of <math>[n]</math> lies in <math>\mathcal{A}</math> is greater than <math>1/(n+1)</math> then there must be some permutation of <math>[n]</math> with at least two initial segments belonging to <math>\mathcal{A}.</math> | Now we claim that any set <math>\mathcal{A}</math> of equal-slices density greater than <math>1/(n+1)</math> contains a pair (A,B) with <math>A\subset B.</math> This is a considerable strengthening of Sperner's theorem, since the largest set with equal-slices measure 1 trivially has size equal to the size of the largest slice. And the proof follows almost instantly from the definition of equal-slices density, since if the probability that a random initial segment of a random permutation of <math>[n]</math> lies in <math>\mathcal{A}</math> is greater than <math>1/(n+1)</math> then there must be some permutation of <math>[n]</math> with at least two initial segments belonging to <math>\mathcal{A}.</math> |
Revision as of 13:50, 15 February 2009
The equal-slices measure [math]\displaystyle{ \mu(A) }[/math] of a set [math]\displaystyle{ A \subset [3]^n }[/math] is defined by the formula
- [math]\displaystyle{ \mu(A) := \sum_{(a,b,c) \in \Delta_n} \frac{|A \cap \Gamma_{a,b,c}|}{|\Gamma_{a,b,c}|} }[/math]
where [math]\displaystyle{ \Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c=n\} }[/math] is the triangular grid and [math]\displaystyle{ \Gamma_{a,b,c} \subset [3]^n }[/math] is the set of slices with exactly a 1s, b 2s, and c 3s. Thus every slice [math]\displaystyle{ \Gamma_{a,b,c} }[/math] has measure 1 (hence the name equal-slices), and the entire cube has measure [math]\displaystyle{ \frac{(n+1)(n+2)}{2} }[/math]. Dividing the equal slices measure by [math]\displaystyle{ \frac{(n+1)(n+2)}{2} }[/math] gives the equal-slices density.
Example: in [math]\displaystyle{ {}[3]^2 }[/math], the diagonal points 11, 22, 33 each have equal-slices measure 1 (and equal-slices density 1/6), whereas the other six off-diagonal points have equal-slices measure 1/2 (and equal-slices density 1/12). The total equal-slices measure of [math]\displaystyle{ {}[3]^2 }[/math] is 6 (and the equal-slices density is of course 1).
There is a probabilistic interpretation of the equal-slices density of a set [math]\displaystyle{ A }[/math]. Randomly shuffle the [math]\displaystyle{ n }[/math] indices, and then randomly pick [math]\displaystyle{ (a,b,c) \in \Delta_n }[/math]. The probability that the string [math]\displaystyle{ 0^a 1^b 2^c }[/math] lies in the shuffled version of [math]\displaystyle{ A }[/math] is the equal-slices density of [math]\displaystyle{ A }[/math].
The LYM inequality asserts that any line-free subset of [math]\displaystyle{ [2]^n }[/math] has equal-slices measure at most 1. The analogue of this for k=3 is the hyper-optimistic conjecture.
The DHJ(3) conjecture,
DHJ(3), Version 1. For every [math]\displaystyle{ \delta \gt 0 }[/math] there exists n such that every subset [math]\displaystyle{ A \subset [3]^n }[/math] of density at least [math]\displaystyle{ \delta }[/math] contains a combinatorial line.
is equivalent to an equal slices version:
DHJ(3), Version 3. For every [math]\displaystyle{ \delta \gt 0 }[/math] there exists n such that every subset [math]\displaystyle{ A \subset [3]^n }[/math] of equal-slices density at least [math]\displaystyle{ \delta }[/math] contains a combinatorial line.
Version 3 implies Version 1
Suppose that A has density [math]\displaystyle{ \geq \delta }[/math] in the usual sense. Let m be such that every subset of [math]\displaystyle{ [3]^m }[/math] of equal-slices density [math]\displaystyle{ \geq \delta/2 }[/math] contains a combinatorial line. Now randomly embed [math]\displaystyle{ [3]^m }[/math] into [math]\displaystyle{ [3]^n }[/math] by choosing m variable coordinates and fixing the rest. We may suppose that every point in A has [math]\displaystyle{ n/3+O(\sqrt{n}) }[/math] of each coordinate value and that [math]\displaystyle{ m \ll \sqrt{n} }[/math]. Therefore, changing coordinates hardly changes the density of a slice. It follows that each point of is in approximately the same number of these random subspaces. Therefore, by averaging, there is a random subspace inside which has equal-slices-density at least [math]\displaystyle{ \geq \delta/2 }[/math], and we are done. (We could think of it Terry’s way: as we move the random subspace around, what we effectively have is a bunch of random variables, each with mean approximately [math]\displaystyle{ \delta }[/math], so by linearity of expectation we’ll get equal-slices-density at least [math]\displaystyle{ \delta/2 }[/math] at some point, whatever the measure is.)
Version 1 implies Version 3
Suppose that A has density [math]\displaystyle{ \geq \delta }[/math] in the equal-slices sense. By the first moment method, this means that A has density [math]\displaystyle{ \gg \delta }[/math] on [math]\displaystyle{ \gg \delta }[/math] on the slices.
Let m be a medium integer (much bigger than [math]\displaystyle{ 1/\delta }[/math], much less than n).
Pick (a, b, c) at random that add up to n-m. By the first moment method, we see that with probability [math]\displaystyle{ \gg \delta }[/math], A will have density on [math]\displaystyle{ \gg \delta }[/math] the of the slices [math]\displaystyle{ \Gamma_{a',b',c'} }[/math] with [math]\displaystyle{ a' = a + m/3 + O(\sqrt{m}) }[/math], [math]\displaystyle{ c' = c + m/3 + O(\sqrt{m}) }[/math], [math]\displaystyle{ c' = c + m/3 + O(\sqrt{m}) }[/math].
This implies that A has expected density on a random m-dimensional subspace generated by a 1s, b 2s, c 3s, and m independent wildcards.
Applying Version 1 to that random subspace we obtain the claim.
Motivation for equal-slices measure
Analogies with Sperner's theorem
One indication that the equal-slices measure is a natural measure to consider is that it plays an important role (implicitly or explicitly) in the standard proofs of Sperner's theorem. Recall that Sperner's theorem is the statement that the largest size of any subset [math]\displaystyle{ \mathcal{A} }[/math] of [math]\displaystyle{ [2]^n }[/math] that does not contain two sets A and B with [math]\displaystyle{ A\subset B }[/math] is [math]\displaystyle{ \binom n{\lfloor n/2\rfloor} }[/math], the size of the middle layer (or one of the two middle layers if n is odd). Now the equal-slices density on [math]\displaystyle{ [2]^n }[/math] can be interpreted as follows: choose a random integer m between 0 and n (chosen uniformly) and a random permutation [math]\displaystyle{ \pi }[/math] of [math]\displaystyle{ [n] }[/math], and then take the set [math]\displaystyle{ \{\pi(1),\dots,\pi(m)\}. }[/math] The probability that this set belongs to [math]\displaystyle{ \mathcal{A} }[/math] is the equal-slices density of [math]\displaystyle{ \mathcal{A}. }[/math]
Now we claim that any set [math]\displaystyle{ \mathcal{A} }[/math] of equal-slices density greater than [math]\displaystyle{ 1/(n+1) }[/math] contains a pair (A,B) with [math]\displaystyle{ A\subset B. }[/math] This is a considerable strengthening of Sperner's theorem, since the largest set with equal-slices measure 1 trivially has size equal to the size of the largest slice. And the proof follows almost instantly from the definition of equal-slices density, since if the probability that a random initial segment of a random permutation of [math]\displaystyle{ [n] }[/math] lies in [math]\displaystyle{ \mathcal{A} }[/math] is greater than [math]\displaystyle{ 1/(n+1) }[/math] then there must be some permutation of [math]\displaystyle{ [n] }[/math] with at least two initial segments belonging to [math]\displaystyle{ \mathcal{A}. }[/math]
Thus, Sperner's theorem follows from a combination of the definition of equal-slices measure/density, a simple averaging argument, and the trivial observation that if you have a subset of the set [math]\displaystyle{ \{0,1,\dots,n\} }[/math] of density greater than [math]\displaystyle{ 1/(n+1) }[/math] then you can find a configuration inside that subset of the form [math]\displaystyle{ x,x+d }[/math] with [math]\displaystyle{ d\ne 0. }[/math] This last statement can be thought of as "the one-dimensional corners theorem".
Therefore, it is not completely unreasonable to hope that if we choose a subset of [math]\displaystyle{ [3]^n }[/math] that has positive equal-slices density, then it contains a combinatorial line. This would be the two-dimensional analogue of the strong form of Sperner's theorem given by the above proof.
The distribution of a random point in a random combinatorial line
Let us choose a random combinatorial line in [math]\displaystyle{ [3]^n }[/math] uniformly from all combinatorial lines. Since each element of [math]\displaystyle{ [n] }[/math] has an equal chance of being fixed at 1, fixed at 2, fixed at 3, or a wildcard, there are [math]\displaystyle{ 4^n }[/math] combinatorial lines. And since the number of each type of element is binomially distributed, and the binomial distribution is strongly concentrated about its mean, if you choose a random combinatorial line, then with high probability there will be roughly [math]\displaystyle{ n/4 }[/math] of each type of coordinate. Therefore, with high probability, a random point on a random combinatorial line takes one of the three values roughly [math]\displaystyle{ n/2 }[/math] times and the other two roughly [math]\displaystyle{ n/4 }[/math] times each.
Now let us choose a random point in [math]\displaystyle{ [3]^n. }[/math] Then a similar argument shows that with high probability it will take each value roughly [math]\displaystyle{ n/3 }[/math] times. Therefore, if we let [math]\displaystyle{ \mathcal{A} }[/math] be the set of all sequences with roughly equal numbers of 1s, 2s and 3s, we find that it has density very close to 1, but it contains only a tiny proportion of the combinatorial lines.
This indicates that the uniform measure on the whole of [math]\displaystyle{ [3]^n }[/math] is not completely appropriate for analytic arguments. Does equal-slices density do any better?
One cannot answer this question without first deciding what one means by a random combinatorial line. The most natural definition seems to be this. First choose a random quadruple of non-negative integers [math]\displaystyle{ (a,b,c,r) }[/math] such that [math]\displaystyle{ a+b+c+r=n, }[/math] then choose four disjoint sets [math]\displaystyle{ A,B,C,D }[/math] of cardinality [math]\displaystyle{ a,b,c,d, }[/math] and finally form the combinatorial line [math]\displaystyle{ (A\cup D,B,C),(A,B\cup D,C),(A,B,C\cup D) }[/math].
If we do this, then the marginal distribution of the point [math]\displaystyle{ (A\cup D,B,C) }[/math] is clearly not given by the equal-slices density (since the expected size of [math]\displaystyle{ A\cup D }[/math] is more like [math]\displaystyle{ n/2 }[/math] than [math]\displaystyle{ n/3 }[/math]). However, the discrepancy between the marginal distribution and the equal-slices density is far less than it is when we do the comparable calculation with the uniform measure. It turns into a mild discrepancy of a kind that occurs with the corners problem as well: it is irritating but not a disaster.
As we have seen, the fact that points in typical combinatorial lines are not typical points means that we can find dense subsets of [math]\displaystyle{ [3]^n }[/math] that do not contain a positive proportion of all possible combinatorial lines. This is worrying if one wants to find an analytic approach to the theorem, because analytic approaches tend to make use not of the hypothesis that a set contains no configurations of the desired kind, but merely that the number of configurations it contains differs by a constant proportion from the expected number of those configurations in a random set.
The equal-slices measure does not obviously have this defect, though nobody has yet checked whether a stronger statement of the above kind really does hold.