Line-free sets correlate locally with complexity-1 sets

The aim of this page is to present a proof that if [math]\displaystyle{ \mathcal{A} }[/math] is a dense subset of [math]\displaystyle{ [3]^n }[/math] that contains no combinatorial line, then there is a combinatorial subspace X of [math]\displaystyle{ [3]^n }[/math] with dimension tending to infinity and a dense subset [math]\displaystyle{ \mathcal{B} }[/math] of X of complexity 1 on which [math]\displaystyle{ \mathcal{A} }[/math] has a density increment. It is written in a slightly unconventional way: first a short sketch, then a longer one that fleshes out a few details, and then a longer one still. That way, even while it is incomplete it should be understandable to some extent, and if I get stuck then it will be clearer where the problem lies.

Short sketch of argument

Preliminaries

Throughout this sketch, [math]\displaystyle{ \mathcal{A} }[/math] refers to a subset of [math]\displaystyle{ [3]^n }[/math] of density [math]\displaystyle{ \delta }[/math] in the uniform distribution on [math]\displaystyle{ [3]^n. }[/math] We shall sometimes use letters such as x, y and z for elements of [math]\displaystyle{ [3]^n }[/math] and we shall sometimes write them as triples (U,V,W) of sets that partition [n]. A triple of sets corresponds to the 1-set, the 2-set and the 3-set of a sequence. We shall pass freely between the two ways of thinking about [math]\displaystyle{ [3]^n, }[/math] at each stage using whichever is more convenient.

If (U,V,W) is an element of [math]\displaystyle{ [3]^n }[/math] and (U',V',W') is an arbitrary triple of disjoint sets (not necessarily partitioning [n]), we shall write (U,V,W)++(U',V',W') for the sequence obtained from (U,V,W) by changing everything in U' to 1, everything in V' to 2, and everything in W' to 3. For example, writing § for an unspecified coordinate, we have 331322311++§§§1§22§3=331122213. (We think of (U',V',W') as "overwriting" (U,V,W).) If Z is a subset of [n], we shall also write [math]\displaystyle{ (U,V,W)++[3]^Z }[/math] for the combinatorial subspace consisting of all [math]\displaystyle{ (U,V,W)++(U',V',W') }[/math] with [math]\displaystyle{ (U',V',W')\in[3]^Z, }[/math] and [math]\displaystyle{ (U,V,W)++[2]^Z }[/math] for the subset of this combinatorial subspace consisting of all points with [math]\displaystyle{ W'=\emptyset. }[/math]
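To make the overwriting notation concrete, here is a minimal Python sketch. The function names `overwrite` and `subspace` are my own, positions are 0-based, and I use "-" rather than § for an unspecified coordinate so that the code stays in plain ASCII; none of this is part of the argument itself.

```python
from itertools import product


def overwrite(base, overlay, wildcard="-"):
    """Return base ++ overlay: keep base where overlay is unspecified, else take overlay."""
    if len(base) != len(overlay):
        raise ValueError("sequences must have the same length")
    return "".join(b if o == wildcard else o for b, o in zip(base, overlay))


def subspace(base, Z, alphabet="123"):
    """List all points base ++ y, where y ranges over alphabet^Z (Z is a set of positions)."""
    positions = sorted(Z)
    points = []
    for values in product(alphabet, repeat=len(positions)):
        x = list(base)
        for pos, value in zip(positions, values):
            x[pos] = value
        points.append("".join(x))
    return points


# The worked example from the text, with "-" playing the role of the unspecified symbol.
example = overwrite("331322311", "---1-22-3")

# (U,V,W)++[3]^Z and its subset (U,V,W)++[2]^Z for Z = {0, 4}.
full = subspace("331322311", {0, 4})
binary_part = subspace("331322311", {0, 4}, alphabet="12")
```

With Z of size 2 the subspace has 9 points and the [math]\displaystyle{ [2]^Z }[/math] part has 4 of them.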

An unexpected aspect of the proof is that we shall use both equal-slices measure and uniform measure. This decision was not arbitrary: it turns out that either measure on its own has inconvenient features that make the proof difficult, but that these difficulties can be dealt with by passing from one to the other. (Roughly speaking, uniform measure is better for averaging arguments over subspaces, but equal-slices measure is better when we want Varnavides-type statements.) For this we need a tighter version of the statement that the versions of DHJ(3) for the two measures are equivalent: we need that any set of density [math]\displaystyle{ \delta }[/math] in one of the measures can be restricted to a combinatorial subspace where its density is at least [math]\displaystyle{ \delta-\eta }[/math] in the other. I'm fairly sure that the argument for the equivalence of the two versions (given here) can be strengthened to give this conclusion, and will in due course make absolutely sure.

The main steps

Step 1. If a, b and c are all within [math]\displaystyle{ C\sqrt n }[/math] of n/3 and a+b+c=n, and if r, s and t are three integers that add up to 0 and are all at most [math]\displaystyle{ m=o(\sqrt{n}) }[/math] in modulus, then the size of the slice [math]\displaystyle{ \Gamma_{a,b,c} }[/math] is 1+o(1) times the size of the slice [math]\displaystyle{ \Gamma_{a+r,b+s,c+t}. }[/math]

Step 2. Let [math]\displaystyle{ \mu }[/math] be some probability distribution on combinatorial subspaces S of [math]\displaystyle{ [3]^n }[/math] and for each S let [math]\displaystyle{ \sigma_S }[/math] be a probability distribution on S. (We shall abbreviate [math]\displaystyle{ \sigma_S }[/math] to [math]\displaystyle{ \sigma }[/math] if S is clear from the context.) Let [math]\displaystyle{ \nu }[/math] be the distribution on [math]\displaystyle{ [3]^n }[/math] that results if you choose a subspace S at random according to [math]\displaystyle{ \mu }[/math] and then a random point x of S according to [math]\displaystyle{ \sigma }[/math]. Suppose that the distribution [math]\displaystyle{ \nu }[/math] is approximately uniform and the distributions [math]\displaystyle{ \sigma_S }[/math] are reasonably nice. Then we may assume that for [math]\displaystyle{ \mu }[/math]-almost all subspaces [math]\displaystyle{ S\subset[3]^n }[/math] the [math]\displaystyle{ \sigma }[/math]-density of [math]\displaystyle{ \mathcal{A}\cap S }[/math] is at least [math]\displaystyle{ (\delta-\eta). }[/math]

Step 3. By Steps 1 and 2 and an averaging argument, we find [math]\displaystyle{ (U,V,W) }[/math] and [math]\displaystyle{ Z\subset U\cup V }[/math] of size [math]\displaystyle{ o(\sqrt{n}) }[/math] (but not much smaller than [math]\displaystyle{ \sqrt{n} }[/math]) with two properties. First, out of all pairs [math]\displaystyle{ (U',V')\in[2]^Z, }[/math] the equal-slices proportion such that [math]\displaystyle{ (U,V,W)++(U',V',\emptyset) }[/math] belongs to [math]\displaystyle{ \mathcal{A} }[/math] is at least [math]\displaystyle{ \delta/3. }[/math] Secondly, out of all triples [math]\displaystyle{ (U',V',W')\in[3]^Z, }[/math] the equal-slices proportion such that [math]\displaystyle{ (U,V,W)++(U',V',W') }[/math] belongs to [math]\displaystyle{ \mathcal{A} }[/math] is at least [math]\displaystyle{ \delta-\eta. }[/math]

Step 4. Fixing such (U,V,W) and Z, let us write (U',V',W') instead of (U,V,W)++(U',V',W'). Then if [math]\displaystyle{ U_1\subset U_2 }[/math] and [math]\displaystyle{ (U_1,Z\setminus U_1,\emptyset) }[/math] and [math]\displaystyle{ (U_2,Z\setminus U_2,\emptyset) }[/math] both belong to [math]\displaystyle{ \mathcal{A}, }[/math] then, writing [math]\displaystyle{ V_i }[/math] for [math]\displaystyle{ Z\setminus U_i, }[/math] we have that [math]\displaystyle{ (U_1,V_2,Z\setminus(U_1\cup V_2)) }[/math] does not belong to [math]\displaystyle{ \mathcal{A}. }[/math]

Step 5. Let [math]\displaystyle{ \mathcal{U} }[/math] be the set of all U such that [math]\displaystyle{ (U,Z\setminus U,\emptyset) }[/math] belongs to [math]\displaystyle{ \mathcal{A}, }[/math] and let [math]\displaystyle{ \mathcal{V}=\{Z\setminus U:U\in\mathcal{U}\}. }[/math] Then the set of all pairs [math]\displaystyle{ (U_1,V_2) }[/math] such that [math]\displaystyle{ U_1\in\mathcal{U} }[/math] and [math]\displaystyle{ V_2\in\mathcal{V} }[/math] is equal-slices dense (this follows from the proof of Sperner's theorem). It follows that [math]\displaystyle{ \mathcal{A} }[/math] is disjoint from an equal-slices-dense set of complexity 1.

Step 6. We can partition the set of all disjoint pairs [math]\displaystyle{ (U_1,V_2) }[/math] according to which of the sets [math]\displaystyle{ \mathcal{U}\times\mathcal{V}, }[/math] [math]\displaystyle{ \mathcal{U}\times\mathcal{V}^c, }[/math] [math]\displaystyle{ \mathcal{U}^c\times\mathcal{V} }[/math] or [math]\displaystyle{ \mathcal{U}^c\times\mathcal{V}^c }[/math] they belong to. There must be at least one of the three sets other than [math]\displaystyle{ \mathcal{U}\times\mathcal{V} }[/math] in which [math]\displaystyle{ \mathcal{A} }[/math] has a density increment. Thus, we have a local equal-slices density increment on a set of complexity 1.

Further details

Step 1

This one is easy. First let us prove the comparable result in [math]\displaystyle{ [2]^n. }[/math] That is, let us prove that if a is within [math]\displaystyle{ O(\sqrt{n}) }[/math] of n/2 and [math]\displaystyle{ r=o(\sqrt{n}), }[/math] then [math]\displaystyle{ \binom na=(1+o(1))\binom n{a+r}. }[/math] This is because the ratio of [math]\displaystyle{ \binom nk }[/math] to [math]\displaystyle{ \binom n{k+1} }[/math] is (k+1)/(n-k), so if [math]\displaystyle{ k=n/2+O(\sqrt{n}), }[/math] then the ratio is [math]\displaystyle{ 1+O(n^{-1/2}). }[/math] If we now multiply [math]\displaystyle{ r=o(\sqrt{n}) }[/math] such ratios together we get [math]\displaystyle{ 1+o(1). }[/math]

To get from there to a comparable statement about the sizes of slices in [math]\displaystyle{ [3]^n, }[/math] note that we can get from [math]\displaystyle{ (a,b,c) }[/math] to [math]\displaystyle{ (a+r,b+s,c+t) }[/math] by two operations where we add [math]\displaystyle{ o(\sqrt n) }[/math] to one coordinate and subtract [math]\displaystyle{ o(\sqrt{n}) }[/math] from another. Each time we do so, we multiply by [math]\displaystyle{ 1+o(1), }[/math] by the result for [math]\displaystyle{ [2]^n }[/math] (but applied to [math]\displaystyle{ [2]^p }[/math] with p close to 2n/3).
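As a sanity check on Step 1, one can compare slice sizes numerically. The following sketch computes logarithms of multinomial coefficients with `math.lgamma` so that large n is manageable; the particular values of n, a, b, c, r, s, t are arbitrary choices of mine, picked so that a, b, c are within about [math]\displaystyle{ \sqrt n }[/math] of n/3 and the shifts are much smaller than [math]\displaystyle{ \sqrt n. }[/math]

```python
from math import exp, lgamma


def log_slice_size(a, b, c):
    """Natural logarithm of |Gamma_{a,b,c}| = (a+b+c)! / (a! b! c!)."""
    return lgamma(a + b + c + 1) - lgamma(a + 1) - lgamma(b + 1) - lgamma(c + 1)


def slice_ratio(a, b, c, r, s, t):
    """Ratio |Gamma_{a,b,c}| / |Gamma_{a+r,b+s,c+t}| for shifts with r + s + t = 0."""
    if r + s + t != 0:
        raise ValueError("the shifts must add up to 0")
    return exp(log_slice_size(a, b, c) - log_slice_size(a + r, b + s, c + t))


# Illustrative parameters (assumptions, not taken from the text):
# n is about 10^6, so sqrt(n) is about 1000; offsets from n/3 are a few hundred,
# and the shifts r, s, t are single digits.
n = 999_999
third = n // 3
ratio = slice_ratio(third + 500, third - 200, third - 300, 5, -3, -2)
```

For these parameters the ratio comes out close to 1, as the step predicts; making the shifts comparable to [math]\displaystyle{ \sqrt n }[/math] instead would let it drift away from 1.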

Step 2

First let us make the statement more precise. Let us say that a probability distribution [math]\displaystyle{ \nu }[/math] on a finite set X is [math]\displaystyle{ \epsilon }[/math]-uniform if [math]\displaystyle{ \nu(A) }[/math] never differs from [math]\displaystyle{ |A|/|X| }[/math] by more than [math]\displaystyle{ \epsilon. }[/math] (A probabilist would say that the total variation distance between [math]\displaystyle{ \nu }[/math] and the uniform distribution is at most [math]\displaystyle{ \epsilon. }[/math]) Then the precise claim is the following. Let [math]\displaystyle{ \epsilon,\eta\gt 0. }[/math] Suppose that [math]\displaystyle{ \mu }[/math] is a probability distribution on some collection [math]\displaystyle{ \Sigma }[/math] of combinatorial subspaces S of [math]\displaystyle{ [3]^n. }[/math] Now choose a point x randomly by first choosing a subspace S [math]\displaystyle{ \mu }[/math]-randomly from [math]\displaystyle{ \Sigma }[/math] and then choosing [math]\displaystyle{ x }[/math] [math]\displaystyle{ \sigma_S }[/math]-randomly from S. Suppose that the resulting distribution [math]\displaystyle{ \nu }[/math] is [math]\displaystyle{ \epsilon }[/math]-uniform. Then either we can find a combinatorial subspace [math]\displaystyle{ S\in\Sigma }[/math] such that [math]\displaystyle{ \sigma_S(\mathcal{A}\cap S)\geq\delta+\epsilon }[/math] or, when you choose S randomly according to the distribution [math]\displaystyle{ \mu, }[/math] the probability that [math]\displaystyle{ \sigma_S(\mathcal{A}\cap S)\leq\delta-\eta }[/math] is at most [math]\displaystyle{ 2\epsilon/\eta. }[/math]
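For small examples the definition of [math]\displaystyle{ \epsilon }[/math]-uniformity can be checked directly. A minimal sketch, assuming the distribution is given as a dictionary that lists every point of X (with probability 0 where necessary); the function names are hypothetical. It uses the standard fact that the largest value of [math]\displaystyle{ \nu(A)-|A|/|X| }[/math] is attained by the set of points whose probability exceeds 1/|X|.

```python
from fractions import Fraction


def distance_from_uniform(nu):
    """Return max over A of |nu(A) - |A|/|X||, for nu given as {point: probability}.

    Every point of X must appear as a key. The maximum is attained by
    A = {x : nu(x) > 1/|X|}, so it suffices to add up the excesses over 1/|X|.
    """
    uniform = Fraction(1, len(nu))
    return sum(max(p - uniform, 0) for p in nu.values())


def is_eps_uniform(nu, eps):
    """True if nu(A) never differs from |A|/|X| by more than eps."""
    return distance_from_uniform(nu) <= eps


# A three-point example: one point is over-weighted by 1/2 - 1/3 = 1/6.
nu = {"x": Fraction(1, 2), "y": Fraction(1, 4), "z": Fraction(1, 4)}
gap = distance_from_uniform(nu)
```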

Proof. Let us first work out a lower bound for the expectation of [math]\displaystyle{ \delta(S):=\sigma_S(\mathcal{A}\cap S). }[/math] This expectation is [math]\displaystyle{ \sum_{S\in\Sigma}\mu(S)\delta(S), }[/math] which is precisely the probability that you obtain a point in [math]\displaystyle{ \mathcal{A} }[/math] if you first pick a [math]\displaystyle{ \mu }[/math]-random S and then pick a [math]\displaystyle{ \sigma_S }[/math]-random point in S. In other words, it is [math]\displaystyle{ \nu(\mathcal{A}), }[/math] which by hypothesis is within [math]\displaystyle{ \epsilon }[/math] of [math]\displaystyle{ \delta, }[/math] and is therefore at least [math]\displaystyle{ \delta-\epsilon. }[/math] If the probability that [math]\displaystyle{ \delta(S)\lt \delta-\eta }[/math] is p and [math]\displaystyle{ \delta(S) }[/math] is bounded above by [math]\displaystyle{ \delta+\epsilon, }[/math] then the expectation of [math]\displaystyle{ \delta(S) }[/math] is at most [math]\displaystyle{ p(\delta-\eta)+(1-p)(\delta+\epsilon), }[/math] which equals [math]\displaystyle{ \delta+\epsilon-p(\eta+\epsilon). }[/math] If [math]\displaystyle{ p\gt 2\epsilon/\eta, }[/math] then this is less than [math]\displaystyle{ \delta+\epsilon-2\epsilon, }[/math] which is a contradiction. [math]\displaystyle{ \Box }[/math]
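The arithmetic in this proof can be checked with exact fractions. The sketch below takes the upper bound [math]\displaystyle{ p(\delta-\eta)+(1-p)(\delta+\epsilon) }[/math] from the proof and solves for the largest p compatible with an expectation of at least [math]\displaystyle{ \delta-\epsilon }[/math]; the function names and the sample values of [math]\displaystyle{ \delta,\epsilon,\eta }[/math] are my own illustrative choices.

```python
from fractions import Fraction


def expectation_upper_bound(delta, eps, eta, p):
    """Upper bound on E[delta(S)] when delta(S) < delta - eta with probability p
    and delta(S) <= delta + eps always."""
    return p * (delta - eta) + (1 - p) * (delta + eps)


def largest_admissible_p(eps, eta):
    """Largest p for which the upper bound is still at least delta - eps.

    Solving delta + eps - p * (eta + eps) >= delta - eps gives p <= 2*eps/(eta + eps),
    which is slightly smaller than the threshold 2*eps/eta used in the statement.
    """
    return 2 * eps / (eta + eps)


# Sample values (assumptions for illustration only).
delta, eps, eta = Fraction(1, 3), Fraction(1, 100), Fraction(1, 10)
p_star = largest_admissible_p(eps, eta)
p_claimed = 2 * eps / eta
```

At `p_star` the upper bound equals [math]\displaystyle{ \delta-\epsilon }[/math] exactly, and at the threshold [math]\displaystyle{ 2\epsilon/\eta }[/math] it is already strictly smaller, which is the contradiction the proof relies on.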

In the informal statement of Step 2 above, we said "we may assume" that almost all densities are at least [math]\displaystyle{ \delta-\eta. }[/math] The reason is that the above argument shows that the only thing that could go wrong is if there exists a subspace [math]\displaystyle{ S\in\Sigma }[/math] such that [math]\displaystyle{ \delta(S)=\sigma_S(\mathcal{A}\cap S)\geq\delta+\epsilon. }[/math] But we shall choose the measures [math]\displaystyle{ \sigma_S }[/math] in such a way that if this happens then we can pass to a further subspace inside which the uniform density is at least [math]\displaystyle{ \delta+\epsilon/2. }[/math] And if we can do that, then we have our desired density increment.

Step 3

Now let us pick a random point [math]\displaystyle{ (U,V,W) }[/math] and a random set [math]\displaystyle{ Z\subset[n] }[/math] of size [math]\displaystyle{ m=o(\sqrt{n}). }[/math] We claim first that the distribution of an equal-slices-random point in the combinatorial subspace [math]\displaystyle{ S=(U,V,W)++[3]^Z }[/math] is approximately uniform, and also that the distribution of an equal-layers random point in the set [math]\displaystyle{ T=(U,V,W)++[2]^Z }[/math] is approximately uniform. (For the sake of clarity, I'll say "equal-layers" for [math]\displaystyle{ [2]^n }[/math] and "equal-slices" for [math]\displaystyle{ [3]^n. }[/math]) Just in case there is any doubt, the equal-slices measures on the subspaces are not the restrictions of equal-slices measure on [math]\displaystyle{ [3]^n }[/math] to those subspaces: rather, they are what you get when you think of the subspaces as copies of [math]\displaystyle{ [3]^m. }[/math]
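To fix ideas about equal-slices measure on a copy of [math]\displaystyle{ [3]^m, }[/math] here is a small sketch that computes the weight of a single point: every slice [math]\displaystyle{ \Gamma_{a,b,c} }[/math] gets the same total mass, shared equally among its points. Points are written as strings over the symbols 1, 2, 3, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def equal_slices_weight(x):
    """Equal-slices probability of the point x, thought of as an element of [3]^m."""
    m = len(x)
    a, b, c = x.count("1"), x.count("2"), x.count("3")
    num_slices = comb(m + 2, 2)  # number of triples (a, b, c) with a + b + c = m
    slice_size = factorial(m) // (factorial(a) * factorial(b) * factorial(c))
    return Fraction(1, num_slices * slice_size)


def equal_slices_density(points):
    """Equal-slices density of a set of distinct points of [3]^m."""
    return sum(equal_slices_weight(x) for x in points)


# Total mass over [3]^3 should be 1.
cube = ["".join(p) for p in product("123", repeat=3)]
total = equal_slices_density(cube)
```

In [math]\displaystyle{ [3]^3 }[/math] the constant point 111 has equal-slices weight 1/10, whereas its uniform weight is 1/27, which shows how differently the two measures treat outlying slices.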

To prove this assertion (which is essentially already proved in the discussion of the equivalence of DHJ(3) for the two measures), let us first fix three non-negative integers [math]\displaystyle{ a,b,c }[/math] that add up to m, and then examine the distribution of the point [math]\displaystyle{ x }[/math] chosen by first picking a random [math]\displaystyle{ (U,V,W), }[/math] then picking a random triple [math]\displaystyle{ (U',V',W') }[/math] belonging to the slice [math]\displaystyle{ \Gamma_{a,b,c} }[/math] of [math]\displaystyle{ [3]^Z, }[/math] and finally taking the point [math]\displaystyle{ x=(U,V,W)++(U',V',W'). }[/math] This is equivalent to choosing [math]\displaystyle{ (U',V',W') }[/math] first and then filling up the rest of the sequence randomly. Since [math]\displaystyle{ U', V' }[/math] and [math]\displaystyle{ W' }[/math] are random sets of size a, b and c, the effect of this is to change very slightly the density associated with each slice. More precisely, the densities of near-central slices are hardly affected, while the densities of outlying slices are irrelevant because their total measure is tiny.

Once we've done that for a single triple (a,b,c) we can average over all of them (with appropriate weights) and get the result. For now, I will not give this argument in any more detail.

A similar argument (in fact, almost exactly the same argument) proves that if you choose an equal-layers random point in [math]\displaystyle{ (U,V,W)++[2]^Z, }[/math] then it too will have a distribution that is [math]\displaystyle{ \epsilon }[/math]-uniform.

Now let us find the particular [math]\displaystyle{ (U,V,W) }[/math] and Z that we are looking for. Because the distribution of an equal-slices random point in [math]\displaystyle{ S=(U,V,W)++[3]^Z }[/math] is [math]\displaystyle{ \epsilon }[/math]-uniform, the hypotheses of Step 2 are satisfied for the uniform measure on the subspaces S of this form and the equal-slices measure inside. Therefore, we are free to assume that the proportion of such subspaces S inside which the equal-slices density is less than [math]\displaystyle{ \delta-\eta }[/math] is at most [math]\displaystyle{ 2\epsilon/\eta. }[/math] But we also know that if we choose a random point from a random set of the form [math]\displaystyle{ (U,V,W)++[2]^Z, }[/math] then it is [math]\displaystyle{ \epsilon }[/math]-uniform, so its probability of being in [math]\displaystyle{ \mathcal{A} }[/math] is at least [math]\displaystyle{ \delta-\epsilon. }[/math] It follows that with probability at least [math]\displaystyle{ \delta/3 }[/math] the density of [math]\displaystyle{ \mathcal{A} }[/math] inside [math]\displaystyle{ (U,V,W)++[2]^Z }[/math] is at least [math]\displaystyle{ \delta/3. }[/math] So provided we choose [math]\displaystyle{ \epsilon }[/math] and [math]\displaystyle{ \eta }[/math] so that [math]\displaystyle{ 2\epsilon/\eta }[/math] is less than [math]\displaystyle{ \delta/3, }[/math] we can find [math]\displaystyle{ (U,V,W) }[/math] and [math]\displaystyle{ Z }[/math] such that both statements hold. This proves Step 3.

Step 4

In one way this is trivial, and in another it is the observation that drives the whole argument (and has been mentioned in different guises and by various people several times on the blog threads). If [math]\displaystyle{ U_1\subset U_2 }[/math] and [math]\displaystyle{ (U_1,Z\setminus U_1,\emptyset) }[/math] and [math]\displaystyle{ (U_2,Z\setminus U_2,\emptyset) }[/math] both belong to [math]\displaystyle{ \mathcal{A}, }[/math] then, writing [math]\displaystyle{ V_i }[/math] for [math]\displaystyle{ Z\setminus U_i, }[/math] the claim is that [math]\displaystyle{ (U_1,V_2,Z\setminus(U_1\cup V_2)) }[/math] does not belong to [math]\displaystyle{ \mathcal{A}. }[/math] But that is because the points [math]\displaystyle{ (U_1,V_1,\emptyset), (U_2,V_2,\emptyset) }[/math] and [math]\displaystyle{ (U_1,V_2,Z\setminus(U_1\cup V_2)) }[/math] form a combinatorial line, the first two points of which belong to [math]\displaystyle{ \mathcal{A}. }[/math]
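The combinatorial line in Step 4 can be verified mechanically on a small example. In this sketch a point (U',V',W') of [math]\displaystyle{ [3]^Z }[/math] is written as a string with 1 on U', 2 on V' and 3 elsewhere; the helper names and the choice of Z, [math]\displaystyle{ U_1 }[/math] and [math]\displaystyle{ U_2 }[/math] are mine. Note that [math]\displaystyle{ U_1 }[/math] must be a proper subset of [math]\displaystyle{ U_2 }[/math] for the wildcard set [math]\displaystyle{ U_2\setminus U_1 }[/math] to be non-empty.

```python
def point(U, V, Z):
    """String over the sorted elements of Z: 1 on U, 2 on V, 3 on the rest of Z."""
    return "".join("1" if i in U else "2" if i in V else "3" for i in sorted(Z))


def is_combinatorial_line(p, q, r):
    """True if p, q, r agree off a non-empty wildcard set, each is constant on it,
    and the three constants are 1, 2, 3 in some order."""
    wild = [i for i in range(len(p)) if not (p[i] == q[i] == r[i])]
    if not wild:
        return False
    for x in (p, q, r):
        if len({x[i] for i in wild}) != 1:
            return False
    return {p[wild[0]], q[wild[0]], r[wild[0]]} == {"1", "2", "3"}


Z = set(range(5))
U1, U2 = {0}, {0, 1, 2}
V1, V2 = Z - U1, Z - U2

first = point(U1, V1, Z)   # (U_1, V_1, empty)
second = point(U2, V2, Z)  # (U_2, V_2, empty)
third = point(U1, V2, Z)   # (U_1, V_2, Z \ (U_1 u V_2))
```

Here the wildcard set is [math]\displaystyle{ U_2\setminus U_1=\{1,2\}, }[/math] on which the three points take the values 2, 1 and 3 respectively.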

Step 5

The set of all disjoint pairs [math]\displaystyle{ (U_1,V_2) }[/math] such that [math]\displaystyle{ U_1\in\mathcal{U} }[/math] and [math]\displaystyle{ V_2\in\mathcal{V} }[/math] is in one-to-one correspondence with the set of all pairs [math]\displaystyle{ U_1\subset U_2 }[/math] such that [math]\displaystyle{ U_1,U_2\in\mathcal{U}. }[/math] From Step 3 we know that the equal-layers density of [math]\displaystyle{ \mathcal{U} }[/math] is at least [math]\displaystyle{ \delta/3. }[/math] Therefore, if we choose a random permutation [math]\displaystyle{ \pi }[/math] of [math]\displaystyle{ Z, }[/math] the expected density of initial segments that lie in [math]\displaystyle{ \mathcal{U} }[/math] is at least [math]\displaystyle{ \delta/3. }[/math] It follows from Cauchy-Schwarz that the expected density of pairs of initial segments is at least [math]\displaystyle{ \delta^2/9. }[/math] Therefore, the set of all disjoint pairs [math]\displaystyle{ (U_1,V_2) }[/math] that belong to [math]\displaystyle{ \mathcal{U}\times\mathcal{V} }[/math] has density at least [math]\displaystyle{ \delta^2/9 }[/math] in the set of all disjoint pairs (where the density of pairs is given by first choosing their cardinalities randomly and then choosing the sets given the cardinalities).

It remains to deduce from this that the collection of points [math]\displaystyle{ (U,V,W) }[/math] such that [math]\displaystyle{ U\in\mathcal{U} }[/math] and [math]\displaystyle{ V\in\mathcal{V} }[/math] is equal-slices dense. Hang on, I've just shown precisely the statement that the equal-slices density of this set is at least [math]\displaystyle{ \delta^2/9. }[/math]
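The permutation argument can be checked exhaustively for a small ground set. The sketch below identifies the ground set with {0,...,m-1}, computes the equal-layers density of a family of subsets, and compares it with the average over all orderings of the fraction f of initial segments lying in the family, together with the average of [math]\displaystyle{ f^2 }[/math] (the density of pairs of initial segments, equal pairs included). The family used is an arbitrary example of mine.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def equal_layers_density(family, m):
    """Average over k = 0..m of the proportion of k-subsets of range(m) lying in family."""
    total = Fraction(0)
    for U in family:
        total += Fraction(1, comb(m, len(U)))
    return total / (m + 1)


def chain_statistics(family, m):
    """Return (E[f], E[f^2]) over all orderings of range(m), where f is the fraction
    of the m + 1 initial segments of the ordering that belong to family."""
    first_moment = Fraction(0)
    second_moment = Fraction(0)
    count = 0
    for order in permutations(range(m)):
        hits = sum(frozenset(order[:k]) in family for k in range(m + 1))
        f = Fraction(hits, m + 1)
        first_moment += f
        second_moment += f * f
        count += 1
    return first_moment / count, second_moment / count


m = 4
family = {frozenset(s) for s in [(), (0,), (0, 1), (2, 3), (0, 1, 2, 3)]}
density = equal_layers_density(family, m)
mean_f, mean_f_squared = chain_statistics(family, m)
```

The first moment agrees exactly with the equal-layers density, because the initial segment of length k of a random ordering is a uniformly random k-subset, and the second moment is at least the square of the first, which is the Cauchy-Schwarz step.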

Step 6

This one is very simple. We have partitioned [math]\displaystyle{ [3]^Z }[/math] into four special sets of complexity 1. [math]\displaystyle{ \mathcal{A} }[/math] is disjoint from one of those sets, which has density at least [math]\displaystyle{ \delta^2/9. }[/math] Therefore, of at least one of the other three we must be able to say that its density is [math]\displaystyle{ \alpha }[/math] but it contains at least [math]\displaystyle{ (\alpha\delta+\delta^3/27)3^m }[/math] points of [math]\displaystyle{ \mathcal{A} }[/math]: if all three fell short of this, then the density of [math]\displaystyle{ \mathcal{A} }[/math] would be less than [math]\displaystyle{ \delta(1-\delta^2/9)+3\delta^3/27=\delta. }[/math] This gives us a density increment of at least [math]\displaystyle{ \delta^3/27 }[/math] on some special set of complexity 1, which itself must have density at least [math]\displaystyle{ \delta^3/27 }[/math] in [math]\displaystyle{ [3]^Z }[/math].
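The pigeonhole step can be phrased generally: if the set that [math]\displaystyle{ \mathcal{A} }[/math] avoids has density [math]\displaystyle{ \beta, }[/math] then the other three sets have total density [math]\displaystyle{ 1-\beta }[/math] but contain all of [math]\displaystyle{ \mathcal{A}, }[/math] so the excesses over the expected [math]\displaystyle{ \alpha_i\delta }[/math] add up to [math]\displaystyle{ \delta\beta }[/math] and at least one of them is at least [math]\displaystyle{ \delta\beta/3. }[/math] A small sketch with exact fractions, using hypothetical function names and made-up densities, and working with uniform density for simplicity:

```python
from fractions import Fraction


def guaranteed_excess(delta, beta):
    """Excess forced by pigeonhole when A has density delta and avoids a set of density beta."""
    return delta * beta / 3


def best_excess(delta, cells):
    """Largest value of a_i - alpha_i * delta over the three remaining sets.

    cells is a list of pairs (alpha_i, a_i): alpha_i is the density of the i-th set,
    and a_i is the density (in the whole cube) of the part of A inside it.
    """
    return max(a - alpha * delta for alpha, a in cells)


# Made-up example: delta = 1/2, avoided set of density 1/4, three other sets of density 1/4.
delta, beta = Fraction(1, 2), Fraction(1, 4)
balanced = [(Fraction(1, 4), Fraction(1, 6))] * 3  # A spread evenly: the extremal case
lopsided = [(Fraction(1, 4), Fraction(1, 4)),
            (Fraction(1, 4), Fraction(1, 8)),
            (Fraction(1, 4), Fraction(1, 8))]
```

When [math]\displaystyle{ \mathcal{A} }[/math] is spread evenly over the three remaining sets the bound is attained exactly; any less balanced arrangement gives a larger excess on some set.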

Remarks

This argument is intended to form part of a density-increment strategy for proving DHJ(3). It is closely analogous to, though not quite the same as, a statement that plays an important role in Shkredov's proof of the corners theorem. It reduces the problem to understanding special sets of complexity 1, which should in principle be much easier than the original problem, as it is amenable to the kinds of techniques that can be used to prove Sperner's theorem. This reduced problem will shortly be considered in a separate page.

The above write-up is clearly not of the precision that would be demanded in a journal article, and it has not been thoroughly checked. But it feels natural enough to be robust, in the sense that any mistakes ought to be technical rather than fundamental.
