Line free sets correlate locally with dense sets of complexity k-2

From Polymath1Wiki
Jump to: navigation, search


The aim of this page is to generalize the proof that line-free subsets of [math][3]^n[/math] correlate locally with sets of complexity 1 to an analogous statement for line-free subsets of [math][k]^n.[/math]

Until this sentence is removed, there is no guarantee that this aim will be achieved, or even that a plausible sketch will result.

Notation and definitions

The actual result to be proved is more precise than the result given in the title of the page. To explain it we need a certain amount of terminology. Given [math]j\leq k[/math] and a sequence [math]x\in[k]^n,[/math] let [math]U_j(x)[/math] denote the set [math]\{i\in[n]:x_i=j\}.[/math] We call this the j-set of x. More generally, if [math]E\subset[k],[/math] let [math]U_E(x)[/math] denote the sequence [math](U_j(x):j\in E).[/math] We call this the E-sequence of x. For example, if n=10, k=4, [math]x=3442411123[/math] and [math]E=\{2,3\}[/math] then the E-sequence of x is [math](\{4,9\},\{1,10\}).[/math] It can sometimes be nicer to avoid set-theoretic brackets and instead to say things like that the 24-sequence of x is [math](49,235).[/math]

An E-set is a set [math]\mathcal{A}\subset[k]^n[/math] such that whether or not x belongs to [math]\mathcal{A}[/math] depends only on the E-sequence of x. In other words, to define an E-set one chooses a collection [math]\mathcal{U}_E[/math] of sequences of the form [math](U_i:i\in E),[/math] where the [math]U_i[/math] are disjoint subsets of [n], and one takes the set of all x such that [math](U_i(x):i\in E)\in\mathcal{U}_E.[/math] More generally, an [math](E_1,\dots,E_r)[/math]-set is an intersection of [math]E_h[/math]-sets as h runs from 1 to r.

Of particular interest to us will be the sequence [math](E_1,\dots,E_{k-1}),[/math] where [math]E_j=[k-1]\setminus j.[/math] What is an [math](E_1,\dots,E_{k-1})[/math]set? It is a set where membership depends on k-1 conditions, and the jth condition on a sequence x depends just on the sets [math]U_i(x)[/math] with [math]i\leq k-1[/math] and [math]i\ne j.[/math]

Since it provides a useful illustrative example, let us describe the class of sets that will arise in the proof. Suppose we have a set [math]\mathcal{B}\subset[k-1]^n.[/math] We can use [math]\mathcal{B}[/math] to define a set [math]\mathcal{C}\subset[k]^n[/math] as follows. A sequence x belongs to [math]\mathcal{C}[/math] if and only if for each j < k the following holds: if change all the k's of x into j's, the resulting sequence is in [math]\mathcal{B}.[/math]

To see that this is an [math](E_1,\dots,E_{k-1})[/math]-set, it is enough to show that the set [math]\mathcal{C}_j[/math] of all x such that changing ks to js puts you in [math]\mathcal{B}[/math] is an [math]E_j[/math]-set. And indeed, whether or not x belongs to that set does depend only on the i-sets [math]U_i(x)[/math] with [math]i\leq k-1[/math] and [math]i\ne j,[/math] since once you know those you have determined the sequence obtained from x when you rewrite the ks as js.


Proof of the main result, with a gap still left to fill

To write this section, I lifted a section from the equivalent proof for DHJ(3) and made such changes as were necessary. One result did not go through straightforwardly and now needs to be thought about. (It is pretty clear that it can be done -- it will just need a bit of work.)

Let us assume for now that the equal-slices density of [math]\mathcal{A}[/math] in [math][k]^n[/math] is at least [math]\delta[/math], and that the equal-slices density of [math]\mathcal{A} \cap [k-1]^n[/math] in [math][k-1]^n[/math] is at least [math]\theta[/math]. As discussed in the sections below, we can reduce to this case by passing to subspaces. Let us write [math]\mathcal{B}[/math] for [math]\mathcal{A}\cap[k-1]^n,[/math] and let [math]\mathcal{C}[/math] be the set defined in terms of [math]\mathcal{B}[/math] in the previous section.

We shall now deduce from the fact that [math]\mathcal{A}[/math] contains no combinatorial line that it is disjoint from [math]\mathcal{C}.[/math] Indeed, this is pretty well trivial. Given any sequence x, let [math]\rho_j(x)[/math] be the sequence obtained by changing all the ks in x to js. Then the set [math]\{\rho_1(x),\dots,\rho_{k-1}(x),x\}[/math] is a combinatorial line (with wildcard set [math]U_k(x)[/math]). Therefore, not all its points belong to [math]\mathcal{A}[/math] which implies that either [math]x\notin\mathcal{A}[/math] or [math]x\notin\mathcal{C}.[/math]

We will now show that [math]\mathcal{C}[/math] is dense in equal-slices measure on [math][k]^n[/math].

Hmm ... this is less obvious. Hopefully one can use the understanding of the equal-slices distribution for DHJ(k).

For now let me just say what happens if this is true. If [math]\mathcal{C}=\mathcal{C}_1\cap\dots\cap\mathcal{C}_{k-1}[/math] has density at least [math]\eta,[/math] then consider all the [math]2^{k-1}-1[/math] sets of the form [math]\mathcal{D}_1\cap\dots\cap\mathcal{D}_{k-1},[/math] where each [math]\mathcal{D}_i[/math] is either [math]\mathcal{C}_i[/math] or its complement, and not every [math]\mathcal{D}_i[/math] equals [math]\mathcal{C}_i.[/math] Then there must be some i such that the measure [math]\mu(\mathcal{A}\cap\mathcal{D}_i)[/math] is at least [math]\delta\mu(\mathcal{D}_i)+\eta/(2^{k-1}-1).[/math] This gives us a density increase of at least [math]\eta/2^{k-1}[/math] on a set of density at least [math]\eta/2^{k-1}.[/math]

We should be able to move from this relative density-increment under equal-slices to a nearly as good one under uniform by appropriately generalizing the results in the passing between measures section ("relative density version").

DHJ(k) implies equal-slices Varnavides-DHJ(k)

This section is intended to fill the main gap in the section above. We would like to deduce from DHJ(k) that if [math]\mathcal{B}[/math] is an equal-slices dense subset of [math][k]^n,[/math] then with positive probability an equal-slices random combinatorial line is contained in [math]\mathcal{A}.[/math]

To prove this it is enough to do the following. Let the density of [math]\mathcal{B}[/math] be [math]\theta,[/math] let [math]\zeta[/math] be some positive constant that depends on [math]\theta[/math] only, and let M be large enough that every subset of [math][k]^M[/math] of density at least [math]\zeta[/math] contains a combinatorial line. We would now like to find a positive linear combination [math]\nu[/math] of characteristic functions of M-dimensional subspaces of [math][k]^n[/math] with the following properties.

  • [math]\nu[/math] is a probability measure: that is, it sums to 1.
  • If [math]\mathcal{B}[/math] is any subset of [math][k]^n[/math] with equal-slices density [math]\theta\gt0,[/math] then [math]\nu(\mathcal{B})\geq\eta=\eta(\theta)\gt0.[/math]
  • Let [math]\mathcal{L}[/math] be a set of combinatorial lines. Define the [math]\nu[/math]-density of [math]\mathcal{L}[/math] to be the probability that a random line L belongs to [math]\mathcal{L}[/math] if you choose it by first picking a [math]\nu[/math]-random subspace and then picking a random line (uniformly) from that subspace. (See below for the definition of [math]\nu[/math]-random subspace.) Then if the [math]\nu[/math]-density of [math]\mathcal{L}[/math] is at least [math]c,[/math] then the equal-slices density of [math]\mathcal{L}[/math] is at least [math]c'(c).[/math]

Let us see why these three properties are enough. Let the M-dimensional subspaces of [math][k]^n[/math] be enumerated as [math]S_1,\dots,S_N,[/math] and let [math]\sigma_i[/math] be the uniform probability measure on [math]S_i.[/math] Then we are looking for a convex combination [math]\nu=\sum_{i=1}^N\lambda_i\sigma_i[/math] of the measures [math]\sigma_i.[/math] Given such a [math]\nu,[/math] we define a [math]\nu[/math]-random subspace to be what you get when you choose [math]S_i[/math] with probability [math]\lambda_i.[/math]

Suppose we have found [math]\nu[/math] with the three properties above, and let [math]\mathcal{B}[/math] be a set of equal-slices density [math]\theta.[/math] Then by hypothesis we have that [math]\nu(\mathcal{B})\geq\eta.[/math] This implies that if [math]S_i[/math] is a [math]\nu[/math]-random subspace, then the probability that [math]\sigma_i(\mathcal{B})\geq\eta/2[/math] is at least [math]\eta/2.[/math] If we have taken [math]\zeta[/math] to be [math]\eta/2,[/math] then each subspaces [math]S_i[/math] for which [math]\sigma_i(\mathcal{B})\geq\eta/2[/math] contains a combinatorial line in [math]\mathcal{B}.[/math] Since an M-dimensional subspace contains [math](k+1)^M[/math] combinatorial lines, this implies that the [math]\nu[/math]-density of the set of all combinatorial lines in [math]\mathcal{B}[/math] is at least [math]\eta/2(k+1)^M.[/math] Hence, by the third property, the equal-slices density of this set of lines is at least [math]c(\theta,k,M)\gt0.[/math]

It remains to define an appropriate measure [math]\nu.[/math] I think probably a fairly obvious equal-slices measure on M-dimensional subspaces will do the trick, but will have to wait to check this.

Here is a preliminary attempt to do the check. This will be tidied up at some point if it works. Let us define a random M-dimensional subspace S by randomly choosing numbers [math]a_1+\dots+a_M=n[/math] and then randomly partitioning [math][n][/math] into wildcard sets [math]Z_1,\dots,Z_M[/math] with [math]|Z_i|=a_i.[/math] Then [math]\nu[/math] is the measure you get by choosing a random subspace in this way and then choosing a random point in that subspace.

Now [math]\nu[/math] is nothing like the equal-slices density, since a typical point will have roughly equal numbers of each coordinate value. However, the probability that we deviate from this average is small as a function of M rather than as a function of n, so the second property looks fine.

A similar argument appears to deal with the third property. It may be a bore to make it rigorous, but I definitely believe it.