Austin's proof II


This is a variant of Austin's proof that eschews Ramsey theory (except for the pigeonhole principle) and density incrementation.

DHJ(3) If [math]\displaystyle{ \delta \gt 0 }[/math] and n is sufficiently large depending on [math]\displaystyle{ \delta }[/math], then any subset A of [math]\displaystyle{ [3]^n }[/math] of density at least [math]\displaystyle{ \delta }[/math] contains a combinatorial line.

Step 0. Selection of parameters

We can assume that [math]\displaystyle{ \delta }[/math] is smaller than some small absolute constant.

We will need a number [math]\displaystyle{ \varepsilon \gt 0 }[/math] that is sufficiently small depending on [math]\displaystyle{ \delta }[/math]. This parameter measures the amount of regularity we will extract from the abstract regularity lemma.

We need a function [math]\displaystyle{ F: {\Bbb R}^+ \to {\Bbb R}^+ }[/math] that is sufficiently rapidly growing depending on [math]\displaystyle{ \varepsilon,\delta }[/math]. This measures the quality of the fine approximation from the abstract regularity lemma.

We need a number L which is sufficiently large depending on [math]\displaystyle{ \varepsilon,\delta,F }[/math]. This measures the number of scales between the partitions in the abstract regularity lemma, and also the separation between adjacent scales.

We need a number J which is sufficiently large depending on [math]\displaystyle{ \varepsilon,\delta,F, L }[/math]. This measures the total number of scales in play.

We assume n is sufficiently large depending on [math]\displaystyle{ \varepsilon, \delta, F, L, J }[/math].

We introduce the scales [math]\displaystyle{ n = n_0 \gt n_1 \gt \ldots \gt n_J }[/math] by setting [math]\displaystyle{ n_j := \lfloor n/L^j \rfloor }[/math]. Note that the [math]\displaystyle{ n_j }[/math] are all larger than 1, since n is sufficiently large depending on L and J.
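As a sanity check on the parameters, here is a minimal Python sketch of this choice of scales; the numerical values of L, J, n are hypothetical stand-ins, far smaller than what the proof actually requires.

<syntaxhighlight lang="python">
# Illustrative only: the proof needs L, J large and n enormous compared to both.
L, J = 10, 5              # hypothetical small values
n = L ** (J + 2)          # large enough that every n_j exceeds 1

# n_j := floor(n / L^j) for j = 0, 1, ..., J
scales = [n // L ** j for j in range(J + 1)]
assert all(n_j > 1 for n_j in scales)
assert all(a > b for a, b in zip(scales, scales[1:]))  # n_0 > n_1 > ... > n_J
</syntaxhighlight>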

Step 1. Setting up the probability space and basic partitions

We set up a probability space X by creating the following random variables:

  1. First, select a string x at random from [math]\displaystyle{ [3]^n }[/math] using equal-slices measure. (NB: one could also use uniform measure here, but this is inferior as one then has to make each [math]\displaystyle{ n_{j+1} }[/math] something like [math]\displaystyle{ n_j^{1/2}/L }[/math] rather than [math]\displaystyle{ n_j/L }[/math].)
  2. Set [math]\displaystyle{ A_0 := [n] }[/math], and then for [math]\displaystyle{ j=1,2,3,\ldots,J }[/math] in turn, set [math]\displaystyle{ A_j }[/math] to be a random subset of [math]\displaystyle{ A_{j-1} }[/math] of cardinality [math]\displaystyle{ n_j }[/math], chosen uniformly at random for any fixed choice of [math]\displaystyle{ x,A_0,\ldots,A_{j-1} }[/math]. (A sampling sketch is given after this list.)
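Here is a minimal Python sketch of this two-stage sampling. It assumes the usual description of equal-slices measure on [math]\displaystyle{ [3]^n }[/math] (choose the letter counts (a,b,c) with a+b+c=n uniformly among all such triples, then a uniformly random string with those counts); the function names are ours.

<syntaxhighlight lang="python">
import random

def equal_slices_sample(n):
    """Equal-slices sample from [3]^n: a uniform slice, then a uniform
    string within that slice (assumed description of the measure)."""
    # Uniform (a, b, c) with a + b + c = n, by stars and bars:
    i, j = sorted(random.sample(range(n + 2), 2))
    a, b, c = i, j - i - 1, n + 1 - j
    x = [0] * a + [1] * b + [2] * c
    random.shuffle(x)
    return x

def nested_subsets(n, L, J):
    """A_0 = [n]; for j >= 1, A_j is a uniform n_j-element subset of A_{j-1}."""
    A = [list(range(n))]
    for j in range(1, J + 1):
        A.append(random.sample(A[-1], n // L ** j))
    return A
</syntaxhighlight>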

Let X be the probability space generated by the random variables [math]\displaystyle{ x, A_0,\ldots,A_J }[/math]. This is a finite probability space, with every point occurring with positive probability (we can of course throw out all configurations that occur with zero probability). Clearly any function [math]\displaystyle{ f: [3]^n \to {\Bbb R} }[/math] can be equated with the random variable f(x) on X.

Next, for [math]\displaystyle{ 0 \leq j \leq J }[/math] and [math]\displaystyle{ i=0,1,2 }[/math], we form the following [math]\displaystyle{ \sigma }[/math]-algebras (partitions):

  1. [math]\displaystyle{ {\mathcal B}^j_{012} }[/math] is the space of all events that depend on the sets [math]\displaystyle{ A_0,\ldots,A_j }[/math], and on the string x.
  2. [math]\displaystyle{ {\mathcal B}^j_{\emptyset} }[/math] is the space of all events that depend on the sets [math]\displaystyle{ A_0,\ldots,A_j }[/math], and on the values of the string x outside of [math]\displaystyle{ A_j }[/math].
  3. [math]\displaystyle{ {\mathcal B}^j_i }[/math] is the space of all events that depend on the sets [math]\displaystyle{ A_0,\ldots,A_j }[/math], on the values of the string x outside of [math]\displaystyle{ A_j }[/math], and on the i-set of x inside [math]\displaystyle{ A_j }[/math] (i.e. the set of positions inside [math]\displaystyle{ A_j }[/math] where x takes the value i).

We observe the inclusions

[math]\displaystyle{ {\mathcal B}^j_\emptyset \leq {\mathcal B}^j_i \leq {\mathcal B}^j_{012} }[/math]

for i=0,1,2, and

[math]\displaystyle{ {\mathcal B}^j_S \leq {\mathcal B}^{j+1}_S }[/math]

for [math]\displaystyle{ S=\emptyset,0,1,2,012 }[/math].
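To make these partitions concrete, here is a hypothetical Python rendering of the data generating each [math]\displaystyle{ {\mathcal B}^j_S }[/math]: two sample points lie in the same cell exactly when this data agrees, and the inclusions above just say that each datum is a function of the finer ones.

<syntaxhighlight lang="python">
def generating_data(x, A, j, S):
    """Data generating B^j_S under our reading of the definitions:
    S = '012' sees all of x; S = 'empty' sees x only outside A_j;
    S = 0, 1 or 2 additionally sees the S-set of x inside A_j."""
    sets = tuple(tuple(Ak) for Ak in A[:j + 1])   # the sets A_0, ..., A_j
    if S == '012':
        return sets, tuple(x)
    Aj = set(A[j])
    outside = tuple(x[p] for p in range(len(x)) if p not in Aj)
    if S == 'empty':
        return sets, outside
    i_set = tuple(p for p in A[j] if x[p] == S)   # i-set of x inside A_j
    return sets, outside, i_set
</syntaxhighlight>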

Step 2. Regularisation

We apply the abstract regularity lemma with the functions [math]\displaystyle{ f_0=f_1=f_2:=1_A(x) }[/math], the parameters [math]\displaystyle{ \varepsilon, F }[/math], and the [math]\displaystyle{ \sigma }[/math]-algebras [math]\displaystyle{ {\mathcal B}^{mL}_i }[/math] for [math]\displaystyle{ m=0,1,2,\ldots }[/math]. This gives us positive integers [math]\displaystyle{ M \leq M' \leq O_{F,\varepsilon}(1) }[/math] and factors [math]\displaystyle{ {\mathcal A}_i^{coarse} \leq {\mathcal A}_i^{fine} }[/math] for i=0,1,2 with the following properties for {i,j,k}={0,1,2}:

  1. (Coarse partition is low complexity) [math]\displaystyle{ {\mathcal A}_i^{coarse} \leq {\mathcal B}_i^{ML} }[/math], and [math]\displaystyle{ {\mathcal A}_i^{coarse} }[/math] has complexity M.
  2. (Fine partition is medium complexity) [math]\displaystyle{ {\mathcal A}_i^{fine} \leq {\mathcal B}_i^{M'L} }[/math], and [math]\displaystyle{ {\mathcal A}_i^{fine} }[/math] has complexity M'.
  3. (Coarse and fine approximations are close) [math]\displaystyle{ \| {\Bbb E}(1_A | {\mathcal A}_j^{fine} \vee {\mathcal A}_k^{fine} ) - {\Bbb E}(1_A | {\mathcal A}_j^{coarse} \vee {\mathcal A}_k^{coarse} ) \|_{L^2(X)} \leq \varepsilon }[/math].
  4. (Fine approximation is extremely high quality) For any [math]\displaystyle{ E_j \in {\mathcal B}_j^{M'L+L} }[/math] and [math]\displaystyle{ E_k \in {\mathcal B}_k^{M'L+L} }[/math], we have [math]\displaystyle{ | {\Bbb E} (1_A - {\Bbb E}(1_A| {\mathcal A}_j^{fine} \vee {\mathcal A}_k^{fine} )) 1_{E_j} 1_{E_k} | \leq 1/F(M) }[/math].

For each {i,j,k}={0,1,2} we decompose

[math]\displaystyle{ 1_A = f_{i,U^\perp} + f_{i,err} + f_{i,U} }[/math]

where

[math]\displaystyle{ f_{i,U^\perp} = {\Bbb E}(1_A | {\mathcal A}_j^{coarse} \vee {\mathcal A}_k^{coarse} ) }[/math]
[math]\displaystyle{ f_{i,err} = {\Bbb E}(1_A | {\mathcal A}_j^{fine} \vee {\mathcal A}_k^{fine} ) - {\Bbb E}(1_A | {\mathcal A}_j^{coarse} \vee {\mathcal A}_k^{coarse} ) }[/math]
[math]\displaystyle{ f_{i,U} = 1_A - {\Bbb E}(1_A | {\mathcal A}_j^{fine} \vee {\mathcal A}_k^{fine} ) }[/math].
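Since all the σ-algebras in play are finite partitions, conditional expectation is simply cell-averaging. The following toy Python sketch spells out the decomposition; all names are hypothetical, a "cell map" sends a sample point to the label of its partition cell, and the join of two partitions is labelled by the pair of cell labels.

<syntaxhighlight lang="python">
from collections import defaultdict

def cond_exp(f, cell, space):
    """E(f | partition) on a finite probability space `space`, given as a
    list of (point, probability) pairs: average f over each cell."""
    num, den = defaultdict(float), defaultdict(float)
    for w, p in space:
        num[cell(w)] += p * f(w)
        den[cell(w)] += p
    return lambda w: num[cell(w)] / den[cell(w)]

def decompose(ind_A, coarse_cell, fine_cell, space):
    """1_A = f_perp + f_err + f_U relative to the coarse and fine joins."""
    f_perp = cond_exp(ind_A, coarse_cell, space)
    f_fine = cond_exp(ind_A, fine_cell, space)
    f_err = lambda w: f_fine(w) - f_perp(w)
    f_U = lambda w: ind_A(w) - f_fine(w)
    return f_perp, f_err, f_U
</syntaxhighlight>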

We record some salient features of these functions:

  1. (Structure) [math]\displaystyle{ f_{i,U^\perp} }[/math] has the same mean as [math]\displaystyle{ 1_A }[/math] (which is at least [math]\displaystyle{ \delta }[/math]), takes values between 0 and 1, and is constant on intersections between cells of [math]\displaystyle{ {\mathcal A}_j^{coarse} }[/math] and cells of [math]\displaystyle{ {\mathcal A}_k^{coarse} }[/math] (there are at most [math]\displaystyle{ 2^{2M} }[/math] such intersections in all).
  2. (Smallness) [math]\displaystyle{ f_{i,err} }[/math] has an [math]\displaystyle{ L^2 }[/math] norm of at most [math]\displaystyle{ \varepsilon }[/math].
  3. (Uniformity) [math]\displaystyle{ |{\Bbb E} f_{i,U} 1_{E_j} 1_{E_k}| \leq 1/F(M) }[/math] whenever [math]\displaystyle{ E_j \in {\mathcal B}_j^{M'L+L} }[/math] and [math]\displaystyle{ E_k \in {\mathcal B}_k^{M'L+L} }[/math].
  4. (Measurability) The values of [math]\displaystyle{ f_{i,U^\perp}, f_{i,err}, f_{i,U} }[/math] depend on the string x and on the choices of [math]\displaystyle{ A_0, A_1, \ldots, A_{M'L} }[/math], but are independent of subsequent choices of the [math]\displaystyle{ A }[/math]'s. In other words, these random variables are [math]\displaystyle{ {\mathcal B}^{M'L}_{012} }[/math]-measurable.

We also observe

  1. (Large intersection) We have [math]\displaystyle{ {\Bbb E} \prod_{i=0,1,2} f_{i,U^\perp} \gg \delta^4 }[/math].

Indeed, as [math]\displaystyle{ f_{i,U^\perp} }[/math] is a conditional expectation of [math]\displaystyle{ 1_A }[/math], we see that A has measure at most [math]\displaystyle{ \delta/10 }[/math] on the set [math]\displaystyle{ \{ f_{i,U^\perp} \leq \delta/10 \} }[/math]. Since A has measure at least [math]\displaystyle{ \delta }[/math], the union bound shows that on a portion of A of measure at least [math]\displaystyle{ \delta - 3\delta/10 \geq \delta/2 }[/math] we have [math]\displaystyle{ f_{i,U^\perp} \geq \delta/10 }[/math] for all i=0,1,2; as the integrand is nonnegative and at least [math]\displaystyle{ (\delta/10)^3 }[/math] there, the claim follows.

Step 3. Secondary regularisation

This step is needed in the DHJ setting but not in the triangle-removal setting, due to the multiplicity of scales.

For [math]\displaystyle{ 0 \leq l \leq L }[/math], define the energy

[math]\displaystyle{ E_l := \sum_{\{i,j,k\}=\{0,1,2\}} ??? }[/math]

???. Thus, by the pigeonhole principle we can find [math]\displaystyle{ 0 \leq l \lt L }[/math] such that

[math]\displaystyle{ E_{l+1}-E_l \ll_{F,\varepsilon} 1/L }[/math].

Fix this l. By Pythagoras, we conclude that

???


Lemma 1 (Independence lemma) ...

Proof ???

Step 4. Counting

Now for the "counting lemma" stage. Let f,g,h be real-valued [math]\displaystyle{ {\mathcal B}^{M'L+l}_{012} }[/math]-measurable random variables; we view f,g,h as functions [math]\displaystyle{ f,g,h: [3]^n \to {\Bbb R} }[/math] that also depend on the choices of sets [math]\displaystyle{ A_0,\ldots,A_{M'L+l} }[/math]. Given such functions, we define the trilinear forms

[math]\displaystyle{ \Lambda_0(f,g,h) := {\Bbb E} f(x) g( \rho^{M'L+l+1}_{0 \to 1} x ) h( \rho^{M'L+l+1}_{0 \to 2} x ) }[/math]
[math]\displaystyle{ \Lambda_1(f,g,h) := {\Bbb E} f(\rho^{M'L+l+1}_{1 \to 0} x) g( x ) h( \rho^{M'L+l+1}_{1 \to 2} x ) }[/math]
[math]\displaystyle{ \Lambda_2(f,g,h) := {\Bbb E} f(\rho^{M'L+l+1}_{2 \to 0} x) g( \rho^{M'L+l+1}_{2 \to 1} x ) h( x ) }[/math]
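The maps ρ have not been defined in this note; the sketch below assumes the natural reading, namely that [math]\displaystyle{ \rho^{j}_{i \to i'} x }[/math] is the string x with every letter i inside [math]\displaystyle{ A_{j} }[/math] replaced by i'. Under that reading, the triple [math]\displaystyle{ (x, \rho^{M'L+l+1}_{0 \to 1} x, \rho^{M'L+l+1}_{0 \to 2} x) }[/math] is a combinatorial line whose wildcard set is the 0-set of x inside [math]\displaystyle{ A_{M'L+l+1} }[/math] (degenerate exactly when that set is empty), and a conditional Monte Carlo sketch of [math]\displaystyle{ \Lambda_0 }[/math] looks as follows.

<syntaxhighlight lang="python">
def rho(x, A, j, i, ip):
    """Assumed reading of rho^j_{i -> i'}: replace the letter i by i'
    at every position of A_j, leaving x unchanged elsewhere."""
    y = list(x)
    for p in A[j]:
        if y[p] == i:
            y[p] = ip
    return y

def Lambda0(f, g, h, A, jstar, samples=10000):
    """Monte Carlo sketch of Lambda_0(f, g, h) with jstar = M'L + l + 1,
    conditioned on a fixed choice of the sets A_0, ..., A_jstar."""
    n = len(A[0])
    total = 0.0
    for _ in range(samples):
        x = equal_slices_sample(n)   # from the earlier sketch
        total += f(x) * g(rho(x, A, jstar, 0, 1)) * h(rho(x, A, jstar, 0, 2))
    return total / samples
</syntaxhighlight>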

These trilinear forms are essentially equivalent:

Lemma 2 If f,g,h are [math]\displaystyle{ {\mathcal B}^{M'L+l}_{012} }[/math]-measurable and take values in [-1,1], then [math]\displaystyle{ \Lambda_0(f,g,h), \Lambda_1(f,g,h), \Lambda_2(f,g,h) }[/math] differ by [math]\displaystyle{ O(1/L) }[/math].

Proof: Suppose we condition on (i.e. fix) the choices of [math]\displaystyle{ A_0,\ldots,A_{M'L+l} }[/math], as well as the coefficients of x outside of [math]\displaystyle{ A_{M'L+l} }[/math]. Then x is distributed in an [math]\displaystyle{ n_{M'L+l} }[/math]-dimensional combinatorial subspace V indexed by [math]\displaystyle{ A_{M'L+l} }[/math], and essentially has the equal-slices distribution (it may even have exactly the equal-slices distribution; we did not check this). The random lines [math]\displaystyle{ \ell_i: [3] \to V }[/math] defined by [math]\displaystyle{ \ell_i(j) := \rho^{M'L+l+1}_{i \to j}(x) }[/math] are then random lines in V; one can check by some computation (to be done later) that the three probability distributions of these random lines differ in total variation by [math]\displaystyle{ O(1/L) }[/math], basically because [math]\displaystyle{ n_{M'L+l+1} = O(n_{M'L+l} / L) }[/math]. [math]\displaystyle{ \Box }[/math]

We now seek a lower bound for

[math]\displaystyle{ \Lambda_0(1_A, 1_A, 1_A) }[/math] (1)

(note from Lemma 2 that we may replace [math]\displaystyle{ \Lambda_0 }[/math] here by [math]\displaystyle{ \Lambda_1 }[/math] or [math]\displaystyle{ \Lambda_2 }[/math], accepting only an error of [math]\displaystyle{ O(1/L) }[/math] in the process). We have the partitions [math]\displaystyle{ {\mathcal A}_i^{coarse} }[/math] for i=0,1,2, which (as they have complexity at most M) partition X into disjoint events [math]\displaystyle{ E_i^1,\ldots,E_i^{2^M} \in {\mathcal B}^{M'L+l}_i }[/math]. We can then partition (1) as

[math]\displaystyle{ \sum_{1 \leq a,b,c \leq 2^M} \Lambda_0( 1_A 1_{E_1^b}, 1_A 1_{E_2^c}, 1_A 1_{E_0^a} ) }[/math]

which by various invariance properties can be rearranged as

[math]\displaystyle{ \sum_{1 \leq a,b,c \leq 2^M} \Lambda_0( 1_A 1_{E_1^b \cap E_2^c}, 1_A 1_{E_0^a \cap E_2^c}, 1_A 1_{E_0^a \cap E_1^b} ) }[/math].

This is analogous to how the number of triangles in a tripartite graph can be split up into a sum over triples of cells in each of the three pieces.
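To make the analogy explicit, here is the tripartite-graph counterpart in the same sketch language: the triangle count splits into a sum over triples of cells of the three vertex partitions, just as (1) was split above. The graph encoding and names are hypothetical.

<syntaxhighlight lang="python">
from collections import Counter

def triangles_by_cells(V0, V1, V2, E01, E12, E02, cell0, cell1, cell2):
    """Count triangles (u, v, w) of a tripartite graph, bucketed by the
    triple of partition cells (cell0[u], cell1[v], cell2[w]); the total
    triangle count is the sum of the returned values."""
    counts = Counter()
    for u in V0:
        for v in V1:
            if (u, v) not in E01:
                continue
            for w in V2:
                if (v, w) in E12 and (u, w) in E02:
                    counts[(cell0[u], cell1[v], cell2[w])] += 1
    return counts
</syntaxhighlight>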

Now we throw away some "bad" triples (a,b,c). As opposed to the triangle removal case, in which there is a globally deterministic choice of which triples are bad, here we must work "relative" to the factor [math]\displaystyle{ {\mathcal B}^{M'L+l}_\emptyset }[/math], and so the badness of any given triple (a,b,c) is a [math]\displaystyle{ {\mathcal B}^{M'L+l}_\emptyset }[/math]-measurable random variable, i.e. it depends on [math]\displaystyle{ A_0,\ldots,A_{M'L+l} }[/math] and on the values of x outside of [math]\displaystyle{ A_{M'L+l} }[/math]. If we fix (condition) those random variables, we say that a pair (a,b) is "01-bad" if at least one of the conditions in the list below holds. It is convenient to first define the oscillation osc(f) of a function f by the formula

[math]\displaystyle{ osc(f) := {\Bbb E}( |{\Bbb E}(f|{\mathcal B}^{M'L+l+1}_\emptyset) - {\Bbb E}(f|{\mathcal B}^{M'L+l}_\emptyset)|^2 | {\mathcal B}^{M'L+l}_\emptyset ) }[/math];

note that this is a [math]\displaystyle{ {\mathcal B}^{M'L+l}_\emptyset }[/math]-measurable quantity.
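In the same toy language as before (reusing cond_exp from the sketch in Step 2), osc is two nested cell-averagings; coarse_cell and fine_cell stand for hypothetical cell maps of [math]\displaystyle{ {\mathcal B}^{M'L+l}_\emptyset }[/math] and [math]\displaystyle{ {\mathcal B}^{M'L+l+1}_\emptyset }[/math] respectively.

<syntaxhighlight lang="python">
def osc(f, coarse_cell, fine_cell, space):
    """osc(f) = E( |E(f|fine) - E(f|coarse)|^2 | coarse ): a conditional
    variance between consecutive scales, measurable in the coarse algebra."""
    f_fine = cond_exp(f, fine_cell, space)      # E(f | B^{M'L+l+1}_empty)
    f_coarse = cond_exp(f, coarse_cell, space)  # E(f | B^{M'L+l}_empty)
    return cond_exp(lambda w: (f_fine(w) - f_coarse(w)) ** 2,
                    coarse_cell, space)
</syntaxhighlight>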

  1. (0-cell too small) The expected value [math]\displaystyle{ {\Bbb E}(1_{E_0^a}|{\mathcal B}^{M'L+l}_\emptyset) }[/math] of [math]\displaystyle{ E_0^a }[/math] is less than [math]\displaystyle{ \varepsilon^{0.1}/2^M }[/math].
  2. (1-cell too small) The expected value [math]\displaystyle{ {\Bbb E}(1_{E_1^b}|{\mathcal B}^{M'L+l}_\emptyset) }[/math] of [math]\displaystyle{ E_1^b }[/math] is less than [math]\displaystyle{ \varepsilon^{0.1}/2^M }[/math].
  3. (0-cell too unstable) [math]\displaystyle{ osc(1_{E_0^a}) }[/math] is larger than [math]\displaystyle{ L^{-0.1} }[/math].
  4. (1-cell too unstable) [math]\displaystyle{ osc(1_{E_1^b}) }[/math] is larger than [math]\displaystyle{ L^{-0.1} }[/math].
  5. (01-cell too unstable) [math]\displaystyle{ osc(1_{E_0^a \cap E_1^b}) }[/math] is larger than [math]\displaystyle{ L^{-0.1} }[/math].
  6. ([math]\displaystyle{ f_{2,err} }[/math] too unstable) [math]\displaystyle{ osc(f_{2,err} 1_{E_0^a \cap E_1^b}) }[/math] is larger than [math]\displaystyle{ L^{-0.1} }[/math].
  7. (A too sparse) [math]\displaystyle{ f_{0,U^\perp} }[/math] (which is constant on [math]\displaystyle{ E_0^a \cap E_1^b }[/math] by construction) is less than [math]\displaystyle{ \varepsilon^{0.1} }[/math] on [math]\displaystyle{ E_0^a \cap E_1^b }[/math].
  8. (A too irregular) The density [math]\displaystyle{ {\Bbb E}(|f_{2,err}| 1_{E_0^a \cap E_1^b} |{\mathcal B}^{M'L+l}_\emptyset) / {\Bbb E}(1_{E_0^a \cap E_1^b} |{\mathcal B}^{M'L+l}_\emptyset) }[/math] is greater than [math]\displaystyle{ \varepsilon^{0.9} }[/math]. (The non-vanishing of the denominator here will follow from the preceding hypotheses and Lemma 1.)

We similarly define the notion of a pair (b,c) being "12-bad", or a pair (c,a) being "20-bad". We say that a triple (a,b,c) is bad if either (a,b) is 01-bad, (b,c) is 12-bad, or (c,a) is 20-bad. Call a triple (a,b,c) good if it is not bad. We can clearly bound (1) from below by

[math]\displaystyle{ \sum_{(a,b,c)} X_{a,b,c} }[/math]

where

[math]\displaystyle{ X_{a,b,c} = {\Bbb E} 1_{(a,b,c) good} \Lambda_0( 1_A 1_{E_1^b \cap E_2^c}, 1_A 1_{E_0^a \cap E_2^c}, 1_A 1_{E_0^a \cap E_1^b} ) }[/math] (2)

and the expectation used to define [math]\displaystyle{ \Lambda_0 }[/math] is conditioned relative to [math]\displaystyle{ {\mathcal B}^{M'L+l}_\emptyset }[/math]. (This is clunky - I will rewrite the notation later.) The outer expectation is in the [math]\displaystyle{ \sigma }[/math]-algebra [math]\displaystyle{ {\mathcal B}^{M'L+l}_\emptyset }[/math].

Now we split each of the factors [math]\displaystyle{ 1_A }[/math] here into [math]\displaystyle{ 1_A = f_{i,U^\perp} + f_{i,err} + f_{i,U} }[/math] for i=0,1,2. We now dispose of the contribution of the uniform parts. Take for instance the contribution of [math]\displaystyle{ f_{0,U} }[/math] to [math]\displaystyle{ X_{a,b,c} }[/math]:

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} \Lambda_0( f_{0,U} 1_{E_1^b \cap E_2^c}, 1_A 1_{E_0^a \cap E_2^c}, 1_A 1_{E_0^a \cap E_1^b} ). }[/math]

We expand out [math]\displaystyle{ \Lambda_0 }[/math] and write this as

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} {\Bbb E}( f_{0,U} 1_{E_1^b \cap E_2^c}(x) 1_A 1_{E_0^a \cap E_2^c}( \rho^{M'L+l+1}_{0 \to 1} x ) 1_A 1_{E_0^a \cap E_1^b}( \rho^{M'L+l+1}_{0 \to 2} x ) | {\mathcal B}^{M'L+l}_\emptyset ). }[/math]

After a lot of rearranging, this is the expectation of [math]\displaystyle{ f_{0,U} }[/math], times a bounded [math]\displaystyle{ {\mathcal B}^{M'L+l+1}_1 }[/math]-measurable function, times a bounded [math]\displaystyle{ {\mathcal B}^{M'L+l+1}_2 }[/math]-measurable function (the point being that every factor, other than the one coming from [math]\displaystyle{ f_{0,U} }[/math], is either 02-insensitive or 01-insensitive on [math]\displaystyle{ A_{M'L+l+1} }[/math]). Applying the uniformity property of [math]\displaystyle{ f_{0,U} }[/math], we conclude that this term is [math]\displaystyle{ O( 1 / F(M) ) }[/math]. We can thus replace the first [math]\displaystyle{ 1_A }[/math] term in (2) by [math]\displaystyle{ 1_A - f_{0,U} = f_{0,U^\perp} + f_{0,err} }[/math]. Similarly for the other two terms in [math]\displaystyle{ X_{a,b,c} }[/math] (using Lemma 2 as necessary, incurring an additional error of [math]\displaystyle{ O(1/L) }[/math]). We are thus left with

[math]\displaystyle{ X_{a,b,c} = {\Bbb E} 1_{(a,b,c) good} \Lambda_0( (f_{0,U^\perp}+f_{0,err}) 1_{E_1^b \cap E_2^c}, (f_{1,U^\perp}+f_{1,err}) 1_{E_0^a \cap E_2^c}, (f_{2,U^\perp}+f_{2,err}) 1_{E_0^a \cap E_1^b} ) + O( 1/F(M) ) + O(1/L). }[/math] (3)

Now let us compare this quantity with the quantity

[math]\displaystyle{ Y_{a,b,c} := {\Bbb E} 1_{(a,b,c) good} \Lambda_0( 1_{E_1^b \cap E_2^c}, 1_{E_0^a \cap E_2^c}, 1_{E_0^a \cap E_1^b} ). }[/math]

First, let's look at the main term of (3), given by [math]\displaystyle{ f_{0,U^\perp}, f_{1,U^\perp}, f_{2,U^\perp} }[/math]. As (a,b,c) is good, [math]\displaystyle{ f_{0,U^\perp} }[/math] is constant on [math]\displaystyle{ E_1^b \cap E_2^c }[/math] and is at least [math]\displaystyle{ \varepsilon^{0.1} }[/math] on this cell, and similarly for the other two terms. Thus this contribution to [math]\displaystyle{ X_{a,b,c} }[/math] is at least [math]\displaystyle{ \varepsilon^{0.3} Y_{a,b,c} }[/math].

Now let's look at one of the other terms, say a term involving [math]\displaystyle{ f_{0,err} }[/math]. Taking absolute values and bounding the other f factors by O(1), this type of term is bounded in magnitude by

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} \Lambda_0( |f_{0,err}| 1_{E_1^b \cap E_2^c}, 1_{E_0^a \cap E_2^c}, 1_{E_0^a \cap E_1^b} ). }[/math]

We expand this out and collect some terms to obtain

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} |f_{0,err}| 1_{E_1^b \cap E_2^c}(x) 1_{E_0^a}( \rho^{M'L+l+1}_{012 \to 2}(x) ); }[/math]

since [math]\displaystyle{ 1_{E_0^a}( \rho^{M'L+l+1}_{012 \to 2}(x) ) }[/math] is [math]\displaystyle{ {\mathcal B}^{M'L+l+1}_\emptyset }[/math]-measurable, we can rewrite this as

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} {\Bbb E}( |f_{0,err}| 1_{E_1^b \cap E_2^c} | {\mathcal B}^{M'L+l+1}_\emptyset ) 1_{E_0^a}(x). }[/math]

As (a,b,c) is good, we can replace the [math]\displaystyle{ {\mathcal B}^{M'L+l+1}_\emptyset }[/math] conditional expectation here by [math]\displaystyle{ {\mathcal B}^{M'L+l}_\emptyset }[/math], accepting an error of [math]\displaystyle{ O(L^{-0.1}) }[/math] in the process. We can similarly estimate [math]\displaystyle{ Y_{a,b,c} }[/math] by

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} {\Bbb E}( 1_{E_1^b \cap E_2^c} | {\mathcal B}^{M'L+l}_\emptyset ) 1_{E_0^a}(x) + O(L^{-0.1}). }[/math]

On the other hand, if we again use the fact that (a,b,c) is good, one sees that [math]\displaystyle{ {\Bbb E}( |f_{0,err}| 1_{E_1^b \cap E_2^c} | {\mathcal B}^{M'L+l}_\emptyset ) }[/math] is less than [math]\displaystyle{ \varepsilon^{0.9} }[/math] times [math]\displaystyle{ {\Bbb E}( 1_{E_1^b \cap E_2^c} | {\mathcal B}^{M'L+l}_\emptyset ) }[/math]. Putting all this together, we see that the contribution of this term to [math]\displaystyle{ X_{a,b,c} }[/math] is [math]\displaystyle{ O( \varepsilon^{0.9} Y_{a,b,c} ) + O( L^{-0.1} ) }[/math]. Summing up all the contributions to [math]\displaystyle{ X_{a,b,c} }[/math], we have thus concluded that

[math]\displaystyle{ X_{a,b,c} \gg \varepsilon^{0.3} Y_{a,b,c} - O(1/F(M)) }[/math]

(note that the [math]\displaystyle{ O(L^{-0.1}), O(1/L) }[/math] terms are small compared to [math]\displaystyle{ O(1/F(M)) }[/math]). Summing over a,b,c, we obtain

[math]\displaystyle{ \Lambda_0(1_A, 1_A, 1_A) \gg \varepsilon^{0.3} \sum_{a,b,c} Y_{a,b,c} - O_M( 1/F(M)). }[/math]

Now, as A has no combinatorial lines, the only contribution to [math]\displaystyle{ \Lambda_0(1_A,1_A,1_A) }[/math] comes from degenerate lines (those for which x has no 0s in [math]\displaystyle{ A_{M'L+l+1} }[/math]), and so [math]\displaystyle{ \Lambda_0(1_A,1_A,1_A) = O(1/L) = O(1/F(M)) }[/math] (say) if n is large enough. We conclude that

[math]\displaystyle{ \sum_{a,b,c} Y_{a,b,c} \ll_{M,\varepsilon} 1/F(M) }[/math].

Now, as we have already done before, we can expand [math]\displaystyle{ Y_{a,b,c} }[/math] as

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} {\Bbb E}( 1_{E_1^b \cap E_2^c} | {\mathcal B}^{M'L+l}_\emptyset ) 1_{E_0^a} + O(L^{-0.1}) }[/math]

which we can rearrange slightly as

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} 1_{E_1^b \cap E_2^c} {\Bbb E}( 1_{E_0^a} | {\mathcal B}^{M'L+l}_\emptyset ) + O(L^{-0.1}) }[/math]

and rearrange this further as

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} \Lambda_0( 1_{E_1^b}, 1_{E_2^c}, 1 ) {\Bbb E}( 1_{E_0^a} | {\mathcal B}^{M'L+l}_\emptyset ) + O(L^{-0.1}). }[/math]

On the other hand, observe that

[math]\displaystyle{ \Lambda_2( 1_{E_1^b}, 1_{E_2^c}, 1 ) = {\Bbb E}( {\Bbb E}(1_{E_1^b}|{\mathcal B}^{M'L+l+1}_\emptyset ) {\Bbb E}(1_{E_2^c}|{\mathcal B}^{M'L+l+1}_\emptyset ) ) }[/math]

and so by using the fact that (a,b,c) is good, we can express [math]\displaystyle{ Y_{a,b,c} }[/math] as

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} {\Bbb E}( 1_{E_0^a} | {\mathcal B}^{M'L+l}_\emptyset ) {\Bbb E}( 1_{E_1^b} | {\mathcal B}^{M'L+l}_\emptyset ) {\Bbb E}( 1_{E_2^c} | {\mathcal B}^{M'L+l}_\emptyset ) + O(L^{-0.1}) }[/math]

and thus

[math]\displaystyle{ {\Bbb E} 1_{(a,b,c) good} {\Bbb E}( 1_{E_0^a} | {\mathcal B}^{M'L+l}_\emptyset ) {\Bbb E}( 1_{E_1^b} | {\mathcal B}^{M'L+l}_\emptyset ) {\Bbb E}( 1_{E_2^c} | {\mathcal B}^{M'L+l}_\emptyset ) \ll_{M,\varepsilon} 1/F(M). }[/math] (4)

Thus, in some sense, "most" (a,b,c) are bad.

...