DHJ(3) and Upper and lower bounds: Difference between pages

From Polymath Wiki
(Difference between pages)
Jump to navigationJump to search
→‎Moser(3): added strong Roth
 
No edit summary
 
Line 1: Line 1:
The '''k=3 Density Hales-Jewett theorem''' (DHJ(3)) has many equivalent forms.
<center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center>


== Versions of DHJ(3) ==
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762].  A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here].  In this page we record the proofs justifying these bounds.


Let <math>[3]^n</math> be the set of all length <math>n</math> strings over the alphabet <math>1, 2, 3</math>.  A subset of <math>[3]^n</math> is said to be ''line-free'' if it contains no [[combinatorial line]]s.  Let <math>c_n</math> be the size of the largest line-free subset of <math>[3]^n</math>. 


'''DHJ(3), Version 1'''.  For every <math>\delta > 0</math> there exists n such that every subset <math>A \subset [3]^n</math> of density at least <math>\delta</math> contains a [[combinatorial line]].
{|
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7
|-
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]
|}


'''DHJ(3), Version 2'''. <math>\lim_{n \rightarrow \infty} c_n/3^n = 0</math>.
== Basic constructions ==


(The equivalence of Version 1 and Version 2 is clear.)
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is


'''DHJ(3), Version 3'''.  For every <math>\delta > 0</math> there exists n such that every subset <math>A \subset [3]^n</math> of [[equal-slices density]] at least <math>\delta</math> contains a [[combinatorial line]].
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)


(For the proof that Version 3 and Version 1 are equivalent, see [[equal-slices measure|this page]].)
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>.  The only lines in <math>D_n</math> are those with


'''DHJ(3), Version 4'''. For every <math>\delta > 0</math> there exists n such that every collection of pairs (U,V) of disjoint subsets of [n] of cardinality at least <math>\delta 3^n</math> contains a "corner" <math>(U,V), (U \cup D, V), (U, V \cup D)</math>, where U, V, D are disjoint subsets of <math>[n]</math> (compare with the [[corners theorem]]).
# A number of wildcards equal to a multiple of three;
# The number of 1s unequal to the number of 2s modulo 3.


(The equivalence of Version 1 and Version 2 follows by identifying a string <math>x \in [3]^n</math> with the pair (U,V) consisting of the 1-set and 2-set of x.)
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points.  We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as


'''DHJ(3), Version 5'''. For every <math>\delta > 0</math> there exists n such that if <math>(B_w)_{w \in [3]^n}</math> are a collection of Bernoulli (i.e. 01-valued) random variables <math>B_w</math> (not necessarily independent), each with probability at least <math>\delta</math>, then there exists a combinatorial line <math>w^1,w^2,w^3</math> such that <math>B_{w^0} \wedge
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')
B_{w^1} \wedge B_{w^2}</math> has positive probability.


(The implication of Version 5 from Version 1 follows from the first moment method, observing that the set <math>A := \{ w \in [3]^n: B_w \hbox{ holds} \}</math> has expected density at least <math>\delta</math>.  The reverse implication comes from a random sampling method, taking a random m-dimensional [[combinatorial subspace]] of <math>[3]^n</math> and letting <math>B_w</math> be the event that the image of <math>w\in [3]^m</math> under the subspace embedding map lies in A.)
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).


'''DHJ(3), Version 6'''.  If the free group <math>F_3</math> on three generators acts in a measure-preserving fashion <math>(T_w: X \to X)_{w \in F_3}</math> on a probability space X, and E is a set of positive measure in X, then there exists a combinatorial line <math>w^1,w^2,w^3</math> such that <math>T_{w^1}^{-1}(E) \cap T_{w^2}^{-1}(E) \cap T_{w^3}^{-1}(E)</math> has positive measure (where we identify words with elements of <math>F_3</math>).
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid


(The equivalence of Version 1 and Version 6 follows from a variant of the [[Furstenberg correspondence principle]].)
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \},</math>. (2)


== Variants ==
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s.  Note that


====DHJ(k)====
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)


One can of course define DHJ(k) for any positive integer k by a similar method.  DHJ(1) is trivial, and DHJ(2) follows quickly from [[Sperner's theorem]].
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set


==== [[DHJ(1,3)]] ====
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)


The following variant of DHJ(3) is motivated by the proof of Sperner's theorem. Let <math>\mathcal{U},\mathcal{V}</math> and <math>\mathcal{W}</math> be collections of subsets of <math>[n].</math> Define <math>\mathcal{A}(\mathcal{U},\mathcal{V},\mathcal{W})</math> to be the set of all triples <math>(U,V,W),</math> where <math>U,V,W</math> are disjoint, and <math>U\in\mathcal{U},V\in\mathcal{V},W\in\mathcal{W}.</math> These triples are in an obvious one-to-one correspondence with elements of <math>[3]^n.</math>
is line-free and has cardinality


Call <math>\mathcal{A}</math> a ''set of complexity 1'' if it is of the form <math>\mathcal{A}(\mathcal{U},\mathcal{V},\mathcal{W})</math>.  For instance, a [[slice]] <math>\Gamma_{a,b,c}</math> is of complexity 1.
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)
</math> DHJ(1,3) is the assertion that every dense set of complexity 1 contains a combinatorial line. Here is a proof of this assertion if we assume that <math>\mathcal{W}=[2]^n</math> and we interpret "dense" to mean "dense in the [[equal-slices measure]]".


First, choose a random permutation <math>\pi</math> of <math>[n]</math>. Without loss of generality it is the identity. Then the expected number of disjoint quadruples <math>(U,V,W)</math> such that U is an initial segment, W is a final segment, and U, V and W are elements of <math>\mathcal{U},\mathcal{V}</math> and <math>\mathcal{W}</math> that partition <math>[n]</math> is approximately <math>\delta n^2/2</math>, where <math>\delta</math> is the equal-slices density of <math>\mathcal{A}</math>. (The total number of such triples if U, V and W don't have to belong to <math>\mathcal{U},\mathcal{V}</math> and <math>\mathcal{W}</math> is roughly <math>n^2/2.)</math> Therefore, there must exist some fixed set <math>W\in\mathcal{W}</math> such that the number of such triples <math>(U,V,W)</math> is at least <math>\delta n/2.</math> In particular, we can find two such triples <math>(U_1,V_1,W)</math> and <math>(U_2,V_2,W).</math> If <math>U_1\subset U_2,</math> then this gives us the combinatorial line <math>(U_1,V_1,W),(U_2,V_2,W),(U_1,V_2,W\cup(U_2\cap V_1)).</math>
and thus provides a lower bound for <math>c_n</math>:


Quite possibly a proof of this kind can be devised that proves the full statement DHJ(1,3).
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)


==== DHJ(j,k) ====
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set of B and applying (6).  Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.


The statement DHJ(j,k) is motivated by concepts connected with [http://en.wikipedia.org/wiki/Hypergraph  hypergraphs]. A k-uniform hypergraph <math>H</math> could be said to have ''complexity'' j if there exists a j-uniform hypergraph <math>J</math> such that <math>H</math> consists of all sets A of size k with the property that every subset <math>B\subset A</math> of size j belongs to <math>J.</math>
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set.  This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six.  (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)


There is also a k-partite version of this definition. If the k vertex sets are all copies of <math>[n]</math>, then we say that K has complexity j if for every <math>E\subset\{1,2,\dots,k\}</math> there is a j-uniform hypergraph <math>J_E</math> such that <math>K</math> consists of all k-tuples <math>(i_1,\dots,i_k)</math> such that for every <math>E\subset\{1,2,\dots,k\}</math> of size j the j-tuple <math>(i_s:s\in E)</math> belongs to <math>J_E.</math>
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal).  But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.


We can make a similar definition for sequences in <math>[k]^n</math>, or equivalently ordered partitions <math>(U_1,\dots,U_k)</math> of <math>[n].</math> This time we say that <math>\mathcal{A}</math> has complexity j if there are collections of sets <math>\mathcal{U}_E</math> such that <math>\mathcal{A}</math> consists of all ordered partitions <math>(U_1,\dots,U_k)</math> such that for every <math>E\subset\{1,2,\dots,k\}</math> of size j the ordered sequence of disjoint sets <math>(U_s:s\in E)</math> belongs to <math>\mathcal{U}_E.</math> For instance, the [[slice]] <math>\Gamma_{a,b,c}</math> has complexity 1.
== The basic upper bound ==


DHJ(j,k) is the assertion that a subset of <math>[k]^n</math> of complexity j and density at least <math>\delta</math> contains a [[combinatorial line]], if n is sufficiently large depending on <math>j,k,\delta</math>. It is not hard to check that DHJ(k-1,k) is the same as DHJ(j,k). In other words, every subset of <math>[k]^n</math> has complexity at most k-1.
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound


==== DHJ(2.5) ====
:<math>c_{n+1} \leq 3 c_n.</math> (7)


DHJ(2.5) is the statement that if <math>A \subset [3]^n</math> has density <math>\delta</math>, and n is sufficiently large depending on <math>\delta</math>, then there exists a [[combinatorial line]] <math>w^0, w^1, w^2</math> whose first two elements lie in A.  This is of course weaker than DHJ(3), which requires that all three elements of the combinatorial line lie in A.
Note that equality only occurs if one can find an <math>n+1</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.


DHJ(2.5) can be deduced from DHJ(2) by partitioning <math>[3]^n</math> into disjoint copies of <math>[2]^m</math> for various values of m; indeed, for each set <math>C \subset [n]</math> of cardinality n-m, we have a copy of <math>[2]^m</math> inside <math>[3]^n</math> formed by considering all the strings which equal 2 at C and equal 0 or 1 elsewhere.  By [[Sperner's theorem]] or DHJ(2), any of the copies of <math>[2]^m</math> at which A has density at least <math>\delta/2</math> (say) will contain the first two points of a combinatorial line, if m is sufficiently large depending on <math>\delta</math>.  The remaining values of m only fill up a negligible portion of the cube <math>[3]^n</math>, and the claim follows.
== n=0 ==


==== [[DHJ(2.6)]] ====
:<math>c_0=1</math>:


DHJ(2.6) is the statement that if <math>A \subset [3]^n</math> has density <math>\delta</math>, and n is sufficiently large depending on <math>\delta</math>, then there exists r such that for each ij=01,12,20, there exists a [[combinatorial line]] <math>w^0, w^1, w^2</math> with r wildcards whose i and j elements in A.  This is of course weaker than DHJ(3), but implies DHJ(2.5).
This is clear.


Proofs of DHJ(2.6) and further discussion [[DHJ(2.6)|can be found here]].
== n=1 ==


===="Upside-down" DHJ(3)====
:<math>c_1=2</math>:


A [[corner]] in the grid can either be the right way up (if <math>d>0</math> so that the right angle is at the bottom left) or upside-down (if <math>d<0</math> so that the right angle is at the top right). However, if you take a combinatorial line in <math>[3]^n</math> and use it to define a corner by associating with each point x in the combinatorial line the pair (number of 1s in x, number of 2s in x) you always get corners that are the right way up. Is there a version of the density Hales-Jewett theorem that allows upside-down corners as well? It turns out that such a version can be devised, and one can even define an analogue of the theorem that every dense subset of <math>[n]^2</math> contains a square.
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.


To do the second of these first, we define a square to be a quadruple of the form <math>(A,B),(A\cup D,B),(A,B\cup D),(A\cup D,B\cup D),</math> where A and B are allowed to intersect, but D must be disjoint from both A and B. Then an upside-down corner is a square without the bottom left vertex (A,B). It can also be described as a triple <math>(A,B), (A\setminus D,B), (A,B\setminus D)</math> with <math>D\subset A\cap B.</math>
== n=2 ==


To prove that every dense subset of <math>[2]^n\times[2]^n</math> contains an upside-down corner or a square is a very different problem from DHJ(3) because the sets are allowed to intersect. (It is essential to allow them to intersect because of the pair <math>(A\cup D,B\cup D).</math> However, it is much closer to the [[IP-Szemer&eacute;di theorem]]. It can also be brought closer to DHJ(3) if one uses something like [[equal-slices measure]]. To choose a random point in <math>[2]^n\times[2]^n,</math> first choose a random permutation <math>\pi</math> of <math>[n]</math> and then choose a random pair of intervals in <math>\pi([n]).</math> Note that if we pick a random pair (A,B) according to this distribution then A and B are identically distributed but far from independent---they are much more likely to be disjoint than a random pair of sets, or even a random pair of sets of the same cardinality. One can place the following measure on the set of squares. Choose a random permutation  <math>\pi</math> of <math>[n]</math> and then choose three random subintervals A, B and D in <math>\pi([n]).</math> If <math>D</math> is disjoint from <math>A\cup B</math> then define the corresponding square. One could hope that, according to this measure, a dense subset of <math>[2]^n\times[2]^n</math> contains a dense set of squares.
:<math>c_2=6</math>:


(I have not thought for long enough about this problem to be sure that it is a sensible one.)
There are four six-element sets in <math>[3]^2</math> which are line-free, which we denote <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math> and are displayed graphically as follows.


==== Moser(3) ====
    13 .. 33      .. 23 33      13 23 ..      13 23 ..
x = 12 22 ..  y = 12 .. 32  z = .. 22 32  w = 12 .. 32
    .. 21 31      11 21 ..      11 .. 31      .. 21 31


'''Moser(3)''' is the claim that <math>c'_n = o(3^n)</math>, where <math>c'_n</math> is the size of the largest [[Moser set]] in <math>[3]^n</math>.  This is implied by DHJ(3) and implies [[Roth's theorem]].  No combinatorial proof of Moser(3) is currently known, and no effective decay rate for <math>c'_n</math> is known either.
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.


====Strong Roth in <math>\mathbb{F}_3^n</math>====
== n=3 ==


The usual form of Roth's theorem in <math>\mathbb{F}_3^n</math> is that every dense subset of <math>\mathbb{F}_3^n</math> contains an [[algebraic line]], that is, a set of the form <math>(x,x+d,x+2d)</math> with <math>d\ne 0,</math> where x and d are elements of <math>\mathbb{F}_3^n</math> and addition is pointwise mod 3.
:<math>c_3=18</math>:


A problem that is intermediate between this and DHJ(3) is the following. Given a dense subset of <math>\mathbb{F}_3^n,</math> is it always possible to find a triple inside it of the form <math>(x,x+d,x+2d)</math> with <math>d\ne 0,</math> where now we insist that all the coordinates of d are either 0 or 1? Obviously this is a strengthening of Roth's theorem in <math>\mathbb{F}_3^n.</math> It is also a weakening of DHJ(3), since with DHJ(3) one adds the extra condition that the supports of x and d are disjoint.
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.


== Consequences ==
'''Lemma 1.'''
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,3}=zxy</math>.


DHJ(3) implies the [[IP-Szemerédi theorem]], which implies the [[corners theorem]] (in both <math>{\Bbb Z}/N{\Bbb Z}</math> and <math>[3]^n</math>), which in turn implies [[Roth's theorem]] (in both <math>{\Bbb Z}/N{\Bbb Z}</math> and <math>[3]^n</math>).
'''Proof'''.  We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points.  If two of the slices are identical, the last slice can have only 3 points, a contradiction.  If one of the slices is a w, then the 5-point slice will contain a diagonal, contradiction.  By symmetry we may now assume that two of the slices are x and y, which force the last slice to be z with one point removed.  Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed.  The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed.  


DHJ(3) also implies the k=3 version of the [[coloring Hales-Jewett theorem]].
The first claim follows by a similar argument to the second.
<math>\Box</math>
 
== n=4 ==
 
:<math>c_4=52</math>:
 
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>.  If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42.  So the best one can do is <math>18+17+17=52</math> which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ xzy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yxz</math>.  In fact,
 
'''Lemma 2.'''
 
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.
* The only 50-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2  OR is equal to one of the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.
 
'''Proof''' It suffices to prove the third claim.  In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted.  Similarly when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math>
 
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.
 
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points.  By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points.  By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.
 
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction.  If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction.  Thus we must have jk = 10, 21, or 02.
 
Let's first suppose that jk=02.  Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1311} and permuting the last three indices.  Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3311} and permuting the last three indices.  If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction.  So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{4,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.
 
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices).  Now suppose instead that jk=12.  Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123.  Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123.  Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.
 
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small.  Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.
 
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16.  The only remaining case is when all slicings of the 50-point set are of the form 18+17+16 (or a permutation thereof).  By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>.  Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+17+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1.  Similarly for the **1* and ***1 slices.  Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices.  In particular, on the remaining portion <math>\{2,3\}^4</math> of the cube, there are exactly 6 points of the 50-point set in <math>\{2,3\}^4</math>.
 
Suppose that 3333 was in the set; then since all permutations of 3311, 3331 are known to lie in the set, then 3322, 3332 must lie outside the set.  Also, as 1222 lies in the set, at least one of 2222, 3222 lie outside the set.  This leaves only 5 points in <math>\{2,3\}^4</math>, a contradiction.  Thus 3333 lies outside the set; similarly 2222 lies outside the set.
 
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>.  If a=0 then the set lies in <math>D_{4,1}</math> and we are done.  If a=6 then the set is exactly X and we are done.  Now suppose a=1,2,3.  By symmetry we may assume that 2233 lies in the set.  Then (since 2133, 1233 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.
 
The remaining case is when a=4,5.  Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lie in the set.  By symmetry we may assume that {2233, 3322} lie in the set.  Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.
<math>\Box</math>
 
== n=5 ==
 
:<math>c_5=150</math>:
 
'''Lemma 3'''.  Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.
 
'''Proof'''.  By rotation we may work with <math>D_5</math>.  This set has 162 points.  By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations.  Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math>
 
Equality can be attained by removing  <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>.  Thus <math>c_5 \geq 150</math>.
 
Another pattern of 150 points is this: Take the 450 points
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,
then select the 150 whose final coordinate is 1. That gives
this many points in each cube:
 
17 18 17
 
17 17 18
 
12 17 17
 
'''Lemma 4'''.  A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contain at least 51 points.
 
'''Proof'''.  Suppose not.  By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which force the third slice to have at least <math>151-2c_4 = 47</math> points.
 
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.
 
Suppose first that j=k.  Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction.  Next, suppose that jk=01.  Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction.  Similarly if jk=12 or 20.  Thus we must have jk equal to 10, 21, or 02.
 
Let's suppose first that jk=10.  The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed.  Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.
 
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set.  The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set.  These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.
 
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2).  Now we turn to the case jk=21.  Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. 
 
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set.  For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits.  Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set.  Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math>
 
'''Corollary'''. <math>c_5 \leq 152</math>
 
'''Proof'''.  By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math>
 
 
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.
 
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extermisers. One point per line and different extermisers separated by a line with “—”
 
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by Gnu’s glpsol linear programing solver, which also quickly proves that 150 is the optimum.
 
Each variable corresponds to a point in the cube, numbered according to their lexicografic ordering. If a variable is 1 then the point is in the set, if it is 0 then it is not in the set.
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.
 
== n=6 ==
 
:<math>c_6=450</math>:
 
The upper bound follows since <math>c_6 \leq 3 c_5</math>.  The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).
 
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.
 
== n=7 ==
 
:<math>1302 \leq c_7 \leq 1348</math>:
 
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>.  Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.
 
The lower bound can be formed by removing 016,106,052,502,151,511,160,610 from <math>D_7</math>.
 
'''Lemma 5''' Any line-free subset of <math>D_7</math> has at most 1302 points.
 
'''Proof''' Start with the 1458 points of <math>D_7</math>.  You must lose:
 
* 42 points from (1,2,4),(1,5,1),(4,2,1)
* 42 points from (2,1,4),(2,4,1),(5,1,1)
* 21 points from (0,2,5),(0,5,2),(3,2,2)
* 21 points from (2,0,5),(2,3,2),(5,0,2)
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)
 
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.
<math>\Box</math>
 
== Larger n ==
 
The following construction gives lower bounds for the number of triangle-free points,
There are of the order <math>2.7 \sqrt{log(N)/N}3^N</math> points for large N (N ~ 5000)
 
It applies when N is a multiple of 3.
* For N=3M-1, restrict the first digit of a 3M sequence to be 1.  So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M.
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12.  This leaves roughly one ninth of the points for N=3M-2 as for N=3M.
 
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:
 
* <math>c_3</math> from (012) and permutations
* <math>c_6</math> from (123,024) and perms
* <math>c_9</math> from (234,135,045) and perms
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)
 
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.
 
A general formula for these points is given below.  I think that they are triangle-free.  (For N<21, ignore any triple with a negative entry.)
 
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1)
* There are also eight string of points, stretching to the edges of the (abc) triangle:
** For N=6K = 3M
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)
 
 
An alternate construction:
 
First define a sequence, of all positive numbers which, in base 3, do not contain a 1. Add 1 to all multiples of 3 in this sequence. This sequence does not contain a length-3 arithmetic progression.
 
It starts 1,2,7,8,19,20,25,26,55, …
 
Second, list all the (abc) triples for which the larger two differ by a number
from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1)
 
== Asymptotics ==
 
DHJ(3) is equivalent to the upper bound
 
:<math>c_n \leq o(3^n)</math>
 
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free.  From this and the Behrend construction it appears that we have the lower bound
 
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>
 
though this has to be checked.
 
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above above give a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.
 
The second large n construction had numerical asymptotics for \log(c_n/3^n) close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.
 
== Numerical methods ==
 
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here].  The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.

Revision as of 10:43, 20 February 2009

Upper and lower bounds for [math]\displaystyle{ c_n }[/math] for small values of n.

[math]\displaystyle{ c_n }[/math] is the size of the largest subset of [math]\displaystyle{ [3]^n }[/math] that does not contain a combinatorial line (OEIS A156762. A spreadsheet for all the latest bounds on [math]\displaystyle{ c_n }[/math] can be found here. In this page we record the proofs justifying these bounds.


n 0 1 2 3 4 5 6 7
[math]\displaystyle{ c_n }[/math] 1 2 6 18 52 150 450 [1302,1348]

Basic constructions

For all [math]\displaystyle{ n \geq 1 }[/math], a basic example of a mostly line-free set is

[math]\displaystyle{ D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \} }[/math]. (1)

This has cardinality [math]\displaystyle{ |D_n| = 2 \times 3^{n-1} }[/math]. The only lines in [math]\displaystyle{ D_n }[/math] are those with

  1. A number of wildcards equal to a multiple of three;
  2. The number of 1s unequal to the number of 2s modulo 3.

One way to construct line-free sets is to start with [math]\displaystyle{ D_n }[/math] and remove some additional points. We also have the variants [math]\displaystyle{ D_{n,0}=D_n, D_{n,1}, D_{n,2} }[/math] defined as

[math]\displaystyle{ D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \} }[/math]. (1')

When n is not a multiple of 3, then [math]\displaystyle{ D_{n,0}, D_{n,1}, D_{n,2} }[/math] are all cyclic permutations of each other; but when n is a multiple of 3, then [math]\displaystyle{ D_{n,0} }[/math] plays a special role (though [math]\displaystyle{ D_{n,1}, D_{n,2} }[/math] are still interchangeable).

Another useful construction proceeds by using the slices [math]\displaystyle{ \Gamma_{a,b,c} \subset [3]^n }[/math] for [math]\displaystyle{ (a,b,c) }[/math] in the triangular grid

[math]\displaystyle{ \Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \}, }[/math]. (2)

where [math]\displaystyle{ \Gamma_{a,b,c} }[/math] is defined as the strings in [math]\displaystyle{ [3]^n }[/math] with [math]\displaystyle{ a }[/math] 1s, [math]\displaystyle{ b }[/math] 2s, and [math]\displaystyle{ c }[/math] 3s. Note that

[math]\displaystyle{ |\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}. }[/math] (3)

Given any set [math]\displaystyle{ B \subset \Delta_n }[/math] that avoids equilateral triangles [math]\displaystyle{ (a+r,b,c), (a,b+r,c), (a,b,c+r) }[/math], the set

[math]\displaystyle{ \Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c} }[/math] (4)

is line-free and has cardinality

[math]\displaystyle{ |\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}, }[/math] (5)

and thus provides a lower bound for [math]\displaystyle{ c_n }[/math]:

[math]\displaystyle{ c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}. }[/math] (6)

All lower bounds on [math]\displaystyle{ c_n }[/math] have proceeded so far by choosing a good set of B and applying (6). Note that [math]\displaystyle{ D_n }[/math] is the same as [math]\displaystyle{ \Gamma_{B_n} }[/math], where [math]\displaystyle{ B_n }[/math] consists of those triples [math]\displaystyle{ (a,b,c) \in \Delta_n }[/math] in which [math]\displaystyle{ a \neq b\ \operatorname{mod}\ 3 }[/math].

Note that if one takes a line-free set and permutes the alphabet [math]\displaystyle{ \{1,2,3\} }[/math] in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid [math]\displaystyle{ \Delta_n }[/math] formed by rotation and reflection.)

Another symmetry comes from permuting the [math]\displaystyle{ n }[/math] indices in the strings of [math]\displaystyle{ [3]^n }[/math] (e.g. replacing every string by its reversal). But the sets [math]\displaystyle{ \Gamma_B }[/math] are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.

The basic upper bound

Because [math]\displaystyle{ [3]^{n+1} }[/math] can be expressed as the union of three copies of [math]\displaystyle{ [3]^n }[/math], we have the basic upper bound

[math]\displaystyle{ c_{n+1} \leq 3 c_n. }[/math] (7)

Note that equality only occurs if one can find an [math]\displaystyle{ n+1 }[/math]-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of [math]\displaystyle{ c_n }[/math].

n=0

[math]\displaystyle{ c_0=1 }[/math]:

This is clear.

n=1

[math]\displaystyle{ c_1=2 }[/math]:

The three sets [math]\displaystyle{ D_1 = \{1,2\} }[/math], [math]\displaystyle{ D_{1,1} = \{2,3\} }[/math], and [math]\displaystyle{ D_{1,2} = \{1,3\} }[/math] are the only two-element sets which are line-free in [math]\displaystyle{ [3]^1 }[/math], and there are no three-element sets.

n=2

[math]\displaystyle{ c_2=6 }[/math]:

There are four six-element sets in [math]\displaystyle{ [3]^2 }[/math] which are line-free, which we denote [math]\displaystyle{ x = D_{2,2} }[/math], [math]\displaystyle{ y=D_{2,1} }[/math], [math]\displaystyle{ z=D_2 }[/math], and [math]\displaystyle{ w }[/math] and are displayed graphically as follows.

    13 .. 33       .. 23 33       13 23 ..       13 23 ..
x = 12 22 ..   y = 12 .. 32   z = .. 22 32   w = 12 .. 32
    .. 21 31       11 21 ..       11 .. 31       .. 21 31

Combining this with the basic upper bound (7) we see that [math]\displaystyle{ c_2=6 }[/math].

n=3

[math]\displaystyle{ c_3=18 }[/math]:

We describe a subset [math]\displaystyle{ A }[/math] of [math]\displaystyle{ [3]^3 }[/math] as a string [math]\displaystyle{ abc }[/math], where [math]\displaystyle{ a, b, c \subset [3]^2 }[/math] correspond to strings of the form [math]\displaystyle{ 1** }[/math], [math]\displaystyle{ 2** }[/math], [math]\displaystyle{ 3** }[/math] in [math]\displaystyle{ [3]^3 }[/math] respectively. Thus for instance [math]\displaystyle{ D_3 = xyz }[/math], and so from (7) we have [math]\displaystyle{ c_3=18 }[/math].

Lemma 1.

  • The only 18-element line-free subset of [math]\displaystyle{ [3]^3 }[/math] is [math]\displaystyle{ D_3 = xyz }[/math].
  • The only 17-element line-free subsets of [math]\displaystyle{ [3]^3 }[/math] are formed by removing a point from [math]\displaystyle{ D_3=xyz }[/math], or by removing either 111, 222, or 333 from [math]\displaystyle{ D_{3,2} = yzx }[/math] or [math]\displaystyle{ D_{3,3}=zxy }[/math].

Proof. We prove the second claim. As [math]\displaystyle{ 17=6+6+5 }[/math], and [math]\displaystyle{ c_2=6 }[/math], at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, contradiction. By symmetry we may now assume that two of the slices are x and y, which force the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed.

The first claim follows by a similar argument to the second. [math]\displaystyle{ \Box }[/math]

n=4

[math]\displaystyle{ c_4=52 }[/math]:

Indeed, divide a line-free set in [math]\displaystyle{ [3]^4 }[/math] into three blocks [math]\displaystyle{ 1***, 2***, 3*** }[/math] of [math]\displaystyle{ [3]^3 }[/math]. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is [math]\displaystyle{ 18+17+17=52 }[/math] which can be attained by deleting the diagonal {1111,2222,3333} from [math]\displaystyle{ D_{4,1} = xyz\ yzx\ xzy }[/math], [math]\displaystyle{ D_4 = yzx\ zxy\ xyz }[/math], or [math]\displaystyle{ D_{4,2} = zxy\ xyz\ yxz }[/math]. In fact,

Lemma 2.

  • The only 52-element line-free sets in [math]\displaystyle{ [3]^4 }[/math] are formed by removing the diagonal {1111,2222,3333} from [math]\displaystyle{ D_{4,j} }[/math] for some j=0,1,2.
  • The only 51-element line-free sets in [math]\displaystyle{ [3]^4 }[/math] are formed by removing the diagonal and one further point from [math]\displaystyle{ D_{4,j} }[/math] for some j=0,1,2.
  • The only 50-element line-free sets in [math]\displaystyle{ [3]^4 }[/math] are formed by removing the diagonal and two further points from [math]\displaystyle{ D_{4,j} }[/math] for some j=0,1,2 OR is equal to one of the three permutations of the set [math]\displaystyle{ X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2} }[/math].

Proof It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set [math]\displaystyle{ D_{4,j} }[/math] for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, [math]\displaystyle{ D_4 }[/math], then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. Similarly when [math]\displaystyle{ D_4 }[/math] is replaced by one of the other [math]\displaystyle{ D_{4,j} }[/math]

Next, observe that every three-dimensional slice of a line-free set can have at most [math]\displaystyle{ c_3=18 }[/math] points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.

Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is [math]\displaystyle{ \{1\} \times D_{3,j} }[/math] with one point removed, and the 2-slice is [math]\displaystyle{ \{2\} \times D_{3,k} }[/math] with one point removed, for some [math]\displaystyle{ j,k \in \{0,1,2\} }[/math].

If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most [math]\displaystyle{ 27-15=12 }[/math] points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.

Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1311} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3311} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in [math]\displaystyle{ \{3\} \times D_{4,1} }[/math], and so the 50-point set is contained in [math]\displaystyle{ D_{4,1} }[/math], and we are done by the discussion at the beginning of the proof.

The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=12. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.

Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most [math]\displaystyle{ 27-6-1-6=14 }[/math], which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside [math]\displaystyle{ \{3\} \times D_3 }[/math] and then places the 50-point set inside [math]\displaystyle{ D_4 }[/math], and we are done by the discussion at the beginning of the proof.

We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when all slicings of the 50-point set are of the form 18+17+16 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form [math]\displaystyle{ \{1\} \times D_3 }[/math]. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+17+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with [math]\displaystyle{ D_{4,1} }[/math] on any of these slices. In particular, on the remaining portion [math]\displaystyle{ \{2,3\}^4 }[/math] of the cube, there are exactly 6 points of the 50-point set in [math]\displaystyle{ \{2,3\}^4 }[/math].

Suppose that 3333 was in the set; then since all permutations of 3311, 3331 are known to lie in the set, then 3322, 3332 must lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lie outside the set. This leaves only 5 points in [math]\displaystyle{ \{2,3\}^4 }[/math], a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.

Let a be the number of points in the 50-point set which are some permutation of 2233, thus [math]\displaystyle{ 0 \leq a \leq 6 }[/math]. If a=0 then the set lies in [math]\displaystyle{ D_{4,1} }[/math] and we are done. If a=6 then the set is exactly X and we are done. Now suppose a=1,2,3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside [math]\displaystyle{ \{2,3\}^4 }[/math], a contradiction.

The remaining case is when a=4,5. Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lie in the set. By symmetry we may assume that {2233, 3322} lie in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside [math]\displaystyle{ \{2,3\}^4 }[/math], a contradiction. [math]\displaystyle{ \Box }[/math]

n=5

[math]\displaystyle{ c_5=150 }[/math]:

Lemma 3. Any line-free subset of [math]\displaystyle{ D_{5,j} }[/math] can have at most 150 points.

Proof. By rotation we may work with [math]\displaystyle{ D_5 }[/math]. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. [math]\displaystyle{ \Box }[/math]

Equality can be attained by removing [math]\displaystyle{ \Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0} }[/math] from [math]\displaystyle{ D_5 }[/math]. Thus [math]\displaystyle{ c_5 \geq 150 }[/math].

Another pattern of 150 points is this: Take the 450 points in [math]\displaystyle{ {}[3]^6 }[/math] which are (1,2,3), (0,2,4) and permutations, then select the 150 whose final coordinate is 1. That gives this many points in each cube:

17 18 17

17 17 18

12 17 17

Lemma 4. A line-free subset of [math]\displaystyle{ [3]^5 }[/math] with over 150 points cannot have two parallel [math]\displaystyle{ [3]^4 }[/math] slices, each of which contain at least 51 points.

Proof. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which force the third slice to have at least [math]\displaystyle{ 151-2c_4 = 47 }[/math] points.

By Lemma 2, the 1**** slice takes the form [math]\displaystyle{ \{1\} \times D_{4,j} }[/math] for some [math]\displaystyle{ j=0,1,2 }[/math] with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form [math]\displaystyle{ \{2\} \times D_{4,k} }[/math] for some [math]\displaystyle{ k=0,1,2 }[/math] with the diagonal {21111,22222,23333} and possibly one more point removed.

Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.

Let's suppose first that jk=10. The first slice then is equal to [math]\displaystyle{ \{1\} \times D_{4,1} }[/math] with the diagonal and possibly one more point removed, while the second slice is equal to [math]\displaystyle{ \{2\} \times D_{4,0} }[/math] with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in [math]\displaystyle{ \{3\} \times D_{4,2} }[/math] except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of [math]\displaystyle{ \{3\} \times D_{4,2} }[/math].

The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in [math]\displaystyle{ \{3\} \times D_{4,2} }[/math], and so the 3**** slice can have at most [math]\displaystyle{ |D_{4,2}|-14+3=43 }[/math] points, a contradiction.

The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in [math]\displaystyle{ \{3\} \times D_4 }[/math] except possibly for two points, together with 33333.

If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from [math]\displaystyle{ \{3\} \times D_4 }[/math], leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of [math]\displaystyle{ \{3\} \times D_4 }[/math] can only be created by removing a point from the first two slices, so the total cardinality is at most [math]\displaystyle{ 46+52+52 = 150 }[/math], a contradiction.[math]\displaystyle{ \Box }[/math]

Corollary. [math]\displaystyle{ c_5 \leq 152 }[/math]

Proof. By Lemma 4 and the bound [math]\displaystyle{ c_4=52 }[/math], any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. [math]\displaystyle{ \Box }[/math]


An integer programming method has established the upper bound [math]\displaystyle{ c_5\leq 150 }[/math], with 12 extremal solutions.

This file contains the extermisers. One point per line and different extermisers separated by a line with “—”

This is the linear program, readable by Gnu’s glpsol linear programing solver, which also quickly proves that 150 is the optimum.

Each variable corresponds to a point in the cube, numbered according to their lexicografic ordering. If a variable is 1 then the point is in the set, if it is 0 then it is not in the set. There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.

n=6

[math]\displaystyle{ c_6=450 }[/math]:

The upper bound follows since [math]\displaystyle{ c_6 \leq 3 c_5 }[/math]. The lower bound can be formed by gluing together all the slices [math]\displaystyle{ \Gamma_{a,b,c} }[/math] where (a,b,c) is a permutation of (0,2,4) or (1,2,3).

Computer verification, using the [math]\displaystyle{ c_5=150 }[/math] extremals, has shown that there is exactly one extremiser for [math]\displaystyle{ c_6=450 }[/math].

n=7

[math]\displaystyle{ 1302 \leq c_7 \leq 1348 }[/math]:

To see the upper bound [math]\displaystyle{ c_7 \leq 3c_6-2 }[/math], observe that if two parallel six-dimensional slices had [math]\displaystyle{ c_6 }[/math] points, then by uniqueness they are identical, and the third slice can have at most [math]\displaystyle{ 3^6-c_6=279 }[/math] points, far too few to get anywhere close to [math]\displaystyle{ 1348 }[/math]. Thus there can be at most one slice with [math]\displaystyle{ c_6 }[/math] points, and the other two have at most [math]\displaystyle{ c_6-1 }[/math], giving the claim.

The lower bound can be formed by removing 016,106,052,502,151,511,160,610 from [math]\displaystyle{ D_7 }[/math].

Lemma 5 Any line-free subset of [math]\displaystyle{ D_7 }[/math] has at most 1302 points.

Proof Start with the 1458 points of [math]\displaystyle{ D_7 }[/math]. You must lose:

  • 42 points from (1,2,4),(1,5,1),(4,2,1)
  • 42 points from (2,1,4),(2,4,1),(5,1,1)
  • 21 points from (0,2,5),(0,5,2),(3,2,2)
  • 21 points from (2,0,5),(2,3,2),(5,0,2)
  • 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)
  • 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)

where (a,b,c) is shorthand for the slice [math]\displaystyle{ \Gamma_{a,b,c} }[/math]. [math]\displaystyle{ \Box }[/math]

Larger n

The following construction gives lower bounds for the number of triangle-free points, There are of the order [math]\displaystyle{ 2.7 \sqrt{log(N)/N}3^N }[/math] points for large N (N ~ 5000)

It applies when N is a multiple of 3.

  • For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M.
  • For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth of the points for N=3M-2 as for N=3M.

The current lower bounds for [math]\displaystyle{ c_{3m} }[/math] are built like this, with abc being shorthand for [math]\displaystyle{ \Gamma_{a,b,c} }[/math]:

  • [math]\displaystyle{ c_3 }[/math] from (012) and permutations
  • [math]\displaystyle{ c_6 }[/math] from (123,024) and perms
  • [math]\displaystyle{ c_9 }[/math] from (234,135,045) and perms
  • [math]\displaystyle{ c_{12} }[/math] from (345,246,156,02A,057) and perms (A=10)
  • [math]\displaystyle{ c_{15} }[/math] from (456,357,267,13B,168,04B,078) and perms (B=11)

To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.

A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)

  • There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:
    • (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1)
  • There are also eight string of points, stretching to the edges of the (abc) triangle:
    • For N=6K = 3M
      • M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)
      • M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)


An alternate construction:

First define a sequence, of all positive numbers which, in base 3, do not contain a 1. Add 1 to all multiples of 3 in this sequence. This sequence does not contain a length-3 arithmetic progression.

It starts 1,2,7,8,19,20,25,26,55, …

Second, list all the (abc) triples for which the larger two differ by a number from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1)

Asymptotics

DHJ(3) is equivalent to the upper bound

[math]\displaystyle{ c_n \leq o(3^n) }[/math]

In the opposite direction, observe that if we take a set [math]\displaystyle{ S \subset [3n] }[/math] that contains no 3-term arithmetic progressions, then the set [math]\displaystyle{ \bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c} }[/math] is line-free. From this and the Behrend construction it appears that we have the lower bound

[math]\displaystyle{ c_n \geq 3^{n-O(\sqrt{\log n})} }[/math]

though this has to be checked.

Numerics suggest that the first large n construction given above above give a lower bound of roughly [math]\displaystyle{ 2.7 \sqrt{\log(n)/n} \times 3^n }[/math], which would asymptotically be inferior to the Behrend bound.

The second large n construction had numerical asymptotics for \log(c_n/3^n) close to [math]\displaystyle{ 1.2-\sqrt{\log(n)} }[/math] between n=1000 and n=10000, consistent with the Behrend bound.

Numerical methods

A greedy algorithm was implemented here. The results were sharp for [math]\displaystyle{ n \leq 3 }[/math] but were slightly inferior to the constructions above for larger n.