http://michaelnielsen.org/polymath1/api.php?action=feedcontributions&user=121.220.134.232&feedformat=atomPolymath1Wiki - User contributions [en]2019-10-23T19:27:09ZUser contributionsMediaWiki 1.23.5http://michaelnielsen.org/polymath1/index.php?title=Higher-dimensional_DHJ_numbersHigher-dimensional DHJ numbers2009-03-31T04:36:07Z<p>121.220.134.232: /* (4,k) is saturated for odd k */</p>
<hr />
<div>For any n, k let <math>c_{n,k}</math> denote the cardinality of the largest subset of <math>[k]^n</math> that does not contain a combinatorial line. When k=3, the quantity <math>c_{n,k} = c_n</math> is studied for instance in [[upper and lower bounds|this page]]. The [[DHJ|density Hales-Jewett theorem]] asserts that for any fixed k, <math>\lim_{n \to \infty} c_{n,k} / k^n = 0</math>.<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_{n,k}</math> values for different values of the dimension n and edge length k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{| border=1 | <br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
We trivially have<br />
:<math>c_{n,1} = 0</math> for n > 0 (and <math>c_{0,1}=1</math>)<br />
and [[Sperner's theorem]] tells us that<br />
:<math>c_{n,2} = \binom{n}{\lfloor n/2\rfloor}</math>.<br />
<br />
Now we look at the opposite regime, in which n is small and k is large. We easily have<br />
:<math>c_{0,k} = 1</math><br />
and<br />
:<math>c_{1,k} = k-1</math>;<br />
<br />
together with the trivial bound<br />
:<math>c_{n+1,k} \leq k c_{n,k}</math><br />
<br />
this implies that<br />
:<math>c_{n,k} \leq (k-1) k^{n-1}</math><br />
for any <math>n \geq 1</math>. Let us call a pair (n,k) with n > 0 ''saturated'' if <math>c_{n,k} = (k-1) k^{n-1}</math>, thus there exists a line-free set with exactly one point omitted from every row and column. <br />
<br />
'''Question''': Which pairs (n,k) are saturated?<br />
<br />
From the above discussion we see that (1,k) is saturated for all k >= 1, and (n,1) is (rather trivially) saturated for all n. Sperner's theorem tells us that (n,2) is saturated only for n= 1, 2. Note that if (n,k) is unsaturated then (n',k) will be unsaturated for all n' > n. <br />
<br />
== (2,k) is saturated when k is at least 1 ==<br />
<br />
In dimension two the maximal set size is k(k-1). It is attained by removing the diagonal points 11, 22, 33, …, kk: this destroys every line, and since the k rows are disjoint lines, at least k points must be removed, so the removal is minimal.<br />
<br />
In any optimal solution the k missing points lie one per row and one per column,<br />
so their y-coordinates are a rearrangement of their x-coordinates.<br />
There are k! rearrangements of the numbers 1 to k.<br />
The k points must include a point on the diagonal, so this rearrangement is not a derangement. There are approximately k!/e derangements of the numbers 1 to k (the exact count is the integer nearest to k!/e), so there are approximately k!(1-1/e) optimal solutions.<br />
<br />
The number of optimal solutions is [http://www.research.att.com/~njas/sequences/A002467 this sequence].<br />
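The count above can be checked by brute force for small k. The following is only a sketch: the function names are hypothetical, coordinates are 0-indexed, and every k-point deletion is simply tested against all 2k+1 lines of the square. For k = 1, …, 5 it should agree with k! minus the number of derangements, that is 1, 1, 4, 15, 76.

```python
from itertools import combinations, product


def lines_2d(k):
    """All combinatorial lines of [k]^2 (0-indexed): *a, a* and **."""
    lines = [[(x, a) for x in range(k)] for a in range(k)]   # *a
    lines += [[(a, x) for x in range(k)] for a in range(k)]  # a*
    lines.append([(x, x) for x in range(k)])                 # **
    return lines


def count_optimal_2d(k):
    """Count the k-point deletions that hit every combinatorial line."""
    points = list(product(range(k), repeat=2))
    lines = lines_2d(k)
    count = 0
    for removed in combinations(points, k):
        removed = set(removed)
        if all(any(p in removed for p in line) for line in lines):
            count += 1
    return count


def derangements(k):
    """Number of permutations of k items without a fixed point."""
    if k == 0:
        return 1
    d_prev, d = 1, 0  # D_0 = 1, D_1 = 0
    for m in range(2, k + 1):
        d_prev, d = d, (m - 1) * (d + d_prev)
    return d
```

The brute force is exponential in k and is only meant as a sanity check for very small squares.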
<br />
== (3,k) is saturated when k is at least 3 ==<br />
<br />
Let S be a latin square of side k on the symbols 1…k, with colour i in position (i,i). (This is not possible for k=2.)<br />
<br />
Let axis one in S correspond to coordinate 1 in [k]^3, axis two to coordinate 2 and interpret the colour in position (i,j) as the third coordinate. Delete the points so defined.<br />
<br />
The line with three wild cards has now been removed.<br />
A line with two wildcards will be missing the point corresponding to the diagonal in S.<br />
A line with a single wildcard will be missing a point corresponding to an off diagonal point in S.<br />
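As an illustration of the latin square construction, here is a sketch for odd k only. The particular square S(i,j) = 2i - j (mod k) is an assumption made for this example (it is latin and has S(i,i) = i whenever k is odd); the text above does not prescribe a specific square, and even k would need a different one. Coordinates are 0-indexed and the helper names are hypothetical.

```python
from itertools import product


def combinatorial_lines(n, k):
    """Yield every combinatorial line of [k]^n as a list of k points."""
    for template in product(list(range(k)) + ["*"], repeat=n):
        if "*" in template:
            yield [tuple(x if t == "*" else t for t in template)
                   for x in range(k)]


def idempotent_latin_square(k):
    """Latin square with S[i][i] == i; this formula needs k odd."""
    if k % 2 == 0:
        raise ValueError("this particular formula only works for odd k")
    return [[(2 * i - j) % k for j in range(k)] for i in range(k)]


def deleted_points(k):
    """Points (i, j, S[i][j]) of [k]^3 removed by the construction."""
    square = idempotent_latin_square(k)
    return {(i, j, square[i][j]) for i in range(k) for j in range(k)}


def hits_every_line(deleted, n, k):
    """True if every combinatorial line contains a deleted point."""
    return all(any(p in deleted for p in line)
               for line in combinatorial_lines(n, k))
```

Exactly k^2 points are deleted, so a successful check means the remaining (k-1)k^2 points form a line-free set.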
<br />
Something similar should work in higher dimensions if one can find latin cubes etc with the right diagonal properties.<br />
<br />
== (n,k) is saturated when all prime divisors of k are at least n ==<br />
<br />
First consider the case when k is prime and at least n: Delete those points whose coordinates add up to a multiple of k.<br />
Every combinatorial line has exactly one point deleted, except for the main diagonal (the line in which every coordinate is a wildcard) when n=k, which has all of its points deleted.<br />
<br />
Now consider for instance the case (n,k) = (4,35). Select one value modulo 35 and delete every point whose coordinates sum to that value modulo 35.<br />
The coordinate sums along a combinatorial line with one, two, three or four moving coordinates<br />
realize all values modulo 35, since one, two, three and four are units modulo 35; thus every line loses a point and (4,35) is saturated.<br />
<br />
The same argument tells us that (n,k) is saturated when all prime divisors of k are at least n.<br />
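The modular construction can be tested exhaustively for small parameters. The sketch below uses hypothetical helper names, treats coordinates as residues 0, …, k-1 (so the value k is represented by 0) and deletes the residue class 0. For the small cases tried, the deletion hits every line exactly when all prime divisors of k are at least n; for instance it fails for (3,4), even though (3,4) is saturated by the latin square construction.

```python
from itertools import product


def combinatorial_lines(n, k):
    """Yield every combinatorial line of [k]^n as a list of k points."""
    for template in product(list(range(k)) + ["*"], repeat=n):
        if "*" in template:
            yield [tuple(x if t == "*" else t for t in template)
                   for x in range(k)]


def sum_deletion_hits_every_line(n, k):
    """True if deleting the points with coordinate sum = 0 (mod k)
    removes at least one point from every combinatorial line."""
    return all(any(sum(p) % k == 0 for p in line)
               for line in combinatorial_lines(n, k))


def smallest_prime_divisor(k):
    """Smallest prime dividing k, for k >= 2."""
    return next(d for d in range(2, k + 1) if k % d == 0)


def deleted_count(n, k):
    """Number of points of [k]^n whose coordinate sum is 0 (mod k)."""
    return sum(1 for p in product(range(k), repeat=n) if sum(p) % k == 0)
```

The number of lines grows like (k+1)^n, so a case such as (4,35) is better left to the argument in the text than to this exhaustive check.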
<br />
On the other hand, computer data shows that (4,4) and (4,6) are not saturated.<br />
<br />
== (4,k) ==<br />
<br />
There are five types of points: xyzw, xxyz, xxyy, xxxy and xxxx. Let p(xyzw) be the number of points removed whose coordinates are all different, and so on. <br />
<br />
There are seven types of line: *xyz, *xxy, *xxx, **xy, **xx, ***x, ****. Enough points must be removed to remove all lines. That leads to the following inequalities<br />
* <math> 4p(xyzw)+2p(xxyz) \ge 4k(k-1)(k-2)</math><br />
* <math> 2p(xxyz)+4p(xxyy)+3p(xxxy) \ge 12k(k-1)</math><br />
* <math> p(xxxy)+4p(xxxx) \ge 4k </math><br />
* <math> p(xxyz)+3p(xxxy) \ge 6k(k-1) </math><br />
* <math> 2p(xxyy)+6p(xxxx) \ge 6k </math><br />
* <math> p(xxxx) \ge 1</math><br />
<br />
If (4,k) is saturated, then for some h between 0 and k-1 inclusive the k^3 missing points fall into the following types<br />
* (k-1)(k-2)(k-3) - 6h of type xyzw<br />
* 6(k-1)(k-2) + 12h of type xxyz<br />
* 3(k-1) - 3h of type xxyy<br />
* 4(k-1) - 4h of type xxxy<br />
* 1 + h of type xxxx<br />
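The type counts listed above can be checked against the six inequalities numerically. This is a small sketch with hypothetical function names; it only confirms that the counts sum to k^3 and satisfy the inequalities (the first five with equality), and does not show that such a configuration exists.

```python
def saturated_type_counts(k, h):
    """Counts of removed points by type, as listed for a saturated (4,k)."""
    return {
        "xyzw": (k - 1) * (k - 2) * (k - 3) - 6 * h,
        "xxyz": 6 * (k - 1) * (k - 2) + 12 * h,
        "xxyy": 3 * (k - 1) - 3 * h,
        "xxxy": 4 * (k - 1) - 4 * h,
        "xxxx": 1 + h,
    }


def line_constraints(p, k):
    """(left side, right side) of each of the six inequalities, in order."""
    return [
        (4 * p["xyzw"] + 2 * p["xxyz"], 4 * k * (k - 1) * (k - 2)),
        (2 * p["xxyz"] + 4 * p["xxyy"] + 3 * p["xxxy"], 12 * k * (k - 1)),
        (p["xxxy"] + 4 * p["xxxx"], 4 * k),
        (p["xxyz"] + 3 * p["xxxy"], 6 * k * (k - 1)),
        (2 * p["xxyy"] + 6 * p["xxxx"], 6 * k),
        (p["xxxx"], 1),
    ]
```

Note that the line types *xxx and ***x lead to the same inequality, which is why seven line types give only six inequalities.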
<br />
== General lower bounds ==<br />
<br />
There are <math>k^{n-1}</math> disjoint lines *abcd..m, so the density of removed points must be at least 1/k, and the density of retained points at most (k-1)/k. If p is a prime with <math>n \le p \le k</math> then one can get a density of roughly (p-1)/p by deleting the points whose coordinates sum to a multiple of p.<br />
The lower bound (p-1)/p approaches (k-1)/k as <math>k\rightarrow \infty</math>.<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. So the density of deleted points in the optimal configuration is 1/k when k is prime.<br />
<br />
Let p be the smallest prime greater than or equal to both k and n. One can remove all combinatorial lines by deleting all points whose coordinates sum to a value x with <math>0\le x\le p-k</math> (mod p), so the density of deleted points is at most (p-k+1)/p. This approaches zero as <math>k\rightarrow\infty</math>: for example, the following paper shows that for all sufficiently large x there is a prime between <math>x-x^{0.525}</math> and x.<br />
<br />
Baker, R. C.; Harman, G.; Pintz, J.<br />
The difference between consecutive primes. II.<br />
Proc. London Math. Soc. (3) 83 (2001), no. 3, 532–562.<br />
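The window construction with a prime p at least max(k, n) can also be checked exhaustively for small cases. The sketch below uses hypothetical helper names and coordinates 1, …, k, and deletes every point whose coordinate sum lies in {0, …, p-k} modulo p. Along a line with m < p wildcards the k sums are distinct modulo p, so they cannot avoid a window of p-k+1 residues.

```python
from itertools import product


def smallest_prime_at_least(m):
    """Smallest prime p with p >= m."""
    def is_prime(q):
        return q >= 2 and all(q % d for d in range(2, int(q ** 0.5) + 1))
    while not is_prime(m):
        m += 1
    return m


def combinatorial_lines(n, k):
    """Yield every combinatorial line of [k]^n, coordinates in 1..k."""
    for template in product(list(range(1, k + 1)) + ["*"], repeat=n):
        if "*" in template:
            yield [tuple(x if t == "*" else t for t in template)
                   for x in range(1, k + 1)]


def window_deleted(n, k):
    """Points whose coordinate sum mod p lies in the window 0..p-k."""
    p = smallest_prime_at_least(max(k, n))
    return {pt for pt in product(range(1, k + 1), repeat=n)
            if sum(pt) % p <= p - k}


def window_hits_every_line(n, k):
    """True if the window deletion removes a point from every line."""
    deleted = window_deleted(n, k)
    return all(any(pt in deleted for pt in line)
               for line in combinatorial_lines(n, k))
```

When k is itself a prime at least n, the window is the single residue 0 and this reduces to the construction of the previous paragraph.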
<br />
I think these results can be used to get lower bounds on line-free sets for large n for all values of k. For any k and any n we can find a prime p relatively close to k^n. We then remove the first k+1 values mod p, keep one value, remove the next k+1 values, and so on, so that we only keep k+2, 2k+3, etc.<br />
The idea is to prevent any two kept values from lying on a line, because two points on a combinatorial line increase by at most k. This has density 1/k so we have<br />
a line-free density of 1/(k+1).<br />
<br />
I think the above bound could possibly be improved. First, concentrate most of the set around the point with equal numbers of ones, twos and threes (or the point with values closest to equality); the standard deviation should be something like the square root of n. Then we could apply the near-prime construction with sets c(k^.5 + 1) and get a density of roughly c/k^.5<br />
which I think will be better than the Behrend-Elkin construction as e^-x will eventually be less than 1/x as x increases without limit and the square root of k will increase without limit.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Higher-dimensional_DHJ_numbersHigher-dimensional DHJ numbers2009-03-31T04:25:56Z<p>121.220.134.232: /* c(4,k) */</p>
<hr />
<div>For any n, k let <math>c_{n,k}</math> denote the cardinality of the largest subset of <math>[k]^n</math> that does not contain a combinatorial line. When k=3, the quantity <math>c_{n,k} = c_n</math> is studied for instance in [[upper and lower bounds|this page]]. The [[DHJ|density Hales-Jewett theorem]] asserts that for any fixed k, <math>\lim_{n \to \infty} c_{n,k} / k^n = 0</math>.<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_{n,k}</math> values for different values of the dimension n and edge length k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{| border=1 | <br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
We trivially have<br />
:<math>c_{n,1} = 0</math> for n > 0 (and <math>c_{0,1}=1</math>)<br />
and [[Sperner's theorem]] tells us that<br />
:<math>c_{n,2} = \binom{n}{\lfloor n/2\rfloor}</math>.<br />
<br />
Now we look at the opposite regime, in which n is small and k is large. We easily have<br />
:<math>c_{0,k} = 1</math><br />
and<br />
:<math>c_{1,k} = k-1</math>;<br />
<br />
together with the trivial bound<br />
:<math>c_{n+1,k} \leq k c_{n,k}</math><br />
<br />
this implies that<br />
:<math>c_{n,k} \leq (k-1) k^{n-1}</math><br />
for any <math>n \geq 1</math>. Let us call a pair (n,k) with n > 0 ''saturated'' if <math>c_{n,k} = (k-1) k^{n-1}</math>, thus there exists a line-free set with exactly one point omitted from every row and column. <br />
<br />
'''Question''': Which pairs (n,k) are saturated?<br />
<br />
From the above discussion we see that (1,k) is saturated for all k >= 1, and (n,1) is (rather trivially) saturated for all n. Sperner's theorem tells us that (n,2) is saturated only for n= 1, 2. Note that if (n,k) is unsaturated then (n',k) will be unsaturated for all n' > n. <br />
<br />
== (2,k) is saturated when k is at least 1 ==<br />
<br />
In dimension two the maximal set size is k(k-1). It is attained by removing the diagonal points 11, 22, 33, …, kk: this destroys every line, and since the k rows are disjoint lines, at least k points must be removed, so the removal is minimal.<br />
<br />
In any optimal solution the k missing points lie one per row and one per column,<br />
so their y-coordinates are a rearrangement of their x-coordinates.<br />
There are k! rearrangements of the numbers 1 to k.<br />
The k points must include a point on the diagonal, so this rearrangement is not a derangement. There are approximately k!/e derangements of the numbers 1 to k (the exact count is the integer nearest to k!/e), so there are approximately k!(1-1/e) optimal solutions.<br />
<br />
The number of optimal solutions is [http://www.research.att.com/~njas/sequences/A002467 this sequence].<br />
<br />
== (3,k) is saturated when k is at least 3 ==<br />
<br />
Let S be a latin square of side k on the symbols 1…k, with colour i in position (i,i). (This is not possible for k=2.)<br />
<br />
Let axis one in S correspond to coordinate 1 in [k]^3, axis two to coordinate 2 and interpret the colour in position (i,j) as the third coordinate. Delete the points so defined.<br />
<br />
The line with three wild cards has now been removed.<br />
A line with two wildcards will be missing the point corresponding to the diagonal in S.<br />
A line with a single wildcard will be missing a point corresponding to an off diagonal point in S.<br />
<br />
Something similar should work in higher dimensions if one can find latin cubes etc with the right diagonal properties.<br />
<br />
== (n,k) is saturated when all prime divisors of k are at least n ==<br />
<br />
First consider the case when k is prime and at least n: Delete those points whose coordinates add up to a multiple of k.<br />
Every combinatorial line has exactly one point deleted, except for the main diagonal (the line in which every coordinate is a wildcard) when n=k, which has all of its points deleted.<br />
<br />
Now consider for instance the case (n,k) = (4,35). Select one value modulo 35 and delete every point whose coordinates sum to that value modulo 35.<br />
The coordinate sums along a combinatorial line with one, two, three or four moving coordinates<br />
realize all values modulo 35, since one, two, three and four are units modulo 35; thus every line loses a point and (4,35) is saturated.<br />
<br />
The same argument tells us that (n,k) is saturated when all prime divisors of k are at least n.<br />
<br />
On the other hand, computer data shows that (4,4) and (4,6) are not saturated.<br />
<br />
== (4,k) is saturated for odd k ==<br />
<br />
If k is odd, then (4,k) is saturated:<br />
<br />
Delete all xxxx points, and all points xxyz and xyzw whose coordinates add up to a multiple of k.<br />
<br />
<br />
There are five types of points: xyzw, xxyz, xxyy, xxxy and xxxx. Let p(xyzw) be the number of points removed whose coordinates are all different, and so on. <br />
<br />
There are seven types of line: *xyz, *xxy, *xxx, **xy, **xx, ***x, ****. Enough points must be removed to remove all lines. That leads to the following inequalities<br />
* <math> 4p(xyzw)+2p(xxyz) \ge 4k(k-1)(k-2)</math><br />
* <math> 2p(xxyz)+4p(xxyy)+3p(xxxy) \ge 12k(k-1)</math><br />
* <math> p(xxxy)+4p(xxxx) \ge 4k </math><br />
* <math> p(xxyz)+3p(xxxy) \ge 6k(k-1) </math><br />
* <math> 2p(xxyy)+6p(xxxx) \ge 6k </math><br />
* <math> p(xxxx) \ge 1</math><br />
<br />
If (4,k) is saturated, then for some h between 0 and k-1 inclusive the k^3 missing points fall into the following types<br />
* (k-1)(k-2)(k-3) - 6h of type xyzw<br />
* 6(k-1)(k-2) + 12h of type xxyz<br />
* 3(k-1) - 3h of type xxyy<br />
* 4(k-1) - 4h of type xxxy<br />
* 1 + h of type xxxx<br />
<br />
== General lower bounds ==<br />
<br />
There are <math>k^{n-1}</math> disjoint lines *abcd..m, so the density of removed points must be at least 1/k, and the density of retained points at most (k-1)/k. If p is a prime with <math>n \le p \le k</math> then one can get a density of roughly (p-1)/p by deleting the points whose coordinates sum to a multiple of p.<br />
The lower bound (p-1)/p approaches (k-1)/k as <math>k\rightarrow \infty</math>.<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. So the density of deleted points in the optimal configuration is 1/k when k is prime.<br />
<br />
Let p be the smallest prime greater than or equal to both k and n. One can remove all combinatorial lines by deleting all points whose coordinates sum to a value x with <math>0\le x\le p-k</math> (mod p), so the density of deleted points is at most (p-k+1)/p. This approaches zero as <math>k\rightarrow\infty</math>: for example, the following paper shows that for all sufficiently large x there is a prime between <math>x-x^{0.525}</math> and x.<br />
<br />
Baker, R. C.; Harman, G.; Pintz, J.<br />
The difference between consecutive primes. II.<br />
Proc. London Math. Soc. (3) 83 (2001), no. 3, 532–562.<br />
<br />
I think these results can be used to get lower bounds on line-free sets for large n for all values of k. For any k and any n we can find a prime p relatively close to k^n. We then remove the first k+1 values mod p, keep one value, remove the next k+1 values, and so on, so that we only keep k+2, 2k+3, etc.<br />
The idea is to prevent any two kept values from lying on a line, because two points on a combinatorial line increase by at most k. This has density 1/k so we have<br />
a line-free density of 1/(k+1).<br />
<br />
I think the above bound could possibly be improved. First, concentrate most of the set around the point with equal numbers of ones, twos and threes (or the point with values closest to equality); the standard deviation should be something like the square root of n. Then we could apply the near-prime construction with sets c(k^.5 + 1) and get a density of roughly c/k^.5<br />
which I think will be better than the Behrend-Elkin construction as e^-x will eventually be less than 1/x as x increases without limit and the square root of k will increase without limit.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Higher-dimensional_DHJ_numbersHigher-dimensional DHJ numbers2009-03-31T02:40:44Z<p>121.220.134.232: </p>
<hr />
<div>For any n, k let <math>c_{n,k}</math> denote the cardinality of the largest subset of <math>[k]^n</math> that does not contain a combinatorial line. When k=3, the quantity <math>c_{n,k} = c_n</math> is studied for instance in [[upper and lower bounds|this page]]. The [[DHJ|density Hales-Jewett theorem]] asserts that for any fixed k, <math>\lim_{n \to \infty} c_{n,k} / k^n = 0</math>.<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_{n,k}</math> values for different values of the dimension n and edge length k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{| border=1 | <br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
We trivially have<br />
:<math>c_{n,1} = 0</math> for n > 0 (and <math>c_{0,1}=1</math>)<br />
and [[Sperner's theorem]] tells us that<br />
:<math>c_{n,2} = \binom{n}{\lfloor n/2\rfloor}</math>.<br />
<br />
Now we look at the opposite regime, in which n is small and k is large. We easily have<br />
:<math>c_{0,k} = 1</math><br />
and<br />
:<math>c_{1,k} = k-1</math>;<br />
<br />
together with the trivial bound<br />
:<math>c_{n+1,k} \leq k c_{n,k}</math><br />
<br />
this implies that<br />
:<math>c_{n,k} \leq (k-1) k^{n-1}</math><br />
for any <math>n \geq 1</math>. Let us call a pair (n,k) with n > 0 ''saturated'' if <math>c_{n,k} = (k-1) k^{n-1}</math>, thus there exists a line-free set with exactly one point omitted from every row and column. <br />
<br />
'''Question''': Which pairs (n,k) are saturated?<br />
<br />
From the above discussion we see that (1,k) is saturated for all k >= 1, and (n,1) is (rather trivially) saturated for all n. Sperner's theorem tells us that (n,2) is saturated only for n= 1, 2. Note that if (n,k) is unsaturated then (n',k) will be unsaturated for all n' > n. <br />
<br />
== (2,k) is saturated when k is at least 1 ==<br />
<br />
In dimension two the maximal set size is k(k-1). It is attained by removing the diagonal points 11, 22, 33, …, kk: this destroys every line, and since the k rows are disjoint lines, at least k points must be removed, so the removal is minimal.<br />
<br />
In any optimal solution the k missing points lie one per row and one per column,<br />
so their y-coordinates are a rearrangement of their x-coordinates.<br />
There are k! rearrangements of the numbers 1 to k.<br />
The k points must include a point on the diagonal, so this rearrangement is not a derangement. There are approximately k!/e derangements of the numbers 1 to k (the exact count is the integer nearest to k!/e), so there are approximately k!(1-1/e) optimal solutions.<br />
<br />
The number of optimal solutions is [http://www.research.att.com/~njas/sequences/A002467 this sequence].<br />
<br />
== (3,k) is saturated when k is at least 3 ==<br />
<br />
Let S be a latin square of side k on the symbols 1…k, with colour i in position (i,i). (This is not possible for k=2.)<br />
<br />
Let axis one in S correspond to coordinate 1 in [k]^3, axis two to coordinate 2 and interpret the colour in position (i,j) as the third coordinate. Delete the points so defined.<br />
<br />
The line with three wild cards has now been removed.<br />
A line with two wildcards will be missing the point corresponding to the diagonal in S.<br />
A line with a single wildcard will be missing a point corresponding to an off diagonal point in S.<br />
<br />
Something similar should work in higher dimensions if one can find latin cubes etc with the right diagonal properties.<br />
<br />
== (n,k) is saturated when all prime divisors of k are at least n ==<br />
<br />
First consider the case when k is prime and at least n: Delete those points whose coordinates add up to a multiple of k.<br />
Every combinatorial line has exactly one point deleted, except for the main diagonal (the line in which every coordinate is a wildcard) when n=k, which has all of its points deleted.<br />
<br />
Now consider for instance the case (n,k) = (4,35). Select one value modulo 35 and delete every point whose coordinates sum to that value modulo 35.<br />
The coordinate sums along a combinatorial line with one, two, three or four moving coordinates<br />
realize all values modulo 35, since one, two, three and four are units modulo 35; thus every line loses a point and (4,35) is saturated.<br />
<br />
The same argument tells us that (n,k) is saturated when all prime divisors of k are at least n.<br />
<br />
On the other hand, computer data shows that (4,4) and (4,6) are not saturated.<br />
<br />
== c(4,k) ==<br />
<br />
There are five types of points: xyzw, xxyz, xxyy, xxxy and xxxx. Let p(xyzw) be the number of points removed whose coordinates are all different, and so on. <br />
<br />
There are seven types of line: *xyz, *xxy, *xxx, **xy, **xx, ***x, ****. Enough points must be removed to remove all lines. That leads to the following inequalities<br />
* <math> 4p(xyzw)+2p(xxyz) \ge 4k(k-1)(k-2)</math><br />
* <math> 2p(xxyz)+4p(xxyy)+3p(xxxy) \ge 12k(k-1)</math><br />
* <math> p(xxxy)+4p(xxxx) \ge 4k </math><br />
* <math> p(xxyz)+3p(xxxy) \ge 6k(k-1) </math><br />
* <math> 2p(xxyy)+6p(xxxx) \ge 6k </math><br />
* <math> p(xxxx) \ge 1</math><br />
<br />
If (4,k) is saturated, then for some h between 0 and k-1 inclusive the k^3 missing points fall into the following types<br />
* (k-1)(k-2)(k-3) - 6h of type xyzw<br />
* 6(k-1)(k-2) + 12h of type xxyz<br />
* 3(k-1) - 3h of type xxyy<br />
* 4(k-1) - 4h of type xxxy<br />
* 1 + h of type xxxx<br />
<br />
== General lower bounds ==<br />
<br />
There are <math>k^{n-1}</math> disjoint lines *abcd..m, so the density of removed points must be at least 1/k, and the density of retained points at most (k-1)/k. If p is a prime with <math>n \le p \le k</math> then one can get a density of roughly (p-1)/p by deleting the points whose coordinates sum to a multiple of p.<br />
The lower bound (p-1)/p approaches (k-1)/k as <math>k\rightarrow \infty</math>.<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. So the density of deleted points in the optimal configuration is 1/k when k is prime.<br />
<br />
Let p be the smallest prime greater than or equal to both k and n. One can remove all combinatorial lines by deleting all points whose coordinates sum to a value x with <math>0\le x\le p-k</math> (mod p), so the density of deleted points is at most (p-k+1)/p. This approaches zero as <math>k\rightarrow\infty</math>: for example, the following paper shows that for all sufficiently large x there is a prime between <math>x-x^{0.525}</math> and x.<br />
<br />
Baker, R. C.; Harman, G.; Pintz, J.<br />
The difference between consecutive primes. II.<br />
Proc. London Math. Soc. (3) 83 (2001), no. 3, 532–562.<br />
<br />
I think these results can be used to get lower bounds on line-free sets for large n for all values of k. For any k and any n we can find a prime p relatively close to k^n. We then remove the first k+1 values mod p, keep one value, remove the next k+1 values, and so on, so that we only keep k+2, 2k+3, etc.<br />
The idea is to prevent any two kept values from lying on a line, because two points on a combinatorial line increase by at most k. This has density 1/k so we have<br />
a line-free density of 1/(k+1).<br />
<br />
I think the above bound could possibly be improved. First, concentrate most of the set around the point with equal numbers of ones, twos and threes (or the point with values closest to equality); the standard deviation should be something like the square root of n. Then we could apply the near-prime construction with sets c(k^.5 + 1) and get a density of roughly c/k^.5<br />
which I think will be better than the Behrend-Elkin construction as e^-x will eventually be less than 1/x as x increases without limit and the square root of k will increase without limit.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Talk:Fujimura%27s_problemTalk:Fujimura's problem2009-03-29T10:22:40Z<p>121.220.134.232: /* General n */</p>
<hr />
<div>Let <math>\overline{c}^\mu_{n,4}</math> be the size of the largest subset of the tetrahedral grid:<br />
<br />
:<math> \{ (a,b,c,d) \in {\Bbb Z}_+^4: a+b+c+d=n \}</math><br />
<br />
which contains no tetrahedrons <math>(a+r,b,c,d), (a,b+r,c,d), (a,b,c+r,d), (a,b,c,d+r)</math> with <math>r > 0</math>; call such sets ''tetrahedron-free''. <br />
<br />
These are the currently known values of the sequence:<br />
<br />
{|<br />
| n || 0 || 1 || 2<br />
|-<br />
| <math>\overline{c}^\mu_{n,4}</math> || 1 || 3 || 7<br />
|}<br />
<br />
== n=0 ==<br />
<br />
<math>\overline{c}^\mu_{0,4} = 1</math>:<br />
<br />
There are no tetrahedrons, so no removals are needed.<br />
<br />
== n=1 ==<br />
<br />
<math>\overline{c}^\mu_{1,4} = 3</math>:<br />
<br />
Removing any one point on the grid will leave the set tetrahedron-free.<br />
<br />
== n=2 ==<br />
<br />
<math>\overline{c}^\mu_{2,4} = 7</math>:<br />
<br />
Suppose the set can be made tetrahedron-free with two removals. One of (2,0,0,0), (0,2,0,0), (0,0,2,0), and (0,0,0,2) must be removed. Removing any one of the four leaves three tetrahedrons still to be destroyed. However, no single point lies on all three of these tetrahedrons, so more than two removals are needed.<br />
<br />
Three removals (for example (0,0,0,2), (1,1,0,0) and (0,0,2,0)) leave the set tetrahedron-free with a set size of 7.<br />
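The three small cases can be confirmed by exhaustive search. The following is a sketch with hypothetical helper names: it enumerates the grid and all tetrahedrons, then tries deletion sets of increasing size until every tetrahedron has lost a vertex. It is only practical for very small n.

```python
from itertools import combinations, product


def grid(n):
    """All (a, b, c, d) with non-negative entries summing to n."""
    return [p for p in product(range(n + 1), repeat=4) if sum(p) == n]


def tetrahedrons(n):
    """All tetrahedrons (a+r,b,c,d), (a,b+r,c,d), (a,b,c+r,d), (a,b,c,d+r), r > 0."""
    tets = []
    for r in range(1, n + 1):
        for base in grid(n - r):
            tets.append([tuple(base[j] + (r if j == i else 0) for j in range(4))
                         for i in range(4)])
    return tets


def max_tetrahedron_free(n):
    """Size of the largest tetrahedron-free subset of the grid (brute force)."""
    points = grid(n)
    tets = tetrahedrons(n)
    for removals in range(len(points) + 1):
        for removed in combinations(points, removals):
            removed = set(removed)
            if all(any(v in removed for v in t) for t in tets):
                return len(points) - removals
    return 0
```

For n = 0, 1, 2 this reproduces the table values 1, 3, 7.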
<br />
== General n ==<br />
<br />
A lower bound of 2(n-1)(n-2) can be obtained by keeping all points with exactly one coordinate equal to zero.<br />
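A quick check of this bound, as a sketch with hypothetical helper names: collect the grid points with exactly one zero coordinate, confirm that there are 2(n-1)(n-2) of them, and confirm that no tetrahedron has all four vertices in the set.

```python
from itertools import product


def grid(n):
    """All (a, b, c, d) with non-negative entries summing to n."""
    return [p for p in product(range(n + 1), repeat=4) if sum(p) == n]


def tetrahedrons(n):
    """All tetrahedrons of the grid, each as a list of four vertices."""
    tets = []
    for r in range(1, n + 1):
        for base in grid(n - r):
            tets.append([tuple(base[j] + (r if j == i else 0) for j in range(4))
                         for i in range(4)])
    return tets


def one_zero_points(n):
    """Grid points with exactly one coordinate equal to zero."""
    return [p for p in grid(n) if p.count(0) == 1]


def is_tetrahedron_free(points, n):
    """True if no tetrahedron of the grid lies entirely inside `points`."""
    kept = set(points)
    return not any(all(v in kept for v in t) for t in tetrahedrons(n))
```

For n = 1 and n = 2 the set is empty, so the bound only becomes informative from n = 3 onwards.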
<br />
You get a non-constructive quadratic lower bound for the quadruple problem by taking a random subset of size <math>cn^2</math>. If c is not too large the linearity of expectation shows that the expected number of tetrahedrons in such a set is less than one, and so there must be a set of that size with no tetrahedrons. I think <math> c = \frac{24^{1/4}}{6} + o(\frac{1}{n})</math>.<br />
<br />
With coordinates (a,b,c,d), take the value a+2b+3c. Over the four vertices of any of the tetrahedrons we are looking for, these values form an arithmetic progression of length 4. So we can take subsets of the form a+2b+3c=k, where k comes from a set with no such arithmetic progressions. [http://arxiv.org/PS_cache/arxiv/pdf/0811/0811.3057v2.pdf This paper] gives a complicated formula for the possible number of subsets.<br />
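The arithmetic progression argument can be illustrated as follows. This is a sketch only: the greedy choice of a set without 4-term progressions is an assumption made for the example and is not the construction from the linked paper, and all function names are hypothetical.

```python
from itertools import product


def grid(n):
    """All (a, b, c, d) with non-negative entries summing to n."""
    return [p for p in product(range(n + 1), repeat=4) if sum(p) == n]


def tetrahedrons(n):
    """All tetrahedrons of the grid, each as a list of four vertices."""
    tets = []
    for r in range(1, n + 1):
        for base in grid(n - r):
            tets.append([tuple(base[j] + (r if j == i else 0) for j in range(4))
                         for i in range(4)])
    return tets


def weight(p):
    """The value a + 2b + 3c of a grid point (a, b, c, d)."""
    a, b, c, _ = p
    return a + 2 * b + 3 * c


def greedy_ap4_free(limit):
    """Greedily choose integers in [0, limit] with no 4-term progression."""
    chosen = set()
    for x in range(limit + 1):
        # x is the largest element so far, so it could only complete a
        # progression as its top term: x-3s, x-2s, x-s, x.
        if not any(x - s in chosen and x - 2 * s in chosen and x - 3 * s in chosen
                   for s in range(1, x // 3 + 1)):
            chosen.add(x)
    return chosen


def ap_construction(n):
    """Grid points whose weight lies in a set free of 4-term progressions."""
    allowed = greedy_ap4_free(3 * n)  # weights range from 0 to 3n
    return [p for p in grid(n) if weight(p) in allowed]
```

Since the four vertex weights of a tetrahedron with parameter r are v, v+r, v+2r and v+3r, a set built this way cannot contain all four vertices of any tetrahedron.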
<br />
One upper bound can be found by counting tetrahedrons. For a given n the tetrahedral grid has <math>\frac{1}{24}n(n+1)(n+2)(n+3)</math> tetrahedrons. Each point on the grid is part of n tetrahedrons, so <math>\frac{1}{24}(n+1)(n+2)(n+3)</math> points must be removed to remove all tetrahedrons. This gives an upper bound of <math>\frac{1}{8}(n+1)(n+2)(n+3)</math>.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Talk:Fujimura%27s_problemTalk:Fujimura's problem2009-03-29T10:17:04Z<p>121.220.134.232: /* General n */</p>
<hr />
<div>Let <math>\overline{c}^\mu_{n,4}</math> be the size of the largest subset of the tetrahedral grid:<br />
<br />
:<math> \{ (a,b,c,d) \in {\Bbb Z}_+^4: a+b+c+d=n \}</math><br />
<br />
which contains no tetrahedrons <math>(a+r,b,c,d), (a,b+r,c,d), (a,b,c+r,d), (a,b,c,d+r)</math> with <math>r > 0</math>; call such sets ''tetrahedron-free''. <br />
<br />
These are the currently known values of the sequence:<br />
<br />
{|<br />
| n || 0 || 1 || 2<br />
|-<br />
| <math>\overline{c}^\mu_{n,4}</math> || 1 || 3 || 7<br />
|}<br />
<br />
== n=0 ==<br />
<br />
<math>\overline{c}^\mu_{0,4} = 1</math>:<br />
<br />
There are no tetrahedrons, so no removals are needed.<br />
<br />
== n=1 ==<br />
<br />
<math>\overline{c}^\mu_{1,4} = 3</math>:<br />
<br />
Removing any one point on the grid will leave the set tetrahedron-free.<br />
<br />
== n=2 ==<br />
<br />
<math>\overline{c}^\mu_{2,4} = 7</math>:<br />
<br />
Suppose the set can be made tetrahedron-free with two removals. One of (2,0,0,0), (0,2,0,0), (0,0,2,0), and (0,0,0,2) must be removed. Removing any one of the four leaves three tetrahedrons still to be destroyed. However, no single point lies on all three of these tetrahedrons, so more than two removals are needed.<br />
<br />
Three removals (for example (0,0,0,2), (1,1,0,0) and (0,0,2,0)) leave the set tetrahedron-free with a set size of 7.<br />
<br />
== General n ==<br />
<br />
A lower bound of 2(n-1)(n-2) can be obtained by keeping all points with exactly one coordinate equal to zero.<br />
<br />
You get a non-constructive quadratic lower bound for the quadruple problem by taking a random subset of size <math>cn^2</math>. If c is not too large the linearity of expectation shows that the expected number of tetrahedrons in such a set is less than one, and so there must be a set of that size with no tetrahedrons. I think <math> c = \frac{24^{1/4}}{6} + o(\frac{1}{n})</math>.<br />
<br />
One upper bound can be found by counting tetrahedrons. For a given n the tetrahedral grid has <math>\frac{1}{24}n(n+1)(n+2)(n+3)</math> tetrahedrons. Each point on the grid is part of n tetrahedrons, so <math>\frac{1}{24}(n+1)(n+2)(n+3)</math> points must be removed to remove all tetrahedrons. This gives an upper bound of <math>\frac{1}{8}(n+1)(n+2)(n+3)</math>.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Talk:Fujimura%27s_problemTalk:Fujimura's problem2009-03-29T10:14:25Z<p>121.220.134.232: /* General n */</p>
<hr />
<div>Let <math>\overline{c}^\mu_{n,4}</math> be the size of the largest subset of the tetrahedral grid:<br />
<br />
:<math> \{ (a,b,c,d) \in {\Bbb Z}_+^4: a+b+c+d=n \}</math><br />
<br />
which contains no tetrahedrons <math>(a+r,b,c,d), (a,b+r,c,d), (a,b,c+r,d), (a,b,c,d+r)</math> with <math>r > 0</math>; call such sets ''tetrahedron-free''. <br />
<br />
These are the currently known values of the sequence:<br />
<br />
{|<br />
| n || 0 || 1 || 2<br />
|-<br />
| <math>\overline{c}^\mu_{n,4}</math> || 1 || 3 || 7<br />
|}<br />
<br />
== n=0 ==<br />
<br />
<math>\overline{c}^\mu_{0,4} = 1</math>:<br />
<br />
There are no tetrahedrons, so no removals are needed.<br />
<br />
== n=1 ==<br />
<br />
<math>\overline{c}^\mu_{1,4} = 3</math>:<br />
<br />
Removing any one point on the grid will leave the set tetrahedron-free.<br />
<br />
== n=2 ==<br />
<br />
<math>\overline{c}^\mu_{2,4} = 7</math>:<br />
<br />
Suppose the set can be made tetrahedron-free with two removals. One of (2,0,0,0), (0,2,0,0), (0,0,2,0), and (0,0,0,2) must be removed. Removing any one of the four leaves three tetrahedrons still to be destroyed. However, no single point lies on all three of these tetrahedrons, so more than two removals are needed.<br />
<br />
Three removals (for example (0,0,0,2), (1,1,0,0) and (0,0,2,0)) leave the set tetrahedron-free with a set size of 7.<br />
<br />
== General n ==<br />
<br />
A lower bound of 2(n-1)(n-2) can be obtained by keeping all points with exactly one coordinate equal to zero.<br />
<br />
You get a non-constructive quadratic lower bound for the quadruple problem by taking a random subset of size <math>cn^2</math>. If c is not too large the linearity of expectation shows that the expected number of tetrahedrons in such a set is less than one, and so there must be a set of that size with no tetrahedrons. I think <math> c = \frac{24^{1/4}}{6} + o(\frac{1}{n})</math>.<br />
<br />
One upper bound can be found by counting tetrahedrons. For a given n the tetrahedral grid has <math>\frac{1}{24}n(n+1)(n+2)(n+3)</math> tetrahedrons. Each point on the grid is part of n tetrahedrons, so <math>\frac{1}{24}(n+1)(n+2)(n+3)</math> points must be removed to remove all tetrahedrons. This gives an upper bound of <math>\frac{1}{8}(n+1)(n+2)(n+3)</math>.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Talk:Fujimura%27s_problemTalk:Fujimura's problem2009-03-28T17:47:52Z<p>121.220.134.232: /* General n */</p>
<hr />
<div>Let <math>\overline{c}^\mu_{n,4}</math> be the size of the largest subset of the tetrahedral grid:<br />
<br />
:<math> \{ (a,b,c,d) \in {\Bbb Z}_+^4: a+b+c+d=n \}</math><br />
<br />
which contains no tetrahedrons <math>(a+r,b,c,d), (a,b+r,c,d), (a,b,c+r,d), (a,b,c,d+r)</math> with <math>r > 0</math>; call such sets ''tetrahedron-free''. <br />
<br />
These are the currently known values of the sequence:<br />
<br />
{|<br />
| n || 0 || 1 || 2<br />
|-<br />
| <math>\overline{c}^\mu_{n,4}</math> || 1 || 3 || 7<br />
|}<br />
<br />
== n=0 ==<br />
<br />
<math>\overline{c}^\mu_{0,4} = 1</math>:<br />
<br />
There are no tetrahedrons, so no removals are needed.<br />
<br />
== n=1 ==<br />
<br />
<math>\overline{c}^\mu_{1,4} = 3</math>:<br />
<br />
Removing any one point on the grid will leave the set tetrahedron-free.<br />
<br />
== n=2 ==<br />
<br />
<math>\overline{c}^\mu_{2,4} = 7</math>:<br />
<br />
Suppose the set can be made tetrahedron-free with only two removals. One of (2,0,0,0), (0,2,0,0), (0,0,2,0), and (0,0,0,2) must be removed, since these four points form a tetrahedron. Removing any one of the four leaves three tetrahedrons still intact. However, no single point lies on all three of these tetrahedrons, so more than two removals are needed.<br />
<br />
Three removals (for example (0,0,0,2), (1,1,0,0) and (0,0,2,0)) leave the set tetrahedron-free, with a set size of 7.<br />
<br />
== General n ==<br />
<br />
A lower bound of 6(n-1) can be obtained by keeping exactly the points with two coordinates equal to zero. Equivalently, one removes the four corners (n,0,0,0), (0,n,0,0), (0,0,n,0), and (0,0,0,n), all "internal points" (points where no coordinate is a zero), and all points with exactly one coordinate equal to zero.<br />
<br />
A lower bound of 2(n-1)(n-2) can be obtained by keeping all points with exactly one coordinate equal to zero.<br />
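<br />
Both constructions can be checked mechanically for small n. The sketch below is illustrative only: the helper names <code>tetra_grid</code>, <code>is_tetrahedron_free</code> and <code>points_with_zeros</code> are assumptions made for this example. It confirms the sizes 6(n-1) and 2(n-1)(n-2) and tests each set for tetrahedrons by direct enumeration.<br />

```python
def tetra_grid(n):
    """All (a, b, c, d) of non-negative integers with a + b + c + d = n."""
    return [(a, b, c, n - a - b - c)
            for a in range(n + 1)
            for b in range(n + 1 - a)
            for c in range(n + 1 - a - b)]

def is_tetrahedron_free(points, n):
    """True if no tetrahedron with r > 0 has all four corners in `points`."""
    chosen = set(points)
    for r in range(1, n + 1):
        for base in tetra_grid(n - r):
            corners = [tuple(x + (r if i == j else 0) for j, x in enumerate(base))
                       for i in range(4)]
            if all(v in chosen for v in corners):
                return False
    return True

def points_with_zeros(n, zeros):
    """Grid points with exactly `zeros` coordinates equal to zero."""
    return [p for p in tetra_grid(n) if p.count(0) == zeros]

for n in range(2, 9):
    two, one = points_with_zeros(n, 2), points_with_zeros(n, 1)
    print(n, len(two), is_tetrahedron_free(two, n), len(one), is_tetrahedron_free(one, n))
```

Note that the two sets cannot simply be combined: for n=3 their union already contains a tetrahedron, for instance the one with corners (1,0,1,1), (0,1,1,1), (0,0,2,1), (0,0,1,2).<br />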
<br />
You get a non-constructive quadratic lower bound for the quadruple problem by taking a random subset of size <math>cn^2</math>. If c is not too large the linearity of expectation shows that the expected number of tetrahedrons in such a set is less than one, and so there must be a set of that size with no tetrahedrons.<br />
<br />
One upper bound can be found by counting tetrahedrons. For a given n the tetrahedral grid has <math>\frac{1}{24}n(n+1)(n+2)(n+3)</math> tetrahedrons. Each point on the grid is part of n tetrahedrons, so <math>\frac{1}{24}(n+1)(n+2)(n+3)</math> points must be removed to remove all tetrahedrons. This gives an upper bound of <math>\frac{1}{8}(n+1)(n+2)(n+3)</math>.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Upper_and_lower_boundsUpper and lower bounds2009-03-26T05:38:56Z<p>121.220.134.232: /* Other k values */</p>
<hr />
<div><center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center><br />
<br />
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762]). A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here]. On this page we record the proofs justifying these bounds.<br />
<br />
<br />
{|<br />
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]<br />
|}<br />
<br />
== Basic constructions ==<br />
<br />
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is<br />
<br />
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)<br />
<br />
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>. The only lines in <math>D_n</math> are those with<br />
<br />
# A number of wildcards equal to a multiple of three;<br />
# The number of 1s unequal to the number of 2s modulo 3.<br />
<br />
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points. We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as<br />
<br />
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')<br />
<br />
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).<br />
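<br />
The cardinality of <math>D_{n,j}</math> and the description of the lines inside <math>D_n</math> can be confirmed by enumeration for small n. This is a sketch under stated assumptions: lines are represented by templates over the symbols 1, 2, 3 and a wildcard x (with at least one x), and the helper names <code>D</code>, <code>line_templates_inside</code> and <code>predicted</code> are illustrative. In <code>predicted</code>, the counts of 1s and 2s are taken over the fixed (non-wildcard) coordinates.<br />

```python
from itertools import product

def D(n, j=0):
    """The set D_{n,j}: points of [3]^n whose coordinate sum is not j mod 3."""
    return {p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != j}

def line_templates_inside(n, j=0):
    """Templates over {1,2,3,x} (x = wildcard) whose three points all lie in D_{n,j}."""
    target, inside = D(n, j), []
    for template in product("123x", repeat=n):
        if "x" not in template:
            continue  # a combinatorial line needs at least one wildcard
        points = [tuple(v if ch == "x" else int(ch) for ch in template)
                  for v in (1, 2, 3)]
        if all(p in target for p in points):
            inside.append("".join(template))
    return inside

def predicted(n):
    """Templates meeting the two listed conditions (case j = 0)."""
    return ["".join(t) for t in product("123x", repeat=n)
            if t.count("x") > 0 and t.count("x") % 3 == 0
            and (t.count("1") - t.count("2")) % 3 != 0]

for n in range(1, 6):
    print(n, len(D(n)), len(line_templates_inside(n)))
```

In particular <math>D_n</math> is line-free for n at most 3, since a line needs at least three wildcards and at least one fixed coordinate.<br />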
<br />
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid<br />
<br />
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \},</math> (2)<br />
<br />
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s. Note that<br />
<br />
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)<br />
<br />
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set<br />
<br />
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)<br />
<br />
is line-free and has cardinality<br />
<br />
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)<br />
<br />
and thus provides a lower bound for <math>c_n</math>:<br />
<br />
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)<br />
<br />
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set of B and applying (6). Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.<br />
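<br />
As an illustration of (5) and of the identity <math>D_n = \Gamma_{B_n}</math>, the following sketch evaluates the multinomial sum and compares the two sets point by point for small n. The helper names (<code>triangular_grid</code>, <code>gamma_size</code>, <code>gamma_set</code>, <code>B_of_D</code>, <code>D</code>) are assumptions made for this example.<br />

```python
from itertools import product
from math import factorial

def triangular_grid(n):
    """The grid Delta_n of triples (a, b, c) with a + b + c = n."""
    return [(a, b, n - a - b) for a in range(n + 1) for b in range(n + 1 - a)]

def gamma_size(B, n):
    """Right-hand side of (6): total size of the slices indexed by B."""
    return sum(factorial(n) // (factorial(a) * factorial(b) * factorial(c))
               for a, b, c in B)

def gamma_set(B, n):
    """Gamma_B: strings whose numbers of 1s, 2s and 3s form a triple in B."""
    wanted = set(B)
    return {p for p in product((1, 2, 3), repeat=n)
            if (p.count(1), p.count(2), p.count(3)) in wanted}

def B_of_D(n):
    """The triples (a, b, c) with a != b mod 3."""
    return [(a, b, c) for a, b, c in triangular_grid(n) if (a - b) % 3 != 0]

def D(n):
    """D_n: points of [3]^n whose coordinate sum is nonzero mod 3."""
    return {p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != 0}

for n in range(1, 7):
    B = B_of_D(n)
    print(n, gamma_size(B, n), gamma_set(B, n) == D(n))
```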
<br />
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)<br />
<br />
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal). But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.<br />
<br />
== The basic upper bound ==<br />
<br />
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound<br />
<br />
:<math>c_{n+1} \leq 3 c_n.</math> (7)<br />
<br />
Note that equality occurs only if one can find an <math>(n+1)</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.<br />
<br />
== n=0 ==<br />
<br />
:<math>c_0=1</math>:<br />
<br />
This is clear.<br />
<br />
== n=1 ==<br />
<br />
:<math>c_1=2</math>:<br />
<br />
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.<br />
<br />
== n=2 ==<br />
<br />
:<math>c_2=6</math>:<br />
<br />
There are four six-element sets in <math>[3]^2</math> which are line-free, which we denote <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math> and are displayed graphically as follows.<br />
<br />
     13 .. 33        .. 23 33        13 23 ..        13 23 ..
 x = 12 22 ..    y = 12 .. 32    z = .. 22 32    w = 12 .. 32
     .. 21 31        11 21 ..        11 .. 31        .. 21 31
<br />
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.<br />
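<br />
For n=1 and n=2 the claims are small enough to verify by brute force over all subsets of a given size. The sketch below is illustrative; the helper names <code>lines</code> and <code>line_free_sets</code> are assumptions, and the approach does not scale beyond very small n.<br />

```python
from itertools import combinations, product

def lines(n):
    """All combinatorial lines of [3]^n, each as a frozenset of three points."""
    found = []
    for template in product("123x", repeat=n):
        if "x" in template:  # x marks a wildcard position
            found.append(frozenset(tuple(v if ch == "x" else int(ch) for ch in template)
                                   for v in (1, 2, 3)))
    return found

def line_free_sets(n, size):
    """All line-free subsets of [3]^n with exactly `size` points."""
    points = list(product((1, 2, 3), repeat=n))
    all_lines = lines(n)
    return [set(s) for s in map(frozenset, combinations(points, size))
            if not any(line <= s for line in all_lines)]

six = line_free_sets(2, 6)
print(len(six), len(line_free_sets(2, 7)))  # 4 0
```

The four sets found are the complements of the three residue classes of the coordinate sum modulo 3 and of the diagonal {11,22,33}.<br />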
<br />
== n=3 ==<br />
<br />
:<math>c_3=18</math>:<br />
<br />
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.<br />
<br />
'''Lemma 1.'''<br />
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.<br />
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,1}=zxy</math>.<br />
<br />
'''Proof'''. We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, a contradiction. By symmetry we may now assume that two of the slices are x and y, which forces the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed. <br />
<br />
The first claim follows by a similar argument to the second.<br />
<math>\Box</math><br />
<br />
== n=4 ==<br />
<br />
:<math>c_4=52</math>:<br />
<br />
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is <math>18+17+17=52</math> which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ zxy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yzx</math>. In fact,<br />
<br />
'''Lemma 2.'''<br />
<br />
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 50-element line-free sets in <math>[3]^4</math> are those formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2, together with the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.<br />
<br />
'''Proof''' It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. The same argument applies when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math>.<br />
<br />
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.<br />
<br />
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.<br />
<br />
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.<br />
<br />
Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1331} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3331} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{3,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=12. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.<br />
<br />
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when all slicings of the 50-point set are of the form 18+16+16 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+16+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices. In particular, exactly 6 points of the 50-point set lie in the remaining portion <math>\{2,3\}^4</math> of the cube.<br />
<br />
Suppose that 3333 was in the set; then since all permutations of 3311, 3331 are known to lie in the set, then 3322, 3332 must lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lie outside the set. This leaves only 5 points in <math>\{2,3\}^4</math>, a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.<br />
<br />
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>. If a=0 then the set lies in <math>D_{4,1}</math> and we are done. If a=6 then the set is exactly X and we are done. Now suppose a=1,2,3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233, 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<br />
The remaining case is when a=4,5. Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lies in the set. By symmetry we may assume that {2233, 3322} lies in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<math>\Box</math><br />
<br />
== n=5 ==<br />
<br />
:<math>c_5=150</math>:<br />
<br />
'''Lemma 3'''. Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.<br />
<br />
'''Proof'''. By rotation we may work with <math>D_5</math>. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math><br />
<br />
Equality can be attained by removing <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>. Thus <math>c_5 \geq 150</math>.<br />
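<br />
This construction is small enough to verify directly. The sketch below is illustrative, with assumed helper names <code>lines</code> and <code>counts</code>: it builds <math>D_5</math>, deletes the four slices, and checks every combinatorial line of <math>[3]^5</math>.<br />

```python
from itertools import product

def lines(n):
    """All combinatorial lines of [3]^n, each as a list of three points."""
    found = []
    for template in product("123x", repeat=n):
        if "x" in template:  # x marks a wildcard position
            found.append([tuple(v if ch == "x" else int(ch) for ch in template)
                          for v in (1, 2, 3)])
    return found

def counts(p):
    """The triple (number of 1s, number of 2s, number of 3s) of a string."""
    return (p.count(1), p.count(2), p.count(3))

removed_slices = {(0, 4, 1), (0, 5, 0), (4, 0, 1), (5, 0, 0)}
D5 = [p for p in product((1, 2, 3), repeat=5) if sum(p) % 3 != 0]
A = {p for p in D5 if counts(p) not in removed_slices}
line_free = not any(all(q in A for q in line) for line in lines(5))
print(len(D5), len(A), line_free)  # 162 150 True
```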
<br />
Another pattern of 150 points is this: Take the 450 points<br />
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,<br />
then select the 150 whose final coordinate is 1. That gives<br />
this many points in each cube:<br />
<br />
17 18 17<br />
<br />
17 17 18<br />
<br />
12 17 17<br />
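<br />
The counts in this grid can be reproduced as follows. This is a sketch under assumptions: the nine cubes are indexed by the first two coordinates of the five-dimensional point, and the grid is printed with the second coordinate running 3, 2, 1 from top to bottom; since the set is invariant under permuting coordinates, the choice of which two coordinates index the cubes does not affect the counts.<br />

```python
from collections import Counter
from itertools import permutations, product

# Slices Gamma_{a,b,c} with (a,b,c) a permutation of (1,2,3) or (0,2,4).
B6 = set(permutations((1, 2, 3))) | set(permutations((0, 2, 4)))

def counts(p):
    """The triple (number of 1s, number of 2s, number of 3s) of a string."""
    return (p.count(1), p.count(2), p.count(3))

S6 = [p for p in product((1, 2, 3), repeat=6) if counts(p) in B6]
S5 = [p[:5] for p in S6 if p[5] == 1]   # keep strings ending in 1, drop that coordinate
cube_sizes = Counter(p[:2] for p in S5)  # group by the first two coordinates

grid = [[cube_sizes[(first, second)] for first in (1, 2, 3)] for second in (3, 2, 1)]
for row in grid:
    print(" ".join(str(v) for v in row))
```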
<br />
'''Lemma 4'''. A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contains at least 51 points.<br />
<br />
'''Proof'''. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which forces the third slice to have at least <math>151-2c_4 = 47</math> points.<br />
<br />
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.<br />
<br />
Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.<br />
<br />
Let's suppose first that jk=10. The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.<br />
<br />
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.<br />
<br />
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. <br />
<br />
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math><br />
<br />
'''Corollary'''. <math>c_5 \leq 152</math><br />
<br />
'''Proof'''. By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math><br />
<br />
<br />
'''Lemma 5''' Any solution with 151 or more points has a slice with at most 49 points.<br />
<br />
'''Proof''' Suppose we have 151 points without a line, and each of three slices has at least 50 points.<br />
<br />
Using earlier notation, we split subsets of <math>[3]^4</math> into nine subsets of <math>[3]^2</math>. <br />
So we think of x,y,z,a,b and c as subsets of a square. Each slice is one of the following.<br />
*<math>D_4 = y'zx,zx'y,xyz</math> (with one or two points removed)<br />
*<math>D_{4,2} = z'xy,xyz,yzx'</math> (with one or two points removed)<br />
*<math>D_{4,1} = xyz,yz'x,zxy'</math> (with one or two points removed)<br />
*<math>X = xyz, ybw, zwc</math><br />
*<math>Y = axw, xyz, wzc</math><br />
*<math>Z = awx, wby, xyz</math><br />
<br />
where a, b and c have four points each.<br />
<br />
     .. 32 33        31 .. 33        .. .. ..
 a = .. 22 23    b = .. .. ..    c = 21 22 ..
     .. .. ..        11 .. 13        11 12 ..
<br />
x', y' and z' are subsets of x, y and z respectively, and have five points each.<br />
<br />
Suppose all three slices are subsets of <math>D_{4,j}</math>. <br />
We can remove at most five points from the full set of three <math>D_{4,j}</math>. <br />
Consider columns 2,3,4,6,7,8. At most two of these columns contain xyz, so one point must be removed from the other four.<br />
This uses up all but one of the removals.<br />
So the slices must be <math>D_{4,2},D_{4,1},D_{4,0}</math> or a cyclic permutation of that.<br />
Then the cube, which contains the first square of slice 1; the fifth square of slice 2; <br />
and the ninth square of slice 3, contains three copies of the same square. <br />
It takes more than one point removed to remove all lines from that cube.<br />
So we can't have all three slices subsets of <math>D_{4,j}</math>.<br />
<br />
Suppose one slice is X,Y or Z, and two others are subsets of <math>D_{4,j}</math>. <br />
We can remove at most three points from the full <math>D_{4,j}</math>.<br />
By symmetry, suppose one slice is X. Consider columns 2,3,4 and 7. They must be cyclic permutations of x,y,z,<br />
and two of them are not xyz, so must lose a point. <br />
Columns 6 and 8 must both lose a point, and we only have 150 points left.<br />
So if one slice is X,Y or Z, the full set contains a line.<br />
<br />
Suppose two slices are from X,Y and Z, and the other is a subset of <math>D_{4,j}</math>. <br />
By symmetry, suppose two slices are X and Y. Columns 3,6,7 and 8 all contain w, and therefore at most 16 points each.<br />
Columns 1,5 and 9 contain a,b, or c, and therefore at most 16 points. <br />
So the total number of points is at most 7*16+2*18 = 148. This contradicts the assumption of 151 points.<br />
<math>\Box</math><br />
<br />
'''Corollary''' <math>c_5 \leq 151 </math><br />
<br />
'''Proof''' By Lemmas 2, 4 and 5, the maximum number of points is 52+50+49=151. <math>\Box</math><br />
<br />
'''Lemma 5.1''' No solution with 151 points contains as a slice the X defined in Lemma 2<br />
<br />
'''Proof''' Suppose one row is X. Another row is <math>D_{4,j}</math>.<br />
<br />
Suppose X is in the first row. Label the other rows with letters from the alphabet.<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
def ghi jkl<br />
<br />
Reslice the array into a left nine, middle nine and right nine. One of these squares<br />
contains 52 points, and it can only be the left nine. One of its three columns contains<br />
18 points, and it can only be its left-hand column, xmd. So m=y and d=z. But none of the <math>D_{4,j}</math> begins with y or z, which is a contradiction. So X is not in the first row.<br />
<br />
So X is in the second or third row. By symmetry, suppose it is in the second row.<br />
<br />
def ghi jkl<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
Again, the left-hand nine must contain 52 points, so it is <math>D_{4,2}</math>.<br />
So either the first row is <math>D_{4,2}</math> or the third row is <math>D_{4,0}</math>.<br />
If the first row is <math>D_{4,2}</math> then the only way to have 50 points in the middle or right-hand nine is if the middle nine is X<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz ybw zwc<br />
<br />
yzx' zwc stu<br />
<br />
In the seventh column, s contains 5 points and in the eighth column, t contains 4 points.<br />
The final row can now contain at most 48 points, and the whole array contains only 52+50+48 = 150 points.<br />
<br />
If the third row is <math>D_{4,0}</math>, then neither the middle nine nor the right-hand nine contains 50 points, by the classification of Lemma 4 and the formulas at the start of Lemma 5.<br />
Again, only 52+49+49 = 150 points are possible.<br />
<br />
A similar argument is possible if X is in the third row; or if X is replaced by Y or Z.<br />
<br />
So when a 151-point set is sliced into three, one slice is <math>D_{4,j}</math> and another slice is 50 points contained in <math>D_{4,k}</math>. <math>\Box</math><br />
<br />
'''Lemma 5.2''' There is no 151-point solution<br />
<br />
'''Proof''' Assume by symmetry that the first row contains 52 points and the second row contains 50.<br />
<br />
If <math>D_{4,1}</math> is in the first row, then the second row must be contained in <math>D_{4,0}</math>. <br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine can contain 52 points, which contradicts the corollary to Lemma 5.<br />
<br />
Suppose the first row contains <math>D_{4,0}</math>. Then the second row is contained in <math>D_{4,2}</math>, otherwise the cubes formed from the nine columns of the diagram would need to remove too many points.<br />
<br />
y'zx zx'y xyz<br />
<br />
z'xy xyz yzx'<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine contains 52 points.<br />
<br />
So the first row contains <math>D_{4,2}</math>, and the second row is contained in <math>D_{4,1}</math>. Two points may be removed from the second row of this diagram.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
def ghi jkl<br />
<br />
Slice it into the left nine, middle nine and right nine. Two of them are contained in <math>D_{4,j}</math><br />
so at least two of def, ghi, and jkl are contained in the corresponding slice of <math>D_{4,0}</math>.<br />
Slice along a different axis, and at least two of dgj,ehk,fil are contained in the corresponding slice of <br />
<math>D_{4,0}</math>. <br />
So eight of the nine squares in the bottom row are contained in the corresponding square of <math>D_{4,0}</math>.<br />
Indeed, slice along other axes, and all points except one are contained within <math>D_{4,0}</math>. <br />
This point is the intersection of all the 49-point slices. <br />
<br />
So, if there is a 151-point solution, then after removal of the specified point, <br />
there is a 150-point solution, within <math>D_{5,j}</math>, whose slices in each direction are 52+50+48.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
One point must be lost from columns 3, 6, 7 and 8, and four more from the major diagonal z'z'z. That leaves 148 points instead of 150.<br />
<br />
So the 150-point solution does not exist with 52+50+48 slices; so the 151 point solution does not exist.<math>\Box</math><br />
<br />
<br />
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.<br />
<br />
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extremisers, with one point per line and different extremisers separated by a line containing “—”.<br />
<br />
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by GNU’s glpsol linear programming solver, which also quickly proves that 150 is the optimum.<br />
<br />
Each variable corresponds to a point in the cube, and the variables are numbered according to the lexicographic ordering of the points. If a variable is 1 then the point is in the set; if it is 0 then it is not in the set.<br />
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.<br />
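<br />
The structure of such a program can be sketched as follows. This is not the format of the linked file, and no solver is invoked: the variable numbering starting at 1 and the helper names <code>var</code>, <code>constraints</code> and <code>is_feasible</code> are assumptions made for illustration. Each constraint is a triple (i, j, k) of variable numbers standing for the inequality <math>x_i + x_j + x_k \leq 2</math>, and the objective is to maximise the sum of all variables.<br />

```python
from itertools import product

N = 5
points = list(product((1, 2, 3), repeat=N))          # product yields lexicographic order
var = {p: i for i, p in enumerate(points, start=1)}   # variable numbers 1..243 (assumed)

# One constraint per combinatorial line: x_i + x_j + x_k <= 2.
constraints = []
for template in product("123x", repeat=N):
    if "x" in template:  # x marks a wildcard position
        constraints.append(tuple(var[tuple(v if ch == "x" else int(ch) for ch in template)]
                                 for v in (1, 2, 3)))

def is_feasible(chosen):
    """True if the 0/1 assignment with x = 1 exactly on `chosen` meets every constraint."""
    ones = {var[p] for p in chosen}
    return all(len(ones.intersection(c)) <= 2 for c in constraints)

# A feasible point of objective value 150: D_5 minus four slices.
removed_slices = {(0, 4, 1), (0, 5, 0), (4, 0, 1), (5, 0, 0)}
candidate = [p for p in points
             if sum(p) % 3 != 0
             and (p.count(1), p.count(2), p.count(3)) not in removed_slices]
print(len(points), len(constraints), len(candidate), is_feasible(candidate))
```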
<br />
== n=6 ==<br />
<br />
:<math>c_6=450</math>:<br />
<br />
The upper bound follows since <math>c_6 \leq 3 c_5</math>. The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).<br />
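<br />
Both the size of this set and the absence of equilateral triangles among its index triples are quick to check. A minimal sketch, with the helper name <code>has_triangle</code> chosen for illustration:<br />

```python
from itertools import permutations
from math import factorial

n = 6
B = set(permutations((0, 2, 4))) | set(permutations((1, 2, 3)))

def has_triangle(B, n):
    """True if B contains (a+r,b,c), (a,b+r,c), (a,b,c+r) for some r > 0."""
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                if (a + r, b, c) in B and (a, b + r, c) in B and (a, b, c + r) in B:
                    return True
    return False

size = sum(factorial(n) // (factorial(a) * factorial(b) * factorial(c)) for a, b, c in B)
print(size, has_triangle(B, n))  # 450 False
```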
<br />
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.<br />
<br />
== n=7 ==<br />
<br />
:<math>1302 \leq c_7 \leq 1348</math>:<br />
<br />
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>. Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.<br />
<br />
The lower bound can be formed by removing 016,106,052,502,151,511,160,610 from <math>D_7</math>.<br />
<br />
'''Lemma 6''' Any line-free subset of <math>D_7</math> has at most 1302 points.<br />
<br />
'''Proof''' Start with the 1458 points of <math>D_7</math>. You must lose:<br />
<br />
* 42 points from (1,2,4),(1,5,1),(4,2,1)<br />
* 42 points from (2,1,4),(2,4,1),(5,1,1)<br />
* 21 points from (0,2,5),(0,5,2),(3,2,2)<br />
* 21 points from (2,0,5),(2,3,2),(5,0,2)<br />
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)<br />
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)<br />
<br />
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.<br />
<math>\Box</math><br />
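<br />
The arithmetic in this count can be reproduced at the level of whole slices. The sketch below is not a proof of the lemma: it only computes, for each listed group, the cheapest way of deleting entire slices so that no equilateral triangle survives inside the group, and these minimum costs agree with the numbers quoted. The helper names <code>slice_size</code>, <code>triangles_within</code> and <code>cheapest_hit</code> are illustrative.<br />

```python
from itertools import combinations
from math import factorial

N = 7

def slice_size(t):
    """Size of the slice Gamma_{a,b,c} in [3]^7."""
    a, b, c = t
    return factorial(N) // (factorial(a) * factorial(b) * factorial(c))

def triangles_within(group):
    """Triangles (a+r,b,c), (a,b+r,c), (a,b,c+r), r > 0, with all corners in `group`."""
    members, found = set(group), []
    for r in range(1, N + 1):
        for a in range(N - r + 1):
            for b in range(N - r - a + 1):
                c = N - r - a - b
                tri = {(a + r, b, c), (a, b + r, c), (a, b, c + r)}
                if tri <= members:
                    found.append(tri)
    return found

def cheapest_hit(group):
    """Minimum total size of whole slices meeting every triangle inside `group`."""
    tris, best = triangles_within(group), None
    for k in range(len(group) + 1):
        for removed in combinations(group, k):
            if all(tri & set(removed) for tri in tris):
                cost = sum(slice_size(t) for t in removed)
                best = cost if best is None else min(best, cost)
    return best

groups = [
    [(1, 2, 4), (1, 5, 1), (4, 2, 1)],
    [(2, 1, 4), (2, 4, 1), (5, 1, 1)],
    [(0, 2, 5), (0, 5, 2), (3, 2, 2)],
    [(2, 0, 5), (2, 3, 2), (5, 0, 2)],
    [(0, 1, 6), (0, 4, 3), (3, 1, 3), (0, 7, 0), (3, 4, 0), (6, 1, 0)],
    [(1, 0, 6), (1, 3, 3), (4, 0, 3), (7, 0, 0), (4, 3, 0), (1, 6, 0)],
]
losses = [cheapest_hit(g) for g in groups]
print(losses, 2 * 3 ** (N - 1) - sum(losses))
```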
<br />
== Larger n ==<br />
<br />
The following construction gives lower bounds for the number of triangle-free points. <br />
It yields of the order of <math>2.7 \sqrt{\log(N)/N}\,3^N</math> points for large N (N ~ 5000).<br />
<br />
It applies when N is a multiple of 3. <br />
* For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M. <br />
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth of the points for N=3M-2 as for N=3M.<br />
<br />
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:<br />
<br />
* <math>c_3</math> from (012) and permutations<br />
* <math>c_6</math> from (123,024) and perms<br />
* <math>c_9</math> from (234,135,045) and perms<br />
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)<br />
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)<br />
<br />
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.<br />
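<br />
As a rough check of the list above, the following sketch sums the slice sizes <math>n!/(a!b!c!)</math> over each listed family and tests the family for equilateral triangles. The function names and the <code>GENERATORS</code> table are illustrative only; the triples are copied from the list, with A=10 and B=11 written out.<br />
<br />
```python
from itertools import permutations
from math import factorial


def slice_size(a, b, c):
    """Number of strings with a 1s, b 2s and c 3s, i.e. n!/(a! b! c!)."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))


def with_permutations(triples):
    """Close a list of (a, b, c) triples under permutation of the entries."""
    return {p for t in triples for p in permutations(t)}


def is_triangle_free(B):
    """True if B contains no triple (a+r,b,c), (a,b+r,c), (a,b,c+r) with r > 0."""
    n = sum(next(iter(B)))
    for r in range(1, n + 1):
        m = n - r
        for a in range(m + 1):
            for b in range(m - a + 1):
                c = m - a - b
                if {(a + r, b, c), (a, b + r, c), (a, b, c + r)} <= B:
                    return False
    return True


# Triples copied from the list above (A = 10, B = 11).
GENERATORS = {
    3: [(0, 1, 2)],
    6: [(1, 2, 3), (0, 2, 4)],
    9: [(2, 3, 4), (1, 3, 5), (0, 4, 5)],
    12: [(3, 4, 5), (2, 4, 6), (1, 5, 6), (0, 2, 10), (0, 5, 7)],
    15: [(4, 5, 6), (3, 5, 7), (2, 6, 7), (1, 3, 11), (1, 6, 8), (0, 4, 11), (0, 7, 8)],
}


def lower_bound(n):
    """Size of the union of the slices generated by the row for c_n."""
    return sum(slice_size(*t) for t in with_permutations(GENERATORS[n]))


if __name__ == "__main__":
    for n in sorted(GENERATORS):
        B = with_permutations(GENERATORS[n])
        print(n, lower_bound(n), is_triangle_free(B))
```
<br />
For n=3 and n=6 this reproduces the values 18 and 450 quoted on this page.<br />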
<br />
A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)<br />
<br />
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:<br />
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1) <br />
* There are also eight strings of points, stretching to the edges of the (abc) triangle:<br />
** For N=6K = 3M<br />
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)<br />
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)<br />
<br />
<br />
An alternate construction:<br />
<br />
First, define the sequence of all non-negative integers whose base-3 expansion contains no digit 1. Then add 1 to every multiple of 3 in this sequence. The resulting sequence does not contain a length-3 arithmetic progression.<br />
<br />
It starts 1,2,7,8,19,20,25,26,55, …<br />
<br />
Second, list all the (abc) triples for which the larger two differ by a number<br />
from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1).<br />
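<br />
The sequence in the first step can be generated and tested mechanically. The sketch below assumes the sequence starts from 0 (so that the first term is 1, matching the terms listed above); the function names are illustrative.<br />
<br />
```python
def has_digit_one_base3(n):
    """True if the base-3 expansion of n contains the digit 1."""
    while n:
        if n % 3 == 1:
            return True
        n //= 3
    return False


def sequence(count):
    """First `count` terms: integers with no base-3 digit 1, with 1 added to multiples of 3."""
    terms = []
    n = 0  # assumption: start from 0 so that the first term is 0 + 1 = 1
    while len(terms) < count:
        if not has_digit_one_base3(n):
            terms.append(n + 1 if n % 3 == 0 else n)
        n += 1
    return terms


def has_three_term_ap(terms):
    """True if some a < b < c in terms satisfy a + c = 2b."""
    values = set(terms)
    ordered = sorted(values)
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if 2 * b - a in values:
                return True
    return False


if __name__ == "__main__":
    print(sequence(9))
    print(has_three_term_ap(sequence(64)))
```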
<br />
== Asymptotics ==<br />
<br />
DHJ(3) is equivalent to the upper bound<br />
<br />
:<math>c_n \leq o(3^n)</math><br />
<br />
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free. From this and the Behrend construction it appears that we have the lower bound<br />
<br />
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}.</math><br />
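<br />
The slice construction just described can be checked by brute force for small n. The sketch below does not use the Behrend set; as a stand-in it takes S to be the integers with no base-3 digit 2, a simple set with no 3-term arithmetic progression. All function names are illustrative.<br />
<br />
```python
from itertools import product


def no_digit_two_base3(m):
    """True if the base-3 expansion of m uses only the digits 0 and 1."""
    while m:
        if m % 3 == 2:
            return False
        m //= 3
    return True


def ap_free_set(limit):
    """A simple AP-free subset of {0, ..., limit} (stand-in for a Behrend set)."""
    return {m for m in range(limit + 1) if no_digit_two_base3(m)}


def construction(n, S):
    """Union of the slices Gamma_{a,b,c} with a + 2b in S (a = #1s, b = #2s)."""
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(1) + 2 * p.count(2) in S}


def is_line_free(points, n):
    """Brute-force test over all combinatorial lines of [3]^n."""
    for template in product((1, 2, 3, 'x'), repeat=n):
        if 'x' not in template:
            continue
        line = [tuple(v if t == 'x' else t for t in template) for v in (1, 2, 3)]
        if all(p in points for p in line):
            return False
    return True


if __name__ == "__main__":
    for n in (3, 4, 5):
        A = construction(n, ap_free_set(3 * n))
        print(n, len(A), is_line_free(A, n))
```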
<br />
More precisely, we have<br />
<br />
:<math>c_n > C 3^{n - 4\sqrt{\log 2}\sqrt{\log n}+\frac 12 \log \log n}</math><br />
for some absolute constant C, and where all logarithms are base-3.<br />
<br />
'''Proof''' For convenience, let n be a multiple of 3. Elkin’s bound gives <math>r_3(\sqrt{n}) > C \sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n})</math>, and let <math>R</math> be a subset of <math>(-3\sqrt{n}/2,3\sqrt{n}/2)</math> without 3-term APs and with size <math>r_3(\sqrt{n})</math>, and with all elements being integer multiples of 3 (again as a matter of convenience). For each <math>r,s\in R</math>, let <math>a = (n-r-s)/3</math>. The set <math>A</math> is the union of all <math>\Gamma_{a,a+r,a+s}</math>. Since all of <math>a, a+r,a+s</math> are between <math>n/3-2\sqrt{n}</math> and <math>n/3+2\sqrt{n}</math>, the size of <math>\Gamma_{a,a+r,a+s}</math> is at least <math>C 3^n / n</math>. Since there are <math>r_3(\sqrt{n})^2</math> choices for r and s, we have a set with size at least<br />
<br />
:<math>C (\sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n}))^2 3^n / n</math>.<br />
<br />
This simplifies to <math>C \sqrt{\log n} \exp_3(n-\alpha \sqrt{\log_3(n)})</math>, where <math>\alpha=4 \sqrt{\log_3(2)}</math>.<br />
<br />
Now suppose that <math>x_i\in \Gamma_{a_i,a_i+r_i,a_i+s_i}</math> is a combinatorial line in the set A. Then <math>(a_i+s_i)-(a_i)=s_i</math> is a 3-term AP contained in R, so the <math>s_i</math> are all the same. Similarly, all of the <math>r_i</math> are the same, and therefore all of the <math>a_i</math> are the same, too. But this implies that the <math>x_i</math> sequence is constant, which means the line is degenerate. <math>\Box</math><br />
<br />
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above gives a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.<br />
<br />
The second large n construction had numerical asymptotics for <math>\log(c_n/3^n)</math> close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.<br />
<br />
== Other k values ==<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_n</math> values for different values of dimension n and edgelength k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{| border=1 | <br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. So the density of deleted points in the optimal configuration is 1/k when k is prime.<br />
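<br />
For small prime k this deletion rule can be verified directly. The sketch below (function names are illustrative) removes the points of <math>[k]^n</math> whose coordinate sum is a multiple of k and checks by brute force that no combinatorial line survives.<br />
<br />
```python
from itertools import product


def lines(n, k):
    """Yield every combinatorial line of [k]^n as a list of k points."""
    symbols = range(1, k + 1)
    for template in product(list(symbols) + ['x'], repeat=n):
        if 'x' in template:
            yield [tuple(v if t == 'x' else t for t in template) for v in symbols]


def surviving_points(n, k):
    """Points of [k]^n whose coordinate sum is not a multiple of k."""
    return {p for p in product(range(1, k + 1), repeat=n) if sum(p) % k != 0}


def is_line_free(points, n, k):
    return not any(all(p in points for p in line) for line in lines(n, k))


if __name__ == "__main__":
    for n, k in [(2, 3), (3, 3), (2, 5), (3, 5), (4, 5), (2, 7), (3, 7)]:
        A = surviving_points(n, k)
        print(n, k, len(A), is_line_free(A, n, k))
```
<br />
The sizes printed are <math>(k-1)k^{n-1}</math>, which agrees with the table entries for k=3, 5, 7 in this range.<br />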
<br />
Let p be the smallest prime greater than or equal to both k and n. One can remove all combinatorial lines by deleting all points whose coordinates sum to <math>0\le x\le p-k</math> (mod p). So the density of deleted points is at most (p-k+1)/p. This approaches zero as <math>k\rightarrow\infty</math>. For example, the following paper shows that for all sufficiently large x there is a prime between <math>x-x^{0.525}</math> and x.<br />
<br />
Baker, R. C.(1-BYU); Harman, G.(4-LNDHB); Pintz, J.(H-AOS)<br />
The difference between consecutive primes. II.<br />
Proc. London Math. Soc. (3) 83 (2001), no. 3, 532–562.<br />
<br />
== Numerical methods ==<br />
<br />
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here]. The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Upper_and_lower_boundsUpper and lower bounds2009-03-26T05:21:59Z<p>121.220.134.232: /* Other k values */</p>
<hr />
<div><center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center><br />
<br />
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762]). A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here]. In this page we record the proofs justifying these bounds.<br />
<br />
<br />
{|<br />
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]<br />
|}<br />
<br />
== Basic constructions ==<br />
<br />
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is<br />
<br />
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)<br />
<br />
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>. The only lines in <math>D_n</math> are those with<br />
<br />
# A number of wildcards equal to a multiple of three;<br />
# The number of 1s unequal to the number of 2s modulo 3.<br />
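<br />
This description of the lines in <math>D_n</math> can be tested by enumeration for small n. The sketch below reads the second condition as referring to the fixed (non-wildcard) coordinates of the line, which is an assumption about the intended meaning; the function names are illustrative.<br />
<br />
```python
from itertools import product


def lines_with_templates(n):
    """Yield (template, points) for every combinatorial line of [3]^n."""
    for template in product((1, 2, 3, 'x'), repeat=n):
        if 'x' in template:
            points = [tuple(v if t == 'x' else t for t in template)
                      for v in (1, 2, 3)]
            yield template, points


def in_D(point):
    """Membership in D_n: coordinate sum not divisible by 3."""
    return sum(point) % 3 != 0


def satisfies_both_conditions(template):
    """Wildcard count divisible by 3, and #1s != #2s (mod 3) among fixed coordinates."""
    wildcards = template.count('x')
    ones, twos = template.count(1), template.count(2)
    return wildcards % 3 == 0 and (ones - twos) % 3 != 0


def characterisation_holds(n):
    """True if a line lies in D_n exactly when both listed conditions hold."""
    return all(
        all(in_D(p) for p in points) == satisfies_both_conditions(template)
        for template, points in lines_with_templates(n)
    )


if __name__ == "__main__":
    print([characterisation_holds(n) for n in range(1, 7)])
```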
<br />
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points. We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as<br />
<br />
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')<br />
<br />
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).<br />
<br />
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid<br />
<br />
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \}</math>. (2)<br />
<br />
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s. Note that<br />
<br />
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)<br />
<br />
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set<br />
<br />
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)<br />
<br />
is line-free and has cardinality<br />
<br />
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)<br />
<br />
and thus provides a lower bound for <math>c_n</math>:<br />
<br />
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)<br />
<br />
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set of B and applying (6). Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.<br />
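<br />
As a small illustration of formula (5), the sketch below evaluates the sum for <math>B = B_n</math> and compares it with <math>|D_n| = 2 \times 3^{n-1}</math>. The function names are illustrative.<br />
<br />
```python
from math import factorial


def multinomial(a, b, c):
    """n!/(a! b! c!) with n = a + b + c, the size of the slice Gamma_{a,b,c}."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))


def B(n):
    """Triples (a, b, c) in Delta_n with a != b (mod 3)."""
    return [(a, b, n - a - b)
            for a in range(n + 1)
            for b in range(n - a + 1)
            if (a - b) % 3 != 0]


def gamma_B_size(triples):
    """Right-hand side of (5): total size of the slices indexed by the triples."""
    return sum(multinomial(*t) for t in triples)


if __name__ == "__main__":
    for n in range(1, 10):
        print(n, gamma_B_size(B(n)), 2 * 3 ** (n - 1))
```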
<br />
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)<br />
<br />
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal). But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.<br />
<br />
== The basic upper bound ==<br />
<br />
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound<br />
<br />
:<math>c_{n+1} \leq 3 c_n.</math> (7)<br />
<br />
Note that equality only occurs if one can find an <math>(n+1)</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.<br />
<br />
== n=0 ==<br />
<br />
:<math>c_0=1</math>:<br />
<br />
This is clear.<br />
<br />
== n=1 ==<br />
<br />
:<math>c_1=2</math>:<br />
<br />
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.<br />
<br />
== n=2 ==<br />
<br />
:<math>c_2=6</math>:<br />
<br />
There are four six-element sets in <math>[3]^2</math> which are line-free, which we denote <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math> and are displayed graphically as follows.<br />
<br />
       13 .. 33          .. 23 33          13 23 ..          13 23 ..<br />
 x =   12 22 ..    y =   12 .. 32    z =   .. 22 32    w =   12 .. 32<br />
       .. 21 31          11 21 ..          11 .. 31          .. 21 31<br />
<br />
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.<br />
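<br />
The n=2 case is small enough to enumerate completely. The following sketch (names are illustrative) lists all line-free subsets of <math>[3]^2</math> of a given size, so that the count of six-element sets and the absence of seven-element sets can be checked.<br />
<br />
```python
from itertools import combinations, product

CUBE = list(product((1, 2, 3), repeat=2))

# The seven combinatorial lines of [3]^2: three rows, three columns, one diagonal.
LINES = [
    [tuple(v if t == 'x' else t for t in template) for v in (1, 2, 3)]
    for template in product((1, 2, 3, 'x'), repeat=2)
    if 'x' in template
]


def is_line_free(points):
    chosen = set(points)
    return not any(all(p in chosen for p in line) for line in LINES)


def line_free_sets(size):
    """All line-free subsets of [3]^2 with the given number of points."""
    return [frozenset(c) for c in combinations(CUBE, size) if is_line_free(c)]


def D2(j):
    """The set D_{2,j}: points whose coordinate sum is not j (mod 3)."""
    return frozenset(p for p in CUBE if sum(p) % 3 != j)


if __name__ == "__main__":
    print(len(line_free_sets(6)), len(line_free_sets(7)))
```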
<br />
== n=3 ==<br />
<br />
:<math>c_3=18</math>:<br />
<br />
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.<br />
<br />
'''Lemma 1.'''<br />
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.<br />
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,1}=zxy</math>.<br />
<br />
'''Proof'''. We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, a contradiction. By symmetry we may now assume that two of the slices are x and y, which forces the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed. <br />
<br />
The first claim follows by a similar argument to the second.<br />
<math>\Box</math><br />
<br />
== n=4 ==<br />
<br />
:<math>c_4=52</math>:<br />
<br />
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is <math>18+17+17=52</math> which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ zxy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yzx</math>. In fact,<br />
<br />
'''Lemma 2.'''<br />
<br />
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 50-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2, or are equal to one of the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.<br />
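<br />
The sets named in Lemma 2 can be built explicitly and tested. The sketch below only confirms that they have the stated sizes and contain no combinatorial line; it does not verify the uniqueness claims. The names are illustrative.<br />
<br />
```python
from itertools import product

CUBE = list(product((1, 2, 3), repeat=4))

LINES = [
    [tuple(v if t == 'x' else t for t in template) for v in (1, 2, 3)]
    for template in product((1, 2, 3, 'x'), repeat=4)
    if 'x' in template
]

DIAGONAL = {(v,) * 4 for v in (1, 2, 3)}


def is_line_free(points):
    return not any(all(p in points for p in line) for line in LINES)


def D4_minus_diagonal(j):
    """D_{4,j} with the points 1111, 2222, 3333 removed."""
    return {p for p in CUBE if sum(p) % 3 != j} - DIAGONAL


# Slice indices (a, b, c) = (#1s, #2s, #3s) making up the set X of Lemma 2.
X_SLICES = {(3, 1, 0), (3, 0, 1), (2, 2, 0), (2, 0, 2),
            (1, 1, 2), (1, 2, 1), (0, 2, 2)}

X = {p for p in CUBE if (p.count(1), p.count(2), p.count(3)) in X_SLICES}

if __name__ == "__main__":
    for j in (0, 1, 2):
        A = D4_minus_diagonal(j)
        print(j, len(A), is_line_free(A))
    print("X", len(X), is_line_free(X))
```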
<br />
'''Proof''' It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. The same argument applies when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math>.<br />
<br />
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.<br />
<br />
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.<br />
<br />
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.<br />
<br />
Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1331} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3331} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{3,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=21. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.<br />
<br />
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when all slicings of the 50-point set are of the form 18+16+16 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+16+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices. In particular, on the remaining portion <math>\{2,3\}^4</math> of the cube, there are exactly 6 points of the 50-point set in <math>\{2,3\}^4</math>.<br />
<br />
Suppose that 3333 was in the set; then since all permutations of 3311, 3331 are known to lie in the set, then 3322, 3332 must lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lie outside the set. This leaves only 5 points in <math>\{2,3\}^4</math>, a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.<br />
<br />
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>. If a=0 then the set lies in <math>D_{4,1}</math> and we are done. If a=6 then the set is exactly X and we are done. Now suppose a=1,2,3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233, 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<br />
The remaining case is when a=4,5. Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lies in the set. By symmetry we may assume that {2233, 3322} lies in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<math>\Box</math><br />
<br />
== n=5 ==<br />
<br />
:<math>c_5=150</math>:<br />
<br />
'''Lemma 3'''. Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.<br />
<br />
'''Proof'''. By rotation we may work with <math>D_5</math>. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math><br />
<br />
Equality can be attained by removing <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>. Thus <math>c_5 \geq 150</math>.<br />
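<br />
The 150-point example above can be confirmed directly. The sketch below builds <math>D_5</math>, removes the four named slices, and checks by brute force that no combinatorial line remains. The names are illustrative.<br />
<br />
```python
from itertools import product

# Slices Gamma_{a,b,c} removed from D_5, indexed by (#1s, #2s, #3s).
REMOVED_SLICES = {(0, 4, 1), (0, 5, 0), (4, 0, 1), (5, 0, 0)}


def slice_index(point):
    return (point.count(1), point.count(2), point.count(3))


def example_set():
    """D_5 with the four slices in REMOVED_SLICES deleted."""
    return {p for p in product((1, 2, 3), repeat=5)
            if sum(p) % 3 != 0 and slice_index(p) not in REMOVED_SLICES}


def is_line_free(points, n=5):
    for template in product((1, 2, 3, 'x'), repeat=n):
        if 'x' not in template:
            continue
        line = [tuple(v if t == 'x' else t for t in template) for v in (1, 2, 3)]
        if all(p in points for p in line):
            return False
    return True


if __name__ == "__main__":
    A = example_set()
    print(len(A), is_line_free(A))
```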
<br />
Another pattern of 150 points is this: Take the 450 points<br />
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,<br />
then select the 150 whose final coordinate is 1. That gives<br />
this many points in each cube:<br />
<br />
17 18 17<br />
<br />
17 17 18<br />
<br />
12 17 17<br />
<br />
'''Lemma 4'''. A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contain at least 51 points.<br />
<br />
'''Proof'''. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which forces the third slice to have at least <math>151-2c_4 = 47</math> points.<br />
<br />
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.<br />
<br />
Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.<br />
<br />
Let's suppose first that jk=10. The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.<br />
<br />
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.<br />
<br />
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. <br />
<br />
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math><br />
<br />
'''Corollary'''. <math>c_5 \leq 152</math><br />
<br />
'''Proof'''. By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math><br />
<br />
<br />
'''Lemma 5''' Any solution with 151 or more points has a slice with at most 49 points.<br />
<br />
'''Proof''' Suppose we have 151 points without a line, and each of three slices has at least 50 points.<br />
<br />
Using earlier notation, we split subsets of <math>[3]^4</math> into nine subsets of <math>[3]^2</math>. <br />
So we think of x,y,z,a,b and c as subsets of a square. Each slice is one of the following.<br />
*<math>D_4 = y'zx,zx'y,xyz</math> (with one or two points removed)<br />
*<math>D_{4,2} = z'xy,xyz,yzx'</math> (with one or two points removed)<br />
*<math>D_{4,1} = xyz,yz'x,zxy'</math> (with one or two points removed)<br />
*<math>X = xyz, ybw, zwc</math><br />
*<math>Y = axw, xyz, wzc</math><br />
*<math>Z = awx, wby, xyz</math><br />
<br />
where a, b and c have four points each.<br />
<br />
       .. 32 33          31 .. 33          .. .. ..<br />
 a =   .. 22 23    b =   .. .. ..    c =   21 22 ..<br />
       .. .. ..          11 .. 13          11 12 ..<br />
<br />
x', y' and z' are subsets of x, y and z respectively, and have five points each.<br />
<br />
Suppose all three slices are subsets of <math>D_{4,j}</math>. <br />
We can remove at most five points from the full set of three <math>D_{4,j}</math>. <br />
Consider columns 2,3,4,6,7,8. At most two of these columns contain xyz, so one point must be removed from the other four.<br />
This uses up all but one of the removals.<br />
So the slices must be <math>D_{4,2},D_{4,1},D_{4,0}</math> or a cyclic permutation of that.<br />
Then the cube, which contains the first square of slice 1; the fifth square of slice 2; <br />
and the ninth square of slice 3, contains three copies of the same square. <br />
It takes more than one point removed to remove all lines from that cube.<br />
So we can't have all three slices subsets of <math>D_{4,j}</math>.<br />
<br />
Suppose one slice is X,Y or Z, and two others are subsets of <math>D_{4,j}</math>. <br />
We can remove at most three points from the full <math>D_{4,j}</math>.<br />
By symmetry, suppose one slice is X. Consider columns 2,3,4 and 7. They must be cyclic permutations of x,y,z,<br />
and two of them are not xyz, so must lose a point. <br />
Columns 6 and 8 must both lose a point, and we only have 150 points left.<br />
So if one slice is X,Y or Z, the full set contains a line.<br />
<br />
Suppose two slices are from X,Y and Z, and the other is a subset of <math>D_{4,j}</math>. <br />
By symmetry, suppose two slices are X and Y. Columns 3,6,7 and 8 all contain w, and therefore at most 16 points each.<br />
Columns 1,5 and 9 contain a,b, or c, and therefore at most 16 points. <br />
So the total number of points is at most 7*16+2*18 = 148. This contradicts the assumption of 151 points.<br />
<math>\Box</math><br />
<br />
'''Corollary''' <math>c_5 \leq 151 </math><br />
<br />
'''Proof''' By Lemmas 2, 4 and 5, the maximum number of points is 52+50+49=151. <math>\Box</math><br />
<br />
'''Lemma 5.1''' No solution with 151 points contains as a slice the set X defined in Lemma 2.<br />
<br />
'''Proof''' Suppose one row is X. Another row is <math>D_{4,j}</math>.<br />
<br />
Suppose X is in the first row. Label the other rows with letters from the alphabet.<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
def ghi jkl<br />
<br />
Reslice the array into a left nine, middle nine and right nine. One of these squares<br />
contains 52 points, and it can only be the left nine. One of its three columns contains<br />
18 points, and it can only be its left-hand column, xmd. So m=y and d=z. But none of the <math>D_{4,j}</math> begins with y or z, which is a contradiction. So X is not in the first row.<br />
<br />
So X is in the second or third row. By symmetry, suppose it is in the second row<br />
<br />
def ghi jkl<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
Again, the left-hand nine must contain 52 points, so it is <math>D_{4,2}</math>.<br />
So either the first row is <math>D_{4,2}</math> or the third row is <math>D_{4,0}</math>.<br />
If the first row is <math>D_{4,2}</math> then the only way to have 50 points in the middle or right-hand nine is if the middle nine is X<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz ybw zwc<br />
<br />
yzx' zwc stu<br />
<br />
In the seventh column, s contains 5 points and in the eighth column, t contains 4 points.<br />
The final row can now contain at most 48 points, and the whole array contains only 52+50+48 = 150 points.<br />
<br />
If the third row is <math>D_{4,0}</math>, then neither the middle nine nor the right-hand nine contains 50 points, by the classification of Lemma 4 and the formulas at the start of Lemma 5.<br />
Again, only 52+49+49 = 150 points are possible.<br />
<br />
A similar argument is possible if X is in the third row; or if X is replaced by Y or Z.<br />
<br />
So when a 151-point set is sliced into three, one slice is <math>D_{4,j}</math> and another slice is 50 points contained in <math>D_{4,k}</math>. <math>\Box</math><br />
<br />
'''Lemma 5.2''' There is no 151-point solution.<br />
<br />
'''Proof''' Assume by symmetry that the first row contains 52 points and the second row contains 50.<br />
<br />
If <math>D_{4,1}</math> is in the first row, then the second row must be contained in <math>D_{4,0}</math>. <br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine can contain 52 points, which contradicts the corollary to Lemma 5.<br />
<br />
Suppose the first row contains <math>D_{4,0}</math>. Then the second row is contained in <math>D_{4,2}</math>, otherwise the cubes formed from the nine columns of the diagram would need to remove too many points.<br />
<br />
y'zx zx'y xyz<br />
<br />
z'xy xyz yzx'<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine contains 52 points.<br />
<br />
So the first row contains <math>D_{4,2}</math>, and the second row is contained in <math>D_{4,1}</math>. Two points may be removed from the second row of this diagram.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
def ghi jkl<br />
<br />
Slice it into the left nine, middle nine and right nine. Two of them are contained in <math>D_{4,j}</math><br />
so at least two of def, ghi, and jkl are contained in the corresponding slice of <math>D_{4,0}</math>.<br />
Slice along a different axis, and at least two of dgj,ehk,fil are contained in the corresponding slice of <br />
<math>D_{4,0}</math>. <br />
So eight of the nine squares in the bottom row are contained in the corresponding square of <math>D_{4,0}</math>.<br />
Indeed, slice along other axes, and all points except one are contained within <math>D_{4,0}</math>. <br />
This point is the intersection of all the 49-point slices. <br />
<br />
So, if there is a 151-point solution, then after removal of the specified point, <br />
there is a 150-point solution, within <math>D_{5,j}</math>, whose slices in each direction are 52+50+48.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
One point must be lost from columns 3, 6, 7 and 8, and four more from the major diagonal z'z'z. That leaves 148 points instead of 150.<br />
<br />
So the 150-point solution does not exist with 52+50+48 slices, and hence the 151-point solution does not exist. <math>\Box</math><br />
<br />
<br />
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.<br />
<br />
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extremisers, one point per line, with different extremisers separated by a line containing “—”.<br />
<br />
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by GNU’s glpsol linear programming solver, which also quickly proves that 150 is the optimum.<br />
<br />
Each variable corresponds to a point in the cube, and the variables are numbered according to the lexicographic ordering of the points. If a variable is 1 then the point is in the set; if it is 0 then it is not in the set.<br />
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.<br />
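<br />
The constraint system just described can be sketched in Python as follows. This is only an illustration: the helper name <code>lp_constraints</code> is hypothetical, the points are indexed from 0 in lexicographic order (an assumption; the linked file may number its variables differently), and the output is not in glpsol's input format.<br />

```python
from itertools import product

def lp_constraints(n, k=3):
    """Return the points of [k]^n in lexicographic order and, for every
    combinatorial line, the sorted variable indices of its k points."""
    alphabet = tuple(range(1, k + 1))
    points = list(product(alphabet, repeat=n))      # lexicographic order
    index = {p: i for i, p in enumerate(points)}    # 0-based numbering (assumption)
    constraints = []
    # A template uses None as the wildcard; it needs at least one wildcard.
    for template in product(alphabet + (None,), repeat=n):
        if None not in template:
            continue
        line = [tuple(v if t is None else t for t in template) for v in alphabet]
        constraints.append(sorted(index[p] for p in line))
    return points, constraints

points, constraints = lp_constraints(5)
print(len(points), "variables,", len(constraints), "line constraints")
```

Each index triple [i, j, l] stands for the inequality x_i + x_j + x_l &le; 2, that is, at least one point of the line is missing.<br />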
<br />
== n=6 ==<br />
<br />
:<math>c_6=450</math>:<br />
<br />
The upper bound follows since <math>c_6 \leq 3 c_5</math>. The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).<br />
<br />
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.<br />
<br />
== n=7 ==<br />
<br />
:<math>1302 \leq c_7 \leq 1348</math>:<br />
<br />
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>. Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.<br />
<br />
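The lower bound can be formed by removing the slices 016,106,052,502,151,511,160,610,070,700 from <math>D_7</math>. (The slices 070 and 700 must be included, since otherwise triangles such as 340,070,043 survive.) The removed slices contain 156 points in total, leaving 1458-156=1302 points.<br />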
<br />
'''Lemma 6''' Any line-free subset of <math>D_7</math> has at most 1302 points.<br />
<br />
'''Proof''' Start with the 1458 points of <math>D_7</math>. You must lose:<br />
<br />
* 42 points from (1,2,4),(1,5,1),(4,2,1)<br />
* 42 points from (2,1,4),(2,4,1),(5,1,1)<br />
* 21 points from (0,2,5),(0,5,2),(3,2,2)<br />
* 21 points from (2,0,5),(2,3,2),(5,0,2)<br />
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)<br />
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)<br />
<br />
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.<br />
<math>\Box</math><br />
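<br />
The losses listed in Lemma 6 can be reproduced at the level of whole slices with the sketch below. It is only a sanity check under a simplifying assumption: it removes entire slices <math>\Gamma_{a,b,c}</math> to break every equilateral triangle inside each listed group, which does not by itself prove the point-level statement of the lemma. The function names are hypothetical.<br />

```python
from itertools import combinations
from math import factorial

def size(t):
    """|Gamma_{a,b,c}| = n! / (a! b! c!)."""
    a, b, c = t
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

def triangles(group):
    """Triangles (a+r,b,c), (a,b+r,c), (a,b,c+r) with r > 0 inside the group."""
    members = set(group)
    found = []
    for (x, b, c) in group:
        for r in range(1, x + 1):
            a = x - r
            corners = [(x, b, c), (a, b + r, c), (a, b, c + r)]
            if all(p in members for p in corners):
                found.append(corners)
    return found

def cheapest_loss(group):
    """Fewest points lost when whole slices are deleted to hit every triangle."""
    tris = triangles(group)
    costs = []
    for m in range(len(group) + 1):
        for removed in combinations(group, m):
            if all(any(p in removed for p in t) for t in tris):
                costs.append(sum(size(p) for p in removed))
    return min(costs)

groups = [
    [(1, 2, 4), (1, 5, 1), (4, 2, 1)],
    [(2, 1, 4), (2, 4, 1), (5, 1, 1)],
    [(0, 2, 5), (0, 5, 2), (3, 2, 2)],
    [(2, 0, 5), (2, 3, 2), (5, 0, 2)],
    [(0, 1, 6), (0, 4, 3), (3, 1, 3), (0, 7, 0), (3, 4, 0), (6, 1, 0)],
    [(1, 0, 6), (1, 3, 3), (4, 0, 3), (7, 0, 0), (4, 3, 0), (1, 6, 0)],
]
losses = [cheapest_loss(g) for g in groups]
remaining = 2 * 3 ** 6 - sum(losses)
print(losses, remaining)
```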
<br />
== Larger n ==<br />
<br />
The following construction gives lower bounds for the number of triangle-free points.<br />
It gives of the order of <math>2.7 \sqrt{\log(N)/N}3^N</math> points for large N (N ~ 5000).<br />
<br />
It applies when N is a multiple of 3. <br />
* For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M. <br />
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth of the points for N=3M-2 as for N=3M.<br />
<br />
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:<br />
<br />
* <math>c_3</math> from (012) and permutations<br />
* <math>c_6</math> from (123,024) and perms<br />
* <math>c_9</math> from (234,135,045) and perms<br />
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)<br />
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)<br />
<br />
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.<br />
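<br />
The rows above can be checked mechanically. The sketch below (function names are hypothetical) expands each row into all permutations, evaluates the cardinality formula, tests whether the resulting set of triples is triangle-free, and checks the "add 1 to the previous row" rule.<br />

```python
from itertools import permutations
from math import factorial

def size(t):
    a, b, c = t
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

def expand(row):
    """All distinct permutations of the triples listed in a row."""
    return {p for t in row for p in permutations(t)}

def is_triangle_free(B):
    """True if B has no triangle (a+r,b,c), (a,b+r,c), (a,b,c+r) with r > 0."""
    for (x, b, c) in B:
        for r in range(1, x + 1):
            a = x - r
            if (a, b + r, c) in B and (a, b, c + r) in B:
                return False
    return True

rows = {
    3: [(0, 1, 2)],
    6: [(1, 2, 3), (0, 2, 4)],
    9: [(2, 3, 4), (1, 3, 5), (0, 4, 5)],
    12: [(3, 4, 5), (2, 4, 6), (1, 5, 6), (0, 2, 10), (0, 5, 7)],
    15: [(4, 5, 6), (3, 5, 7), (2, 6, 7), (1, 3, 11), (1, 6, 8), (0, 4, 11), (0, 7, 8)],
}

bounds = {n: sum(size(t) for t in expand(row)) for n, row in rows.items()}
triangle_free = {n: is_triangle_free(expand(row)) for n, row in rows.items()}

# Each row should contain the previous row with 1 added to every entry.
shift_rule_holds = all(
    tuple(v + 1 for v in t) in rows[n + 3] for n in (3, 6, 9, 12) for t in rows[n]
)

for n in rows:
    print(n, bounds[n], triangle_free[n])
```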
<br />
A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)<br />
<br />
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:<br />
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1) <br />
* There are also eight strings of points, stretching to the edges of the (abc) triangle:<br />
** For N=6K = 3M<br />
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)<br />
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)<br />
<br />
<br />
An alternate construction:<br />
<br />
First define a sequence consisting of all non-negative integers which, in base 3, do not contain the digit 1. Add 1 to all multiples of 3 in this sequence. The resulting sequence does not contain a length-3 arithmetic progression.<br />
<br />
It starts 1,2,7,8,19,20,25,26,55, …<br />
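<br />
A minimal sketch of this sequence, assuming it is built from the integers 0, 2, 6, 8, ... (starting at 0, which is what produces the leading term 1). The function names are hypothetical; the last function brute-forces the absence of 3-term arithmetic progressions.<br />

```python
def base3_digits(m):
    """Base-3 digits of m, least significant first (empty list for m = 0)."""
    digits = []
    while m:
        m, d = divmod(m, 3)
        digits.append(d)
    return digits

def sequence(limit):
    """Terms built from all m < limit whose base-3 expansion has no digit 1;
    multiples of 3 are shifted up by 1."""
    out = []
    for m in range(limit):
        if 1 not in base3_digits(m):
            out.append(m + 1 if m % 3 == 0 else m)
    return out

def has_3ap(terms):
    """True if some x < y < z in terms satisfy z - y = y - x."""
    s = set(terms)
    return any(2 * y - x in s for x in s for y in s if y > x)

print(sequence(60))
print(has_3ap(sequence(3 ** 7)))
```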
<br />
Second, list all the (abc) triples for which the larger two differ by a number<br />
from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1)<br />
<br />
== Asymptotics ==<br />
<br />
DHJ(3) is equivalent to the upper bound<br />
<br />
:<math>c_n \leq o(3^n)</math><br />
<br />
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free. From this and the Behrend construction it appears that we have the lower bound<br />
<br />
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}.</math><br />
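<br />
The line-free claim can be tested by brute force for small n. In the sketch below the AP-free set S is produced by a greedy rule, which is just a convenient choice for illustration (it is not the Behrend construction), and only the values 0,...,2n of a+2b matter. The function names are hypothetical.<br />

```python
from itertools import product

ALPHABET = (1, 2, 3)

def greedy_ap_free(upper):
    """Greedy subset of {0,...,upper} with no 3-term arithmetic progression."""
    S = []
    for v in range(upper + 1):
        if not any(2 * y - x == v for x in S for y in S if y > x):
            S.append(v)
    return S

def build(n, S):
    """Union of the slices Gamma_{a,b,c} with a + 2b in S (a = #1s, b = #2s)."""
    S = set(S)
    return {p for p in product(ALPHABET, repeat=n)
            if p.count(1) + 2 * p.count(2) in S}

def is_line_free(A, n):
    """Check every combinatorial line of [3]^n (None marks a wildcard)."""
    for template in product(ALPHABET + (None,), repeat=n):
        if None not in template:
            continue
        if all(tuple(v if t is None else t for t in template) in A
               for v in ALPHABET):
            return False
    return True

n = 5
S = greedy_ap_free(2 * n)
A = build(n, S)
print(S, len(A), is_line_free(A, n))
```

The reason it works is visible in <code>build</code>: along a line with r wildcards the quantity a+2b takes the values a+2b, a+2b+r, a+2b+2r, a 3-term progression, so not all three can lie in S.<br />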
<br />
More precisely, we have<br />
<br />
:<math>c_n > C 3^{n - 4\sqrt{\log 2}\sqrt{\log n}+\frac 12 \log \log n}</math><br />
for some absolute constant C, and where all logarithms are base-3.<br />
<br />
'''Proof''' For convenience, let n be a multiple of 3. Elkin’s bound gives <math>r_3(\sqrt{n}) > C \sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n})</math>, and let <math>R</math> be a subset of <math>(-3\sqrt{n}/2,3\sqrt{n}/2)</math> without 3-term APs and with size <math>r_3(\sqrt{n})</math>, and with all elements being integer multiples of 3 (again as a matter of convenience). For each <math>r,s\in R</math>, let <math>a = (n-r-s)/3</math>. The set <math>A</math> is the union of all <math>\Gamma_{a,a+r,a+s}</math>. Since all of <math>a, a+r,a+s</math> are between <math>n/3-2\sqrt{n}</math> and <math>n/3+2\sqrt{n}</math>, the size of <math>\Gamma_{a,a+r,a+s}</math> is at least <math>C 3^n / n</math>. Since there are <math>r_3(\sqrt{n})^2</math> choices for r and s, we have a set with size at least<br />
<br />
:<math>C (\sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n}))^2 3^n / n</math>.<br />
<br />
This simplifies to <math>C \sqrt{\log n} \exp_3(n-\alpha \sqrt{\log_3(n)})</math>, where <math>\alpha=4 \sqrt{\log_3(2)}</math>.<br />
<br />
Now suppose that <math>x_i\in \Gamma_{a_i,a_i+r_i,a_i+s_i}</math> is a combinatorial line in the set A. Then <math>(a_i+s_i)-(a_i)=s_i</math> is a 3-term AP contained in R, so the <math>s_i</math> are all the same. Similarly, all of the <math>r_i</math> are the same, and therefore all of the <math>a_i</math> are the same, too. But this implies that the <math>x_i</math> sequence is constant, which means the line is degenerate. <math>\Box</math><br />
<br />
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above gives a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.<br />
<br />
The second large n construction had numerical asymptotics for <math>\log(c_n/3^n)</math> close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.<br />
<br />
== Other k values ==<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_n</math> values for different values of dimension n and edgelength k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{|<br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. So the density of deleted points in the optimal configuration is 1/k when k is prime.<br />
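<br />
For small cases the prime construction can be verified by brute force. The sketch below uses the alphabet {0,...,k-1} instead of {1,...,k}, which gives the same coordinate sums modulo k; the function names are hypothetical.<br />

```python
from itertools import product

def is_line_free(S, n, k):
    """Check every combinatorial line of {0,...,k-1}^n (None is the wildcard)."""
    alphabet = tuple(range(k))
    for template in product(alphabet + (None,), repeat=n):
        if None not in template:
            continue
        if all(tuple(v if t is None else t for t in template) in S
               for v in alphabet):
            return False
    return True

def sum_nonzero_set(n, k):
    """Points of {0,...,k-1}^n whose coordinate sum is not a multiple of k."""
    return {p for p in product(range(k), repeat=n) if sum(p) % k != 0}

for n, k in [(2, 3), (3, 3), (2, 5), (3, 5)]:
    S = sum_nonzero_set(n, k)
    print(n, k, len(S), (k - 1) * k ** (n - 1), is_line_free(S, n, k))
```

For composite k the same recipe can fail: with k=4 and n=3 the line xx1 has coordinate sums 1, 3, 5, 7, none of which is a multiple of 4.<br />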
<br />
Let p be the smallest prime greater than or equal to both k and n. One can remove all combinatorial lines by deleting all points whose coordinates sum to a value x (mod p) with <math>0\le x\le p-k</math>. So the density of deleted points is at most (p-k+1)/p. This approaches zero as <math>k\rightarrow\infty</math>. For example, the following paper shows that for all sufficiently large x there is a prime between <math>x-x^{0.525}</math> and x.<br />
<br />
Baker, R. C.(1-BYU); Harman, G.(4-LNDHB); Pintz, J.(H-AOS)<br />
The difference between consecutive primes. II.<br />
Proc. London Math. Soc. (3) 83 (2001), no. 3, 532–562.<br />
<br />
== Numerical methods ==<br />
<br />
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here]. The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Upper_and_lower_boundsUpper and lower bounds2009-03-26T05:21:22Z<p>121.220.134.232: /* Other k values */</p>
<hr />
<div><center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center><br />
<br />
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762]). A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here]. In this page we record the proofs justifying these bounds.<br />
<br />
<br />
{|<br />
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]<br />
|}<br />
<br />
== Basic constructions ==<br />
<br />
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is<br />
<br />
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)<br />
<br />
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>. The only lines in <math>D_n</math> are those with<br />
<br />
# A number of wildcards equal to a multiple of three;<br />
# The number of 1s unequal to the number of 2s modulo 3.<br />
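<br />
Both the cardinality of <math>D_n</math> and the description of its lines can be confirmed by enumeration for small n. This is a sketch with hypothetical function names; <code>None</code> marks a wildcard position.<br />

```python
from itertools import product

ALPHABET = (1, 2, 3)

def D(n, j=0):
    """Points of [3]^n whose coordinate sum is not congruent to j mod 3."""
    return {p for p in product(ALPHABET, repeat=n) if sum(p) % 3 != j}

def lines_inside(S, n):
    """Templates of all combinatorial lines entirely contained in S."""
    found = []
    for template in product(ALPHABET + (None,), repeat=n):
        if None not in template:
            continue
        line = [tuple(v if t is None else t for t in template) for v in ALPHABET]
        if all(p in S for p in line):
            found.append(template)
    return found

for n in (3, 4):
    S = D(n)
    assert len(S) == 2 * 3 ** (n - 1)
    for template in lines_inside(S, n):
        # Wildcard count is a multiple of 3 ...
        assert template.count(None) % 3 == 0
        # ... and the fixed part has #1s != #2s modulo 3.
        assert (template.count(1) - template.count(2)) % 3 != 0
    print(n, len(S), len(lines_inside(S, n)))
```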
<br />
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points. We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as<br />
<br />
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')<br />
<br />
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).<br />
<br />
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid<br />
<br />
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \},</math> (2)<br />
<br />
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s. Note that<br />
<br />
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)<br />
<br />
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set<br />
<br />
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)<br />
<br />
is line-free and has cardinality<br />
<br />
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)<br />
<br />
and thus provides a lower bound for <math>c_n</math>:<br />
<br />
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)<br />
<br />
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set B and applying (6). Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.<br />
<br />
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)<br />
<br />
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal). But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.<br />
<br />
== The basic upper bound ==<br />
<br />
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound<br />
<br />
:<math>c_{n+1} \leq 3 c_n.</math> (7)<br />
<br />
Note that equality only occurs if one can find an <math>n+1</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.<br />
<br />
== n=0 ==<br />
<br />
:<math>c_0=1</math>:<br />
<br />
This is clear.<br />
<br />
== n=1 ==<br />
<br />
:<math>c_1=2</math>:<br />
<br />
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.<br />
<br />
== n=2 ==<br />
<br />
:<math>c_2=6</math>:<br />
<br />
There are four six-element sets in <math>[3]^2</math> which are line-free, which we denote <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math> and are displayed graphically as follows.<br />
<br />
13 .. 33 .. 23 33 13 23 .. 13 23 ..<br />
x = 12 22 .. y = 12 .. 32 z = .. 22 32 w = 12 .. 32<br />
.. 21 31 11 21 .. 11 .. 31 .. 21 31<br />
<br />
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.<br />
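<br />
Since <math>[3]^2</math> has only nine points, the claim about six-element line-free sets can be checked exhaustively. A sketch with hypothetical names:<br />

```python
from itertools import combinations, product

ALPHABET = (1, 2, 3)
POINTS = list(product(ALPHABET, repeat=2))

# All combinatorial lines of [3]^2; None marks a wildcard position.
LINES = [
    [tuple(v if t is None else t for t in template) for v in ALPHABET]
    for template in product(ALPHABET + (None,), repeat=2)
    if None in template
]

def is_line_free(S):
    return not any(all(p in S for p in line) for line in LINES)

six = [set(c) for c in combinations(POINTS, 6) if is_line_free(set(c))]
seven = [set(c) for c in combinations(POINTS, 7) if is_line_free(set(c))]

for S in six:
    print("omits", sorted(set(POINTS) - S))
print(len(six), "six-element sets,", len(seven), "seven-element sets")
```

Each solution omits one point from every row and column and at least one point of the diagonal, which is why exactly the four permutation patterns with a fixed point appear.<br />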
<br />
== n=3 ==<br />
<br />
:<math>c_3=18</math>:<br />
<br />
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.<br />
<br />
'''Lemma 1.'''<br />
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.<br />
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,1}=zxy</math>.<br />
<br />
'''Proof'''. We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, contradiction. By symmetry we may now assume that two of the slices are x and y, which force the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed. <br />
<br />
The first claim follows by a similar argument to the second.<br />
<math>\Box</math><br />
<br />
== n=4 ==<br />
<br />
:<math>c_4=52</math>:<br />
<br />
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is <math>18+17+17=52</math>, which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ zxy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yzx</math>. In fact,<br />
<br />
'''Lemma 2.'''<br />
<br />
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 50-element line-free sets in <math>[3]^4</math> are either formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2, or are equal to one of the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.<br />
<br />
'''Proof''' It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. The same argument applies when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math>.<br />
<br />
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.<br />
<br />
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.<br />
<br />
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.<br />
<br />
Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1331} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3331} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{3,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=12. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.<br />
<br />
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when all slicings of the 50-point set are of the form 18+16+16 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+16+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices. In particular, on the remaining portion <math>\{2,3\}^4</math> of the cube, there are exactly 6 points of the 50-point set in <math>\{2,3\}^4</math>.<br />
<br />
Suppose that 3333 was in the set; then since all permutations of 3311, 3331 are known to lie in the set, then 3322, 3332 must lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lie outside the set. This leaves only 5 points in <math>\{2,3\}^4</math>, a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.<br />
<br />
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>. If a=0 then the set lies in <math>D_{4,1}</math> and we are done. If a=6 then the set is exactly X and we are done. Now suppose a=1,2,3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233, 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<br />
The remaining case is when a=4,5. Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lies in the set. By symmetry we may assume that {2233, 3322} lies in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<math>\Box</math><br />
<br />
== n=5 ==<br />
<br />
:<math>c_5=150</math>:<br />
<br />
'''Lemma 3'''. Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.<br />
<br />
'''Proof'''. By rotation we may work with <math>D_5</math>. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math><br />
<br />
Equality can be attained by removing <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>. Thus <math>c_5 \geq 150</math>.<br />
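<br />
This 150-point example is small enough to verify directly. The sketch below (hypothetical names) builds <math>D_5</math>, drops the four named slices, and checks all combinatorial lines of <math>[3]^5</math>.<br />

```python
from itertools import product

ALPHABET = (1, 2, 3)
# Slices Gamma_{a,b,c} to delete, with (a,b,c) = (#1s, #2s, #3s).
REMOVED = {(0, 4, 1), (0, 5, 0), (4, 0, 1), (5, 0, 0)}

def counts(p):
    return (p.count(1), p.count(2), p.count(3))

def build(n=5):
    """D_n (coordinate sum not a multiple of 3) minus the slices in REMOVED."""
    return {p for p in product(ALPHABET, repeat=n)
            if sum(p) % 3 != 0 and counts(p) not in REMOVED}

def is_line_free(S, n):
    """Check every combinatorial line of [3]^n (None marks a wildcard)."""
    for template in product(ALPHABET + (None,), repeat=n):
        if None not in template:
            continue
        if all(tuple(v if t is None else t for t in template) in S
               for v in ALPHABET):
            return False
    return True

A = build()
print(len(A), is_line_free(A, 5))
```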
<br />
Another pattern of 150 points is this: Take the 450 points<br />
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,<br />
then select the 150 whose final coordinate is 1. That gives<br />
this many points in each cube:<br />
<br />
17 18 17<br />
<br />
17 17 18<br />
<br />
12 17 17<br />
<br />
'''Lemma 4'''. A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contains at least 51 points.<br />
<br />
'''Proof'''. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which forces the third slice to have at least <math>151-2c_4 = 47</math> points.<br />
<br />
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.<br />
<br />
Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.<br />
<br />
Let's suppose first that jk=10. The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.<br />
<br />
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.<br />
<br />
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. <br />
<br />
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math><br />
<br />
'''Corollary'''. <math>c_5 \leq 152</math><br />
<br />
'''Proof'''. By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math><br />
<br />
<br />
'''Lemma 5''' Any solution with 151 or more points has a slice with at most 49 points.<br />
<br />
'''Proof''' Suppose we have 151 points without a line, and each of three slices has at least 50 points.<br />
<br />
Using earlier notation, we split subsets of <math>[3]^4</math> into nine subsets of <math>[3]^2</math>. <br />
So we think of x,y,z,a,b and c as subsets of a square. Each slice is one of the following.<br />
*<math>D_4 = y'zx,zx'y,xyz</math> (with one or two points removed)<br />
*<math>D_{4,2} = z'xy,xyz,yzx'</math> (with one or two points removed)<br />
*<math>D_{4,1} = xyz,yz'x,zxy'</math> (with one or two points removed)<br />
*<math>X = xyz, ybw, zwc</math><br />
*<math>Y = axw, xyz, wzc</math><br />
*<math>Z = awx, wby, xyz</math><br />
<br />
where a, b and c have four points each.<br />
<br />
.. 32 33 31 .. 33 .. .. ..<br />
a = .. 22 23 b = .. .. .. c = 21 22 ..<br />
.. .. .. 11 .. 13 11 12 ..<br />
<br />
x', y' and z' are subsets of x, y and z respectively, and have five points each.<br />
<br />
Suppose all three slices are subsets of <math>D_{4,j}</math>. <br />
We can remove at most five points from the full set of three <math>D_{4,j}</math>. <br />
Consider columns 2,3,4,6,7,8. At most two of these columns contain xyz, so one point must be removed from the other four.<br />
This uses up all but one of the removals.<br />
So the slices must be <math>D_{4,2},D_{4,1},D_{4,0}</math> or a cyclic permutation of that.<br />
Then the cube, which contains the first square of slice 1; the fifth square of slice 2; <br />
and the ninth square of slice 3, contains three copies of the same square. <br />
It takes more than one point removed to remove all lines from that cube.<br />
So we can't have all three slices subsets of <math>D_{4,j}</math>.<br />
<br />
Suppose one slice is X,Y or Z, and two others are subsets of <math>D_{4,j}</math>. <br />
We can remove at most three points from the full <math>D_{4,j}</math>.<br />
By symmetry, suppose one slice is X. Consider columns 2,3,4 and 7. They must be cyclic permutations of x,y,z,<br />
and two of them are not xyz, so must lose a point. <br />
Columns 6 and 8 must both lose a point, and we only have 150 points left.<br />
So if one slice is X,Y or Z, the full set contains a line.<br />
<br />
Suppose two slices are from X,Y and Z, and the other is a subset of <math>D_{4,j}</math>. <br />
By symmetry, suppose two slices are X and Y. Columns 3,6,7 and 8 all contain w, and therefore at most 16 points each.<br />
Columns 1,5 and 9 contain a,b, or c, and therefore at most 16 points. <br />
So the total number of points is at most 7*16+2*18 = 148. This contradicts the assumption of 151 points.<br />
<math>\Box</math><br />
<br />
'''Corollary''' <math>c_5 \leq 151 </math><br />
<br />
'''Proof''' By Lemmas 2, 4 and 5, the maximum number of points is 52+50+49=151. <math>\Box</math><br />
<br />
'''Lemma 5.1''' No solution with 151 points contains as a slice the X defined in Lemma 2<br />
<br />
'''Proof''' Suppose one row is X. Another row is <math>D_{4,j}</math>.<br />
<br />
Suppose X is in the first row. Label the other rows with letters from the alphabet.<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
def ghi jkl<br />
<br />
Reslice the array into a left nine, middle nine and right nine. One of these squares<br />
contains 52 points, and it can only be the left nine. One of its three columns contains<br />
18 points, and it can only be its left-hand column, xmd. So m=y and d=z. But none of the <math>D_{4,j}</math> begins with y or z, which is a contradiction. So X is not in the first row.<br />
<br />
So X is in the second or third row. By symmetry, suppose it is in the second row<br />
<br />
def ghi jkl<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
Again, the left-hand nine must contain 52 points, so it is <math>D_{4,2}</math>.<br />
So either the first row is <math>D_{4,2}</math> or the third row is <math>D_{4,0}</math>.<br />
If the first row is <math>D_{4,2}</math> then the only way to have 50 points in the middle or right-hand nine is if the middle nine is X<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz ybw zwc<br />
<br />
yzx' zwc stu<br />
<br />
In the seventh column, s contains 5 points and in the eighth column, t contains 4 points.<br />
The final row can now contain at most 48 points, and the whole array contains only 52+50+48 = 150 points.<br />
<br />
If the third row is <math>D_{4,0}</math>, then neither the middle nine nor the right-hand nine contains 50 points, by the classification of Lemma 4 and the formulas at the start of Lemma 5.<br />
Again, only 52+49+49 = 150 points are possible.<br />
<br />
A similar argument is possible if X is in the third row; or if X is replaced by Y or Z.<br />
<br />
So when a 151-point set is sliced into three, one slice is <math>D_{4,j}</math> and another slice is 50 points contained in <math>D_{4,k}</math>. <math>\Box</math><br />
<br />
'''Lemma 5.2''' There is no 151-point solution<br />
<br />
'''Proof''' Assume by symmetry that the first row contains 52 points and the second row contains 50.<br />
<br />
If <math>D_{4,1}</math> is in the first row, then the second row must be contained in <math>D_{4,0}</math>. <br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine can contain 52 points, which contradicts the corollary to Lemma 5.<br />
<br />
Suppose the first row contains <math>D_{4,0}</math>. Then the second row is contained in <math>D_{4,2}</math>, otherwise the cubes formed from the nine columns of the diagram would need to remove too many points.<br />
<br />
y'zx zx'y xyz<br />
<br />
z'xy xyz yzx'<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine contains 52 points.<br />
<br />
So the first row contains <math>D_{4,2}</math>, and the second row is contained in <math>D_{4,1}</math>. Two points may be removed from the second row of this diagram.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
def ghi jkl<br />
<br />
Slice it into the left nine, middle nine and right nine. Two of them are contained in <math>D_{4,j}</math><br />
so at least two of def, ghi, and jkl are contained in the corresponding slice of <math>D_{4,0}</math>.<br />
Slice along a different axis, and at least two of dgj,ehk,fil are contained in the corresponding slice of <br />
<math>D_{4,0}</math>. <br />
So eight of the nine squares in the bottom row are contained in the corresponding square of <math>D_{4,0}</math>.<br />
Indeed, slice along other axes, and all points except one are contained within <math>D_{4,0}</math>. <br />
This point is the intersection of all the 49-point slices. <br />
<br />
So, if there is a 151-point solution, then after removal of the specified point, <br />
there is a 150-point solution, within <math>D_{5,j}</math>, whose slices in each direction are 52+50+48.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
One point must be lost from columns 3, 6, 7 and 8, and four more from the major diagonal z'z'z. That leaves 148 points instead of 150.<br />
<br />
So the 150-point solution does not exist with 52+50+48 slices; so the 151-point solution does not exist. <math>\Box</math><br />
<br />
<br />
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.<br />
<br />
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extremisers, with one point per line and different extremisers separated by a line containing “—”.<br />
<br />
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by GNU’s glpsol linear programming solver, which also quickly proves that 150 is the optimum.<br />
<br />
Each variable corresponds to a point in the cube, and the points are numbered according to their lexicographic ordering. If a variable is 1 then the point is in the set; if it is 0 then it is not in the set.<br />
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.<br />
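<br />
The constraint system just described can be sketched in Python. This is only an illustrative reconstruction, not the contents of the linked glpsol file: the function names are ours, points are indexed from 0 in lexicographic order, and each combinatorial line contributes the inequality that its three variables sum to at most 2. As a sanity check, the sketch tests whether the 150-point set <math>D_5</math> minus <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> satisfies every constraint.<br />
<br />
```python
from itertools import product


def combinatorial_lines(n, k=3):
    """Yield every combinatorial line of [k]^n as a tuple of k points."""
    alphabet = tuple(range(1, k + 1))
    for template in product(alphabet + ("x",), repeat=n):
        if "x" in template:  # a line needs at least one wildcard
            yield tuple(
                tuple(v if t == "x" else t for t in template) for v in alphabet
            )


def line_constraints(n, k=3):
    """Return the points (lexicographic order) and one index tuple per line.

    The constraint attached to an index tuple c is sum(x[i] for i in c) <= k - 1.
    """
    points = list(product(range(1, k + 1), repeat=n))
    index = {p: i for i, p in enumerate(points)}
    constraints = [tuple(index[p] for p in line) for line in combinatorial_lines(n, k)]
    return points, constraints


def candidate_150(point):
    """0/1 indicator of D_5 minus the slices (0,4,1), (0,5,0), (4,0,1), (5,0,0)."""
    counts = tuple(point.count(d) for d in (1, 2, 3))
    removed = {(0, 4, 1), (0, 5, 0), (4, 0, 1), (5, 0, 0)}
    return int(sum(point) % 3 != 0 and counts not in removed)


points, constraints = line_constraints(5)
x = [candidate_150(p) for p in points]
feasible = all(sum(x[i] for i in c) <= 2 for c in constraints)
print(len(constraints), sum(x), feasible)
```
<br />
There are <math>4^5-3^5=781</math> combinatorial lines in <math>[3]^5</math>, so the program has 781 inequalities in 243 binary variables; the candidate above is feasible with objective value 150.<br />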
<br />
== n=6 ==<br />
<br />
:<math>c_6=450</math>:<br />
<br />
The upper bound follows since <math>c_6 \leq 3 c_5</math>. The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).<br />
<br />
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.<br />
<br />
== n=7 ==<br />
<br />
:<math>1302 \leq c_7 \leq 1348</math>:<br />
<br />
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>. Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.<br />
<br />
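The lower bound can be formed by removing 016,106,052,502,151,511,160,610,070,700 from <math>D_7</math>. (The slices 070 and 700 must be included, since otherwise the triangles 340,070,043 and 700,430,403 survive.) These ten slices contain 7+7+21+21+42+42+7+7+1+1=156 points, leaving 1458-156=1302 points.<br />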
<br />
'''Lemma 6''' Any line-free subset of <math>D_7</math> has at most 1302 points.<br />
<br />
'''Proof''' Start with the 1458 points of <math>D_7</math>. You must lose:<br />
<br />
* 42 points from (1,2,4),(1,5,1),(4,2,1)<br />
* 42 points from (2,1,4),(2,4,1),(5,1,1)<br />
* 21 points from (0,2,5),(0,5,2),(3,2,2)<br />
* 21 points from (2,0,5),(2,3,2),(5,0,2)<br />
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)<br />
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)<br />
<br />
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.<br />
<math>\Box</math><br />
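<br />
The bookkeeping in Lemma 6 can be checked at the level of whole slices with the following Python sketch. It is only a sanity check under that restriction (the lemma itself concerns arbitrary subsets of <math>D_7</math>), and the helper names are ours. For each group of slices it finds the triangles inside the group and the cheapest collection of slices whose removal breaks all of them.<br />
<br />
```python
from itertools import combinations
from math import factorial


def slice_size(t):
    """Cardinality n!/(a! b! c!) of the slice Gamma_{a,b,c}."""
    a, b, c = t
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))


def triangles(group):
    """All triangles (a+r,b,c), (a,b+r,c), (a,b,c+r) with r > 0 inside the group."""
    members = set(group)
    n = sum(group[0])
    found = []
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                tri = {(a + r, b, c), (a, b + r, c), (a, b, c + r)}
                if tri <= members:
                    found.append(tri)
    return found


def cheapest_hitting_set(group):
    """Least number of points lost when deleting whole slices to break every triangle."""
    tris = triangles(group)
    best = None
    for size in range(len(group) + 1):
        for chosen in combinations(group, size):
            chosen_set = set(chosen)
            if all(tri & chosen_set for tri in tris):
                cost = sum(slice_size(t) for t in chosen)
                if best is None or cost < best:
                    best = cost
    return best


groups = [
    [(1, 2, 4), (1, 5, 1), (4, 2, 1)],
    [(2, 1, 4), (2, 4, 1), (5, 1, 1)],
    [(0, 2, 5), (0, 5, 2), (3, 2, 2)],
    [(2, 0, 5), (2, 3, 2), (5, 0, 2)],
    [(0, 1, 6), (0, 4, 3), (3, 1, 3), (0, 7, 0), (3, 4, 0), (6, 1, 0)],
    [(1, 0, 6), (1, 3, 3), (4, 0, 3), (7, 0, 0), (4, 3, 0), (1, 6, 0)],
]
losses = [cheapest_hitting_set(g) for g in groups]
remaining = 2 * 3**6 - sum(losses)
print(losses, remaining)
```
<br />
The six groups are disjoint, so the losses add up: 42+42+21+21+15+15=156, and 1458-156=1302.<br />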
<br />
== Larger n ==<br />
<br />
The following construction gives lower bounds for the number of triangle-free points.<br />
There are of the order <math>2.7 \sqrt{\log(N)/N}3^N</math> points for large N (N ~ 5000).<br />
<br />
It applies when N is a multiple of 3. <br />
* For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M. <br />
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth of the points for N=3M-2 as for N=3M.<br />
<br />
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:<br />
<br />
* <math>c_3</math> from (012) and permutations<br />
* <math>c_6</math> from (123,024) and perms<br />
* <math>c_9</math> from (234,135,045) and perms<br />
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)<br />
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)<br />
<br />
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.<br />
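<br />
The following Python sketch evaluates the lower bounds produced by the rows listed above. The table of triples is copied from the list; the helper names are ours. It also checks the "add 1 to the previous row" rule and reports, for each row, whether the resulting set of triples is triangle-free.<br />
<br />
```python
from itertools import permutations
from math import factorial


def slice_size(t):
    """Cardinality n!/(a! b! c!) of the slice Gamma_{a,b,c}."""
    a, b, c = t
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))


def expand(triples):
    """All distinct permutations of the listed triples."""
    return {p for t in triples for p in permutations(t)}


def is_triangle_free(B):
    """True if B contains no triangle (a+r,b,c), (a,b+r,c), (a,b,c+r) with r > 0."""
    n = sum(next(iter(B)))
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                if {(a + r, b, c), (a, b + r, c), (a, b, c + r)} <= B:
                    return False
    return True


rows = {
    3: [(0, 1, 2)],
    6: [(1, 2, 3), (0, 2, 4)],
    9: [(2, 3, 4), (1, 3, 5), (0, 4, 5)],
    12: [(3, 4, 5), (2, 4, 6), (1, 5, 6), (0, 2, 10), (0, 5, 7)],
    15: [(4, 5, 6), (3, 5, 7), (2, 6, 7), (1, 3, 11), (1, 6, 8), (0, 4, 11), (0, 7, 8)],
}

# Each row starts with the previous row shifted by (1,1,1).
rule_ok = all(
    rows[n][: len(rows[n - 3])] == [tuple(v + 1 for v in t) for t in rows[n - 3]]
    for n in (6, 9, 12, 15)
)

bounds = {}
for n, triples in rows.items():
    B = expand(triples)
    bounds[n] = (sum(slice_size(t) for t in B), is_triangle_free(B))
    print(n, bounds[n])
```
<br />
For n=3, 6, 9 this gives 18, 450 and 11340 points respectively.<br />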
<br />
A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)<br />
<br />
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:<br />
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1) <br />
* There are also eight strings of points, stretching to the edges of the (abc) triangle:<br />
** For N=6K = 3M<br />
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)<br />
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)<br />
<br />
<br />
An alternate construction:<br />
<br />
First define a sequence consisting of all non-negative integers which, in base 3, do not contain a 1. Add 1 to all multiples of 3 in this sequence (including 0). The resulting sequence does not contain a length-3 arithmetic progression.<br />
<br />
It starts 1,2,7,8,19,20,25,26,55, …<br />
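<br />
A short Python sketch of this sequence follows. It assumes the list of base-3 numbers starts from 0, so that the first term is 0+1=1 as in the values quoted above; the function names are ours.<br />
<br />
```python
from itertools import combinations, count


def ap_free_sequence(length):
    """First `length` terms: integers with no digit 1 in base 3, multiples of 3 bumped by 1."""
    terms = []
    for m in count(0):
        digits = []
        q = m
        while q:
            q, d = divmod(q, 3)
            digits.append(d)
        if 1 in digits:
            continue  # skip numbers whose base-3 expansion contains a 1
        terms.append(m + 1 if m % 3 == 0 else m)
        if len(terms) == length:
            return terms


def has_three_term_ap(values):
    """True if some a < b < c in values satisfy c - b == b - a."""
    present = set(values)
    return any(2 * b - a in present for a, b in combinations(sorted(values), 2))


first_terms = ap_free_sequence(9)
print(first_terms)
print(has_three_term_ap(ap_free_sequence(64)))
```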
<br />
Second, list all the (abc) triples for which the larger two entries differ by a number
from the sequence, excluding the case when the smaller two entries differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1).<br />
<br />
== Asymptotics ==<br />
<br />
DHJ(3) is equivalent to the upper bound<br />
<br />
:<math>c_n \leq o(3^n)</math><br />
<br />
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free. From this and the Behrend construction it appears that we have the lower bound<br />
<br />
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}.</math><br />
<br />
More precisely, we have<br />
<br />
:<math>c_n > C 3^{n - 4\sqrt{\log 2}\sqrt{\log n}+\frac 12 \log \log n}</math><br />
for some absolute constant C, and where all logarithms are base-3.<br />
<br />
'''Proof''' For convenience, let n be a multiple of 3. Elkin’s bound gives <math>r_3(\sqrt{n}) > C \sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n})</math>, and let <math>R</math> be a subset of <math>(-3\sqrt{n}/2,3\sqrt{n}/2)</math> without 3-term APs and with size <math>r_3(\sqrt{n})</math>, and with all elements being integer multiples of 3 (again as a matter of convenience). For each <math>r,s\in R</math>, let <math>a = (n-r-s)/3</math>. The set <math>A</math> is the union of all <math>\Gamma_{a,a+r,a+s}</math>. Since all of <math>a, a+r,a+s</math> are between <math>n/3-2\sqrt{n}</math> and <math>n/3+2\sqrt{n}</math>, the size of <math>\Gamma_{a,a+r,a+s}</math> is at least <math>C 3^n / n</math>. Since there are <math>r_3(\sqrt{n})^2</math> choices for r and s, we have a set with size at least<br />
<br />
:<math>C (\sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n}))^2 3^n / n</math>.<br />
<br />
This simplifies to <math>C \sqrt{\log n} \exp_3(n-\alpha \sqrt{\log_3(n)})</math>, where <math>\alpha=4 \sqrt{\log_3(2)}</math>.<br />
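<br />
The change of base in this simplification can be checked numerically. The sketch below (variable names are ours) compares the base-3 logarithm of <math>\exp_2(-4\sqrt{\log_2 n})</math> with <math>-\alpha\sqrt{\log_3 n}</math> for a few values of n; the two agree because <math>\log_2 n = \log_3 n / \log_3 2</math>.<br />
<br />
```python
from math import isclose, log, sqrt


def log_base(x, base):
    """Logarithm of x to the given base."""
    return log(x) / log(base)


alpha = 4 * sqrt(log_base(2, 3))

checks = []
for n in (10**3, 10**6, 10**9):
    # log_3 of exp_2(-4 sqrt(log_2 n)), i.e. the exponent rewritten in base 3
    lhs = -4 * sqrt(log_base(n, 2)) * log_base(2, 3)
    # the exponent claimed in the simplified form
    rhs = -alpha * sqrt(log_base(n, 3))
    checks.append((n, lhs, rhs))
    print(n, lhs, rhs)
```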
<br />
Now suppose that <math>x_i\in \Gamma_{a_i,a_i+r_i,a_i+s_i}</math> is a combinatorial line in the set A. Then <math>(a_i+s_i)-(a_i)=s_i</math> is a 3-term AP contained in R, so the <math>s_i</math> are all the same. Similarly, all of the <math>r_i</math> are the same, and therefore all of the <math>a_i</math> are the same, too. But this implies that the <math>x_i</math> sequence is constant, which means the line is degenerate. <math>\Box</math><br />
<br />
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above gives a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.<br />
<br />
The second large n construction had numerical asymptotics for <math>\log(c_n/3^n)</math> close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.<br />
<br />
== Other k values ==<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_{n,k}</math> values for different values of dimension n and edgelength k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{|<br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. This matches the upper bound <math>(k-1)k^{n-1}</math>, so the density of deleted points in the optimal configuration is 1/k when k is prime and <math>k \ge n</math>.<br />
<br />
Let p be the smallest prime greater than or equal to max(k,n). One can remove all combinatorial lines by deleting all points whose coordinates sum to x (mod p) for some <math>0\le x\le p-k</math>. So the density of deleted points is at most (p-k+1)/p. This approaches zero as <math>k\rightarrow\infty</math>. For example, the following paper shows that for all sufficiently large x there is a prime between <math>x-x^{0.525}</math> and x.<br />
<br />
Baker, R. C.(1-BYU); Harman, G.(4-LNDHB); Pintz, J.(H-AOS)<br />
The difference between consecutive primes. II.<br />
Proc. London Math. Soc. (3) 83 (2001), no. 3, 532–562.<br />
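<br />
The residue-deletion construction above can be tested on small cases with the following Python sketch. The function names are ours, and the alphabet is taken to be {1,...,k} with coordinates summed as integers before reducing mod p.<br />
<br />
```python
from itertools import product


def is_prime(m):
    """Trial-division primality test, adequate for the small values used here."""
    return m >= 2 and all(m % d for d in range(2, int(m**0.5) + 1))


def smallest_prime_at_least(m):
    while not is_prime(m):
        m += 1
    return m


def combinatorial_lines(n, k):
    """Yield every combinatorial line of [k]^n as a tuple of k points."""
    alphabet = tuple(range(1, k + 1))
    for template in product(alphabet + ("x",), repeat=n):
        if "x" in template:
            yield tuple(
                tuple(v if t == "x" else t for t in template) for v in alphabet
            )


def surviving_points(n, k):
    """Points of [k]^n left after deleting coordinate sums in {0,...,p-k} mod p."""
    p = smallest_prime_at_least(max(k, n))
    deleted = set(range(p - k + 1))
    return {pt for pt in product(range(1, k + 1), repeat=n) if sum(pt) % p not in deleted}


def is_line_free(points, n, k):
    return not any(all(pt in points for pt in line) for line in combinatorial_lines(n, k))


cases = [(2, 3), (3, 3), (4, 3), (3, 4), (3, 5), (4, 5), (3, 6), (2, 7)]
results = {}
for n, k in cases:
    kept = surviving_points(n, k)
    results[(n, k)] = (len(kept), is_line_free(kept, n, k))
    print(n, k, results[(n, k)])
```
<br />
When k is prime and at least n, the surviving set has exactly <math>(k-1)k^{n-1}</math> points, for instance 500 points for (n,k)=(4,5).<br />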
<br />
== Numerical methods ==<br />
<br />
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here]. The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Upper_and_lower_boundsUpper and lower bounds2009-03-25T17:39:24Z<p>121.220.134.232: /* Other k values */</p>
<hr />
<div><center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center><br />
<br />
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762]). A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here]. In this page we record the proofs justifying these bounds.<br />
<br />
<br />
{|<br />
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]<br />
|}<br />
<br />
== Basic constructions ==<br />
<br />
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is<br />
<br />
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)<br />
<br />
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>. The only lines in <math>D_n</math> are those with<br />
<br />
# A number of wildcards equal to a multiple of three;<br />
# The number of 1s unequal to the number of 2s modulo 3.<br />
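<br />
The description of the lines inside <math>D_n</math> can be confirmed by brute force for small n. In the Python sketch below (helper names are ours) a line is represented by its template, with "x" marking the wildcard positions, and the two conditions are checked against the fixed coordinates of the template.<br />
<br />
```python
from itertools import product


def D(n, j=0):
    """The set D_{n,j}: strings in [3]^n whose coordinate sum is not j mod 3."""
    return {p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != j}


def lines_inside(points, n):
    """Templates of all combinatorial lines of [3]^n contained in `points`."""
    found = []
    for template in product((1, 2, 3, "x"), repeat=n):
        if "x" in template and all(
            tuple(v if t == "x" else t for t in template) in points for v in (1, 2, 3)
        ):
            found.append(template)
    return found


def satisfies_conditions(template):
    """Wildcard count divisible by 3, and #1s differs from #2s mod 3."""
    return template.count("x") % 3 == 0 and (template.count(1) - template.count(2)) % 3 != 0


line_counts = {}
classification_ok = True
for n in range(1, 7):
    Dn = D(n)
    lines = lines_inside(Dn, n)
    predicted = [
        t for t in product((1, 2, 3, "x"), repeat=n) if "x" in t and satisfies_conditions(t)
    ]
    classification_ok = classification_ok and sorted(lines, key=str) == sorted(predicted, key=str)
    line_counts[n] = len(lines)
    print(n, len(Dn), len(lines))
```
<br />
In particular <math>D_1, D_2, D_3</math> contain no lines at all, while <math>D_4</math> contains the eight lines with three wildcards and a fixed coordinate equal to 1 or 2.<br />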
<br />
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points. We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as<br />
<br />
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')<br />
<br />
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).<br />
<br />
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid<br />
<br />
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \},</math> (2)<br />
<br />
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s. Note that<br />
<br />
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)<br />
<br />
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set<br />
<br />
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)<br />
<br />
is line-free and has cardinality<br />
<br />
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)<br />
<br />
and thus provides a lower bound for <math>c_n</math>:<br />
<br />
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)<br />
<br />
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set B and applying (6). Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.<br />
<br />
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)<br />
<br />
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal). But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.<br />
<br />
== The basic upper bound ==<br />
<br />
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound<br />
<br />
:<math>c_{n+1} \leq 3 c_n.</math> (7)<br />
<br />
Note that equality only occurs if one can find an <math>(n+1)</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.<br />
<br />
== n=0 ==<br />
<br />
:<math>c_0=1</math>:<br />
<br />
This is clear.<br />
<br />
== n=1 ==<br />
<br />
:<math>c_1=2</math>:<br />
<br />
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.<br />
<br />
== n=2 ==<br />
<br />
:<math>c_2=6</math>:<br />
<br />
There are four six-element sets in <math>[3]^2</math> which are line-free, which we denote <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math> and are displayed graphically as follows.<br />
<br />
13 .. 33 .. 23 33 13 23 .. 13 23 ..<br />
x = 12 22 .. y = 12 .. 32 z = .. 22 32 w = 12 .. 32<br />
.. 21 31 11 21 .. 11 .. 31 .. 21 31<br />
<br />
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.<br />
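<br />
The case n=2 is small enough to check exhaustively. The Python sketch below (names are ours) enumerates the seven combinatorial lines of <math>[3]^2</math> and tests every 6-element and 7-element subset.<br />
<br />
```python
from itertools import combinations, product

points = list(product((1, 2, 3), repeat=2))

# Every template with at least one wildcard "x" gives one combinatorial line.
lines = [
    tuple(tuple(v if t == "x" else t for t in template) for v in (1, 2, 3))
    for template in product((1, 2, 3, "x"), repeat=2)
    if "x" in template
]


def is_line_free(subset):
    chosen = set(subset)
    return not any(all(p in chosen for p in line) for line in lines)


six = [s for s in combinations(points, 6) if is_line_free(s)]
seven = [s for s in combinations(points, 7) if is_line_free(s)]
print(len(lines), len(six), len(seven))
```
<br />
The four 6-element sets found are x, y, z and w, and no 7-element subset is line-free.<br />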
<br />
== n=3 ==<br />
<br />
:<math>c_3=18</math>:<br />
<br />
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.<br />
<br />
'''Lemma 1.'''<br />
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.<br />
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,1}=zxy</math>.<br />
<br />
'''Proof'''. We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, contradiction. By symmetry we may now assume that two of the slices are x and y, which force the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed. <br />
<br />
The first claim follows by a similar argument to the second.<br />
<math>\Box</math><br />
<br />
== n=4 ==<br />
<br />
:<math>c_4=52</math>:<br />
<br />
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is <math>18+17+17=52</math> which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ zxy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yzx</math>. In fact,<br />
<br />
'''Lemma 2.'''<br />
<br />
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 50-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2, or are equal to one of the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.<br />
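<br />
The sets named in Lemma 2 can be examined directly. The Python sketch below (helper names are ours) builds X and the three sets <math>D_{4,j}</math> with the diagonal removed, checks that each is line-free, and counts how many distinct sets arise from X under the six permutations of the alphabet. It does not verify the uniqueness claims of the lemma.<br />
<br />
```python
from itertools import permutations, product

cube = list(product((1, 2, 3), repeat=4))
X_SLICES = {(3, 1, 0), (3, 0, 1), (2, 2, 0), (2, 0, 2), (1, 1, 2), (1, 2, 1), (0, 2, 2)}
X = frozenset(p for p in cube if tuple(p.count(d) for d in (1, 2, 3)) in X_SLICES)


def is_line_free(points):
    """True if no combinatorial line of [3]^4 lies inside `points`."""
    for template in product((1, 2, 3, "x"), repeat=4):
        if "x" in template and all(
            tuple(v if t == "x" else t for t in template) in points for v in (1, 2, 3)
        ):
            return False
    return True


def relabel(points, perm):
    """Apply an alphabet permutation: symbol d is replaced by perm[d - 1]."""
    return frozenset(tuple(perm[d - 1] for d in p) for p in points)


images = {relabel(X, perm) for perm in permutations((1, 2, 3))}

diagonal = {(d,) * 4 for d in (1, 2, 3)}
extremal = {}
for j in range(3):
    S = {p for p in cube if sum(p) % 3 != j} - diagonal
    extremal[j] = (len(S), is_line_free(S))

print(len(X), is_line_free(X), len(images), extremal)
```
<br />
X is invariant under swapping the symbols 2 and 3, which is why only three of its six relabellings are distinct.<br />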
<br />
'''Proof''' It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. Similarly when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math>.<br />
<br />
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points, and that two 18-point slices would force the third slice to have at most 6 points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 18+17+15, 17+17+16, or some permutation of these.<br />
<br />
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.<br />
<br />
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.<br />
<br />
Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1331} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3331} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{3,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=21. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.<br />
<br />
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when every slicing of the 50-point set contains an 18-point slice, that is, has the form 18+16+16 or 18+17+15 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing contains an 18-point slice, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices. In particular, exactly 6 points of the 50-point set lie in the remaining portion <math>\{2,3\}^4</math> of the cube.<br />
<br />
Suppose that 3333 was in the set; then since all permutations of 3311, 3331 are known to lie in the set, 3322, 3332 must lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lies outside the set. This leaves only 5 points in <math>\{2,3\}^4</math>, a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.<br />
<br />
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>. If a=0 then the set lies in <math>D_{4,1}</math> and we are done. If a=6 then the set is exactly X and we are done. Now suppose a=1,2,3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233, 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<br />
The remaining case is when a=4,5. Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lie in the set. By symmetry we may assume that {2233, 3322} lie in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<math>\Box</math><br />
<br />
== n=5 ==<br />
<br />
:<math>c_5=150</math>:<br />
<br />
'''Lemma 3'''. Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.<br />
<br />
'''Proof'''. By rotation we may work with <math>D_5</math>. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math><br />
<br />
Equality can be attained by removing <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>. Thus <math>c_5 \geq 150</math>.<br />
<br />
Another pattern of 150 points is this: Take the 450 points<br />
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,<br />
then select the 150 whose final coordinate is 1. That gives<br />
this many points in each cube:<br />
<br />
17 18 17<br />
<br />
17 17 18<br />
<br />
12 17 17<br />
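<br />
The counts in this diagram can be reproduced with the Python sketch below. It assumes the nine cubes are indexed by the first two coordinates, with columns running over the first coordinate 1,2,3 and rows over the second coordinate 3,2,1 from top to bottom (the same layout as the pictures of x, y, z and w); the variable names are ours.<br />
<br />
```python
from itertools import permutations, product

# Slices (a,b,c) of the 450-point set in [3]^6.
B6 = {p for t in [(1, 2, 3), (0, 2, 4)] for p in permutations(t)}


def counts(s):
    """Number of 1s, 2s and 3s in the string s."""
    return tuple(s.count(d) for d in (1, 2, 3))


# Strings of length 5 that land in the 450-point set once a final 1 is appended.
pattern = [s for s in product((1, 2, 3), repeat=5) if counts(s + (1,)) in B6]

grid = [
    [sum(1 for s in pattern if s[0] == u and s[1] == v) for u in (1, 2, 3)]
    for v in (3, 2, 1)
]
for row in grid:
    print(*row)
print(len(pattern))
```
<br />
The set is line-free because it is a slice of a line-free subset of <math>{}[3]^6</math>.<br />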
<br />
'''Lemma 4'''. A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contains at least 51 points.<br />
<br />
'''Proof'''. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which force the third slice to have at least <math>151-2c_4 = 47</math> points.<br />
<br />
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.<br />
<br />
Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.<br />
<br />
Let's suppose first that jk=10. The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.<br />
<br />
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.<br />
<br />
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. <br />
<br />
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math><br />
<br />
'''Corollary'''. <math>c_5 \leq 152</math><br />
<br />
'''Proof'''. By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math><br />
<br />
<br />
'''Lemma 5''' Any solution with 151 or more points has a slice with at most 49 points.<br />
<br />
'''Proof''' Suppose we have 151 points without a line, and each of three slices has at least 50 points.<br />
<br />
Using earlier notation, we split subsets of <math>[3]^4</math> into nine subsets of <math>[3]^2</math>. <br />
So we think of x,y,z,a,b and c as subsets of a square. Each slice is one of the following.<br />
*<math>D_4 = y'zx,zx'y,xyz</math> (with one or two points removed)<br />
*<math>D_{4,2} = z'xy,xyz,yzx'</math> (with one or two points removed)<br />
*<math>D_{4,1} = xyz,yz'x,zxy'</math> (with one or two points removed)<br />
*<math>X = xyz, ybw, zwc</math><br />
*<math>Y = axw, xyz, wzc</math><br />
*<math>Z = awx, wby, xyz</math><br />
<br />
where a, b and c have four points each.<br />
<br />
.. 32 33 31 .. 33 .. .. ..<br />
a = .. 22 23 b = .. .. .. c = 21 22 ..<br />
.. .. .. 11 .. 13 11 12 ..<br />
<br />
x', y' and z' are subsets of x, y and z respectively, and have five points each.<br />
<br />
Suppose all three slices are subsets of <math>D_{4,j}</math>. <br />
We can remove at most five points from the full set of three <math>D_{4,j}</math>. <br />
Consider columns 2,3,4,6,7,8. At most two of these columns contain xyz, so one point must be removed from the other four.<br />
This uses up all but one of the removals.<br />
So the slices must be <math>D_{4,2},D_{4,1},D_{4,0}</math> or a cyclic permutation of that.<br />
Then the cube, which contains the first square of slice 1; the fifth square of slice 2; <br />
and the ninth square of slice 3, contains three copies of the same square. <br />
It takes more than one point removed to remove all lines from that cube.<br />
So we can't have all three slices subsets of <math>D_{4,j}</math>.<br />
<br />
Suppose one slice is X,Y or Z, and two others are subsets of <math>D_{4,j}</math>. <br />
We can remove at most three points from the full <math>D_{4,j}</math>.<br />
By symmetry, suppose one slice is X. Consider columns 2,3,4 and 7. They must be cyclic permutations of x,y,z,<br />
and two of them are not xyz, so must lose a point. <br />
Columns 6 and 8 must both lose a point, and we only have 150 points left.<br />
So if one slice is X,Y or Z, the full set contains a line.<br />
<br />
Suppose two slices are from X,Y and Z, and the other is a subset of <math>D_{4,j}</math>. <br />
By symmetry, suppose two slices are X and Y. Columns 3,6,7 and 8 all contain w, and therefore at most 16 points each.<br />
Columns 1,5 and 9 contain a,b, or c, and therefore at most 16 points. <br />
So the total number of points is at most 7*16+2*18 = 148. This contradicts the assumption of 151 points.<br />
<math>\Box</math><br />
<br />
'''Corollary''' <math>c_5 \leq 151 </math><br />
<br />
'''Proof''' By Lemmas 2, 4 and 5, the maximum number of points is 52+50+49=151. <math>\Box</math><br />
<br />
'''Lemma 5.1''' No solution with 151 points contains as a slice the X defined in Lemma 2<br />
<br />
'''Proof''' Suppose one row is X. Another row is <math>D_{4,j}</math>.<br />
<br />
Suppose X is in the first row. Label the other rows with letters from the alphabet.<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
def ghi jkl<br />
<br />
Reslice the array into a left nine, middle nine and right nine. One of these squares<br />
contains 52 points, and it can only be the left nine. One of its three columns contains<br />
18 points, and it can only be its left-hand column, xmd. So m=y and d=z. But none of the <math>D_{4,j}</math> begins with y or z, which is a contradiction. So X is not in the first row.<br />
<br />
So X is in the second or third row. By symmetry, suppose it is in the second row.<br />
<br />
def ghi jkl<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
Again, the left-hand nine must contain 52 points, so it is <math>D_{4,2}</math>.<br />
So either the first row is <math>D_{4,2}</math> or the third row is <math>D_{4,0}</math>.<br />
If the first row is <math>D_{4,2}</math> then the only way to have 50 points in the middle or right-hand nine is if the middle nine is X.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz ybw zwc<br />
<br />
yzx' zwc stu<br />
<br />
In the seventh column, s contains 5 points and in the eighth column, t contains 4 points.<br />
The final row can now contain at most 48 points, and the whole array contains only 52+50+48 = 150 points.<br />
<br />
If the third row is <math>D_{4,0}</math>, then neither the middle nine nor the right-hand nine contains 50 points, by the classification of Lemma 4 and the formulas at the start of Lemma 5.<br />
Again, only 52+49+49 = 150 points are possible.<br />
<br />
A similar argument is possible if X is in the third row; or if X is replaced by Y or Z.<br />
<br />
So when a 151-point set is sliced into three, one slice is <math>D_{4,j}</math> and another slice is 50 points contained in <math>D_{4,k}</math>. <math>\Box</math><br />
<br />
'''Lemma 5.2''' There is no 151-point solution<br />
<br />
'''Proof''' Assume by symmetry that the first row contains 52 points and the second row contains 50.<br />
<br />
If <math>D_{4,1}</math> is in the first row, then the second row must be contained in <math>D_{4,0}</math>. <br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine can contain 52 points, which contradicts the corollary to Lemma 5.<br />
<br />
Suppose the first row contains <math>D_{4,0}</math>. Then the second row is contained in <math>D_{4,2}</math>, otherwise the cubes formed from the nine columns of the diagram would need to remove too many points.<br />
<br />
y'zx zx'y xyz<br />
<br />
z'xy xyz yzx'<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine contains 52 points.<br />
<br />
So the first row contains <math>D_{4,2}</math>, and the second row is contained in <math>D_{4,1}</math>. Two points may be removed from the second row of this diagram.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
def ghi jkl<br />
<br />
Slice it into the left nine, middle nine and right nine. Two of them are contained in <math>D_{4,j}</math><br />
so at least two of def, ghi, and jkl are contained in the corresponding slice of <math>D_{4,0}</math>.<br />
Slice along a different axis, and at least two of dgj,ehk,fil are contained in the corresponding slice of <br />
<math>D_{4,0}</math>. <br />
So eight of the nine squares in the bottom row are contained in the corresponding square of <math>D_{4,0}</math>.<br />
Indeed, slice along other axes, and all points except one are contained within <math>D_{4,0}</math>. <br />
This point is the intersection of all the 49-point slices. <br />
<br />
So, if there is a 151-point solution, then after removal of the specified point, <br />
there is a 150-point solution, within <math>D_{5,j}</math>, whose slices in each direction are 52+50+48.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
One point must be lost from columns 3, 6, 7 and 8, and four more from the major diagonal z'z'z. That leaves 148 points instead of 150.<br />
<br />
So the 150-point solution does not exist with 52+50+48 slices; so the 151 point solution does not exist.<math>\Box</math><br />
<br />
<br />
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.<br />
<br />
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extremisers, with one point per line and different extremisers separated by a line containing “—”.<br />
<br />
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by GNU’s glpsol linear programming solver, which also quickly proves that 150 is the optimum.<br />
<br />
Each variable corresponds to a point in the cube, and the variables are numbered according to the lexicographic ordering of the points. If a variable is 1 then the point is in the set; if it is 0 then it is not in the set.<br />
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.<br />
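<br />
As a rough illustration of how such a program could be assembled, the following sketch lists one constraint per combinatorial line of <math>[3]^5</math>. The helper names <code>lex_index</code> and <code>line_constraints</code> are hypothetical, and numbering the variables from 1 is an assumption; the linked file may use a different numbering.<br />
<br />
```python
from itertools import product

def lex_index(point):
    """Position of a point of [3]^n in lexicographic order (assumed to start at 1)."""
    idx = 0
    for digit in point:
        idx = idx * 3 + (digit - 1)
    return idx + 1

def line_constraints(n):
    """Yield, for each combinatorial line in [3]^n, the indices of its three points."""
    for template in product((1, 2, 3, None), repeat=n):  # None marks a wildcard
        if None not in template:
            continue  # no wildcard: a single point, not a line
        yield tuple(
            lex_index(tuple(v if t is None else t for t in template))
            for v in (1, 2, 3)
        )

constraints = list(line_constraints(5))
# Each triple (i, j, k) stands for the inequality x_i + x_j + x_k <= 2.
print(len(constraints))  # 4**5 - 3**5 = 781 lines
```
<br />
Maximising the sum of the 243 variables subject to these inequalities, with each variable restricted to 0 or 1, is then the integer program described above.<br />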
<br />
== n=6 ==<br />
<br />
:<math>c_6=450</math>:<br />
<br />
The upper bound follows since <math>c_6 \leq 3 c_5</math>. The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).<br />
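<br />
The size of this lower-bound example can be checked with a short computation. The sketch below, with hypothetical helper names, sums the multinomial sizes of the twelve slices and confirms that the chosen triples contain no equilateral triangle <math>(a+r,b,c),(a,b+r,c),(a,b,c+r)</math> with r > 0, which is the condition used on this wiki to guarantee line-freeness.<br />
<br />
```python
from itertools import permutations
from math import factorial

def gamma_size(a, b, c):
    """|Gamma_{a,b,c}| = n!/(a! b! c!) with n = a+b+c."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

def is_triangle_free(B, n):
    """True if B contains no triple (a+r,b,c), (a,b+r,c), (a,b,c+r) with r > 0."""
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                if {(a + r, b, c), (a, b + r, c), (a, b, c + r)} <= B:
                    return False
    return True

B6 = set(permutations((0, 2, 4))) | set(permutations((1, 2, 3)))
print(is_triangle_free(B6, 6))            # True
print(sum(gamma_size(*t) for t in B6))    # 450
```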
<br />
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.<br />
<br />
== n=7 ==<br />
<br />
:<math>1302 \leq c_7 \leq 1348</math>:<br />
<br />
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>. Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.<br />
<br />
The lower bound can be formed by removing 016,106,052,502,151,511,160,610 from <math>D_7</math>.<br />
<br />
'''Lemma 6''' Any line-free subset of <math>D_7</math> has at most 1302 points.<br />
<br />
'''Proof''' Start with the 1458 points of <math>D_7</math>. You must lose:<br />
<br />
* 42 points from (1,2,4),(1,5,1),(4,2,1)<br />
* 42 points from (2,1,4),(2,4,1),(5,1,1)<br />
* 21 points from (0,2,5),(0,5,2),(3,2,2)<br />
* 21 points from (2,0,5),(2,3,2),(5,0,2)<br />
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)<br />
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)<br />
<br />
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.<br />
<math>\Box</math><br />
<br />
== Larger n ==<br />
<br />
The following construction gives lower bounds for the number of triangle-free points. <br />
It yields on the order of <math>2.7 \sqrt{\log(N)/N}3^N</math> points for large N (N ~ 5000).<br />
<br />
It applies when N is a multiple of 3. <br />
* For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M. <br />
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth of the points for N=3M-2 as for N=3M.<br />
<br />
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:<br />
<br />
* <math>c_3</math> from (012) and permutations<br />
* <math>c_6</math> from (123,024) and perms<br />
* <math>c_9</math> from (234,135,045) and perms<br />
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)<br />
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)<br />
<br />
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.<br />
<br />
A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)<br />
<br />
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:<br />
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1) <br />
* There are also eight strings of points, stretching to the edges of the (abc) triangle:<br />
** For N=6K = 3M<br />
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)<br />
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)<br />
<br />
<br />
An alternate construction:<br />
<br />
First define a sequence consisting of all nonnegative integers whose base-3 expansion does not contain the digit 1. Add 1 to all multiples of 3 in this sequence (so that 0 becomes 1). This sequence does not contain a length-3 arithmetic progression.<br />
<br />
It starts 1,2,7,8,19,20,25,26,55, …<br />
<br />
Second, list all the (abc) triples for which the larger two differ by a number
from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1).<br />
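<br />
The sequence used in the first step can be generated and tested numerically. The sketch below uses hypothetical function names and assumes that the enumeration starts at 0, which is what reproduces the leading term 1; it prints the first terms and checks a finite prefix for 3-term arithmetic progressions.<br />
<br />
```python
from itertools import count, islice

def no_ones_in_base3(m):
    """True if the base-3 expansion of m uses only the digits 0 and 2."""
    while m:
        if m % 3 == 1:
            return False
        m //= 3
    return True

def sequence():
    """Yield integers with base-3 digits in {0,2}, with multiples of 3 increased by 1."""
    for m in count(0):  # assumption: start at 0, which yields the leading term 1
        if no_ones_in_base3(m):
            yield m + 1 if m % 3 == 0 else m

def has_3ap(values):
    """True if some x < y < z in values satisfy x + z = 2y."""
    s = set(values)
    return any((x + z) % 2 == 0 and (x + z) // 2 in s
               for x in s for z in s if x < z)

terms = list(islice(sequence(), 64))
print(terms[:9])        # [1, 2, 7, 8, 19, 20, 25, 26, 55]
print(has_3ap(terms))   # False
```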
<br />
== Asymptotics ==<br />
<br />
DHJ(3) is equivalent to the upper bound<br />
<br />
:<math>c_n \leq o(3^n)</math><br />
<br />
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free. From this and the Behrend construction it appears that we have the lower bound<br />
<br />
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}.</math><br />
<br />
More precisely, we have<br />
<br />
:<math>c_n > C 3^{n - 4\sqrt{\log 2}\sqrt{\log n}+\frac 12 \log \log n}</math><br />
for some absolute constant C, and where all logarithms are base-3.<br />
<br />
'''Proof''' For convenience, let n be a multiple of 3. Elkin’s bound gives <math>r_3(\sqrt{n}) > C \sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n})</math>, and let <math>R</math> be a subset of <math>(-3\sqrt{n}/2,3\sqrt{n}/2)</math> without 3-term APs and with size <math>r_3(\sqrt{n})</math>, and with all elements being integer multiples of 3 (again as a matter of convenience). For each <math>r,s\in R</math>, let <math>a = (n-r-s)/3</math>. The set <math>A</math> is the union of all <math>\Gamma_{a,a+r,a+s}</math>. Since all of <math>a, a+r,a+s</math> are between <math>n/3-2\sqrt{n}</math> and <math>n/3+2\sqrt{n}</math>, the size of <math>\Gamma_{a,a+r,a+s}</math> is at least <math>C 3^n / n</math>. Since there are <math>r_3(\sqrt{n})^2</math> choices for r and s, we have a set with size at least<br />
<br />
:<math>C (\sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n}))^2 3^n / n</math>.<br />
<br />
This simplifies to <math>C \sqrt{\log n} \exp_3(n-\alpha \sqrt{\log_3(n)})</math>, where <math>\alpha=4 \sqrt{\log_3(2)}</math>.<br />
<br />
Now suppose that <math>x_i\in \Gamma_{a_i,a_i+r_i,a_i+s_i}</math> is a combinatorial line in the set A. Then <math>(a_i+s_i)-(a_i)=s_i</math> is a 3-term AP contained in R, so the <math>s_i</math> are all the same. Similarly, all of the <math>r_i</math> are the same, and therefore all of the <math>a_i</math> are the same, too. But this implies that the <math>x_i</math> sequence is constant, which means the line is degenerate. <math>\Box</math><br />
<br />
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above gives a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.<br />
<br />
The second large n construction had numerical asymptotics for <math>\log(c_n/3^n)</math> close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.<br />
<br />
== Other k values ==<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_n</math> values for different values of dimension n and edgelength k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{|<br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. So the density of deleted points in the optimal configuration is 1/k when k is prime.<br />
<br />
Let p be the smallest prime greater than or equal to k. One can remove all combinatorial lines by deleting all points whose coordinates sum to <math>0\le x\le p-k</math> (mod p). So the density of deleted points is at most (p-k+1)/p. This approaches zero as <math>k\rightarrow\infty</math>. For example, the following paper shows that for all sufficiently large x there is a prime between <math>x-x^{0.525}</math> and x.<br />
<br />
Baker, R. C.(1-BYU); Harman, G.(4-LNDHB); Pintz, J.(H-AOS)<br />
The difference between consecutive primes. II.<br />
Proc. London Math. Soc. (3) 83 (2001), no. 3, 532–562.<br />
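<br />
The reason the deletion works can be checked residue by residue: a line with w wildcards and fixed-coordinate sum s has point sums s + wt for t = 1, ..., k, so it suffices that these always meet the deleted residues. The sketch below is a conservative check under that reasoning (it lets s range over all residues mod p, which may be more than actually occur); the function names are hypothetical.<br />
<br />
```python
def smallest_prime_at_least(k):
    """Smallest prime p >= k, by trial division."""
    p = max(k, 2)
    while any(p % d == 0 for d in range(2, int(p ** 0.5) + 1)):
        p += 1
    return p

def deletion_hits_every_line(n, k):
    """Check that every combinatorial line in [k]^n meets the deleted residues.

    A line with w wildcards and fixed-coordinate sum s has point sums
    s + w*t for t = 1..k, so only w and s mod p matter.
    """
    p = smallest_prime_at_least(k)
    deleted = set(range(p - k + 1))  # residues 0..p-k mod p
    return all(
        any((s + w * t) % p in deleted for t in range(1, k + 1))
        for w in range(1, n + 1)
        # with w = n there are no fixed coordinates, so the fixed sum is 0
        for s in (range(p) if w < n else [0])
    )

for n, k in [(3, 5), (3, 4), (6, 5)]:
    print(n, k, deletion_hits_every_line(n, k))
```
<br />
The last case, with n larger than k, illustrates why the dimension has to be small relative to p: a line with exactly p wildcards has constant sum mod p and can avoid the deleted residues.<br />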
<br />
== Numerical methods ==<br />
<br />
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here]. The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Upper_and_lower_boundsUpper and lower bounds2009-03-24T13:30:56Z<p>121.220.134.232: /* Other k values */</p>
<hr />
<div><center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center><br />
<br />
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762]). A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here]. In this page we record the proofs justifying these bounds.<br />
<br />
<br />
{|<br />
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]<br />
|}<br />
<br />
== Basic constructions ==<br />
<br />
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is<br />
<br />
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)<br />
<br />
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>. The only lines in <math>D_n</math> are those with<br />
<br />
# A number of wildcards equal to a multiple of three;<br />
# The number of 1s unequal to the number of 2s modulo 3.<br />
<br />
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points. We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as<br />
<br />
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')<br />
<br />
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).<br />
<br />
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid<br />
<br />
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \},</math> (2)<br />
<br />
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s. Note that<br />
<br />
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)<br />
<br />
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set<br />
<br />
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)<br />
<br />
is line-free and has cardinality<br />
<br />
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)<br />
<br />
and thus provides a lower bound for <math>c_n</math>:<br />
<br />
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)<br />
<br />
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set of B and applying (6). Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.<br />
<br />
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)<br />
<br />
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal). But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.<br />
<br />
== The basic upper bound ==<br />
<br />
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound<br />
<br />
:<math>c_{n+1} \leq 3 c_n.</math> (7)<br />
<br />
Note that equality only occurs if one can find an <math>n+1</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.<br />
<br />
== n=0 ==<br />
<br />
:<math>c_0=1</math>:<br />
<br />
This is clear.<br />
<br />
== n=1 ==<br />
<br />
:<math>c_1=2</math>:<br />
<br />
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.<br />
<br />
== n=2 ==<br />
<br />
:<math>c_2=6</math>:<br />
<br />
There are four six-element sets in <math>[3]^2</math> which are line-free, which we denote <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math> and are displayed graphically as follows.<br />
<br />
13 .. 33 .. 23 33 13 23 .. 13 23 ..<br />
x = 12 22 .. y = 12 .. 32 z = .. 22 32 w = 12 .. 32<br />
.. 21 31 11 21 .. 11 .. 31 .. 21 31<br />
<br />
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.<br />
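<br />
For such a small cube the claim can also be confirmed by exhaustive search. The following sketch, with hypothetical helper names, enumerates the combinatorial lines of <math>[3]^n</math> and counts the line-free subsets of a given size.<br />
<br />
```python
from itertools import combinations, product

def lines(n):
    """All combinatorial lines in [3]^n, each as a tuple of three points."""
    result = []
    for template in product((1, 2, 3, None), repeat=n):  # None = wildcard
        if None in template:
            result.append(tuple(
                tuple(v if t is None else t for t in template) for v in (1, 2, 3)
            ))
    return result

def line_free_sets(n, size):
    """All line-free subsets of [3]^n with the given number of points."""
    points = list(product((1, 2, 3), repeat=n))
    all_lines = lines(n)
    found = []
    for subset in combinations(points, size):
        chosen = set(subset)
        if not any(set(line) <= chosen for line in all_lines):
            found.append(chosen)
    return found

print(len(line_free_sets(2, 6)))  # 4
print(len(line_free_sets(2, 7)))  # 0
```
<br />
This approach is only practical for very small n, since the number of subsets grows far too quickly beyond that.<br />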
<br />
== n=3 ==<br />
<br />
:<math>c_3=18</math>:<br />
<br />
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.<br />
<br />
'''Lemma 1.'''<br />
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.<br />
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,1}=zxy</math>.<br />
<br />
'''Proof'''. We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, contradiction. By symmetry we may now assume that two of the slices are x and y, which force the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed. <br />
<br />
The first claim follows by a similar argument to the second.<br />
<math>\Box</math><br />
<br />
== n=4 ==<br />
<br />
:<math>c_4=52</math>:<br />
<br />
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is <math>18+17+17=52</math> which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ zxy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yzx</math>. In fact,<br />
<br />
'''Lemma 2.'''<br />
<br />
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 50-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2, or are equal to one of the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.<br />
<br />
'''Proof''' It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. Similarly when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math>.<br />
<br />
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.<br />
<br />
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.<br />
<br />
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.<br />
<br />
Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1331} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3331} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{3,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=12. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.<br />
<br />
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when all slicings of the 50-point set are of the form 18+16+16 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+16+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices. In particular, exactly 6 points of the 50-point set lie in the remaining portion <math>\{2,3\}^4</math> of the cube.<br />
<br />
Suppose that 3333 were in the set. Since all permutations of 3311 and 3331 are known to lie in the set, all permutations of 3322 and 3332 must then lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lies outside the set. This leaves at most 5 points in <math>\{2,3\}^4</math>, a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.<br />
<br />
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>. If a=0 then the set lies in <math>D_{4,1}</math> and we are done. If a=6 then the set is exactly X and we are done. Now suppose a is 1, 2 or 3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233, 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<br />
The remaining case is when a is 4 or 5. Then at least one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lies entirely in the set. By symmetry we may assume that 2233 and 3322 both lie in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<math>\Box</math><br />
<br />
== n=5 ==<br />
<br />
:<math>c_5=150</math>:<br />
<br />
'''Lemma 3'''. Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.<br />
<br />
'''Proof'''. By rotation we may work with <math>D_5</math>. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math><br />
<br />
Equality can be attained by removing <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>. Thus <math>c_5 \geq 150</math>.<br />
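The construction above can be checked by brute force. The following Python sketch builds <math>D_5</math>, removes the four slices, and tests every combinatorial line; the helper names (lines, is_line_free, counts) are illustrative choices and do not come from the original page.

```python
from itertools import product

def lines(n, k=3):
    """Yield each combinatorial line of [k]^n as a tuple of k points."""
    alphabet = list(range(1, k + 1))
    for template in product(alphabet + ["*"], repeat=n):
        if "*" in template:  # a line needs at least one wildcard
            yield tuple(
                tuple(v if t == "*" else t for t in template)
                for v in alphabet
            )

def is_line_free(points, n, k=3):
    """True if no combinatorial line lies entirely inside `points`."""
    pts = set(points)
    return not any(all(p in pts for p in line) for line in lines(n, k))

def counts(p):
    """(a, b, c) = numbers of 1s, 2s, 3s, i.e. the slice Gamma_{a,b,c} holding p."""
    return (p.count(1), p.count(2), p.count(3))

n = 5
D5 = [p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != 0]
removed = {(0, 4, 1), (0, 5, 0), (4, 0, 1), (5, 0, 0)}
A = [p for p in D5 if counts(p) not in removed]

print(len(D5), len(A), is_line_free(A, n))
```

The same two helpers can be reused to test any other candidate subset of <math>[3]^n</math> for small n, since the number of lines is only <math>4^n-3^n</math>.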
<br />
Another pattern of 150 points is this: Take the 450 points<br />
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,<br />
then select the 150 whose final coordinate is 1. That gives<br />
this many points in each cube:<br />
<br />
17 18 17<br />
<br />
17 17 18<br />
<br />
12 17 17<br />
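The table of cube sizes can be reproduced with a short Python sketch. It assumes that the nine cubes are indexed by the first two coordinates and laid out like the two-dimensional diagrams on this page (second coordinate 3 on the top row); the page does not state this convention, and the function names are hypothetical.

```python
from itertools import product

ALLOWED = {(1, 2, 3), (0, 2, 4)}  # slice types of the 450-point set, up to permutation

def in_450_set(p):
    """True if the 6-digit string p has digit counts that permute (1,2,3) or (0,2,4)."""
    return tuple(sorted(p.count(d) for d in (1, 2, 3))) in ALLOWED

# Keep the 5-digit prefixes of the points whose final coordinate is 1.
pattern = [p for p in product((1, 2, 3), repeat=5) if in_450_set(p + (1,))]

def cube_counts(points):
    """3x3 grid of counts indexed by the first two coordinates.

    Assumed layout: columns follow the first coordinate (1, 2, 3 from the left),
    rows follow the second coordinate (3 on top, 1 at the bottom).
    """
    return [
        [sum(1 for p in points if p[0] == i and p[1] == j) for i in (1, 2, 3)]
        for j in (3, 2, 1)
    ]

for row in cube_counts(pattern):
    print(*row)
```

Because the counts are symmetric under swapping the two indexing coordinates, the choice of which coordinate labels the columns does not affect the printed grid.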
<br />
'''Lemma 4'''. A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contain at least 51 points.<br />
<br />
'''Proof'''. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which force the third slice to have at least <math>151-2c_4 = 47</math> points.<br />
<br />
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.<br />
<br />
Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.<br />
<br />
Let's suppose first that jk=10. The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.<br />
<br />
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.<br />
<br />
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. <br />
<br />
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math><br />
<br />
'''Corollary'''. <math>c_5 \leq 152</math><br />
<br />
'''Proof'''. By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math><br />
<br />
<br />
'''Lemma 5''' Any solution with 151 or more points has a slice with at most 49 points.<br />
<br />
'''Proof''' Suppose we have 151 points without a line, and each of three slices has at least 50 points.<br />
<br />
Using earlier notation, we split subsets of <math>[3]^4</math> into nine subsets of <math>[3]^2</math>. <br />
So we think of x,y,z,a,b and c as subsets of a square. Each slice is one of the following.<br />
*<math>D_4 = y'zx,zx'y,xyz</math> (with one or two points removed)<br />
*<math>D_{4,2} = z'xy,xyz,yzx'</math> (with one or two points removed)<br />
*<math>D_{4,1} = xyz,yz'x,zxy'</math> (with one or two points removed)<br />
*<math>X = xyz, ybw, zwc</math><br />
*<math>Y = axw, xyz, wzc</math><br />
*<math>Z = awx, wby, xyz</math><br />
<br />
where a, b and c have four points each.<br />
<br />
.. 32 33 31 .. 33 .. .. ..<br />
a = .. 22 23 b = .. .. .. c = 21 22 ..<br />
.. .. .. 11 .. 13 11 12 ..<br />
<br />
x', y' and z' are subsets of x, y and z respectively, and have five points each.<br />
<br />
Suppose all three slices are subsets of <math>D_{4,j}</math>. <br />
We can remove at most five points from the full set of three <math>D_{4,j}</math>. <br />
Consider columns 2,3,4,6,7,8. At most two of these columns contain xyz, so one point must be removed from the other four.<br />
This uses up all but one of the removals.<br />
So the slices must be <math>D_{4,2},D_{4,1},D_{4,0}</math> or a cyclic permutation of that.<br />
Then the cube, which contains the first square of slice 1; the fifth square of slice 2; <br />
and the ninth square of slice 3, contains three copies of the same square. <br />
It takes more than one point removed to remove all lines from that cube.<br />
So we can't have all three slices subsets of <math>D_{4,j}</math>.<br />
<br />
Suppose one slice is X,Y or Z, and two others are subsets of <math>D_{4,j}</math>. <br />
We can remove at most three points from the full <math>D_{4,j}</math>.<br />
By symmetry, suppose one slice is X. Consider columns 2,3,4 and 7. They must be cyclic permutations of x,y,z,<br />
and two of them are not xyz, so must lose a point. <br />
Columns 6 and 8 must both lose a point, and we only have 150 points left.<br />
So if one slice is X,Y or Z, the full set contains a line.<br />
<br />
Suppose two slices are from X,Y and Z, and the other is a subset of <math>D_{4,j}</math>. <br />
By symmetry, suppose two slices are X and Y. Columns 3,6,7 and 8 all contain w, and therefore at most 16 points each.<br />
Columns 1,5 and 9 contain a,b, or c, and therefore at most 16 points. <br />
So the total number of points is at most 7*16+2*18 = 148. This contradicts the assumption of 151 points.<br />
<math>\Box</math><br />
<br />
'''Corollary''' <math>c_5 \leq 151 </math><br />
<br />
'''Proof''' By Lemmas 4 and 5, together with the bound <math>c_4=52</math>, the maximum number of points is 52+50+49=151. <math>\Box</math><br />
<br />
'''Lemma 5.1''' No solution with 151 points contains as a slice the X defined in Lemma 2<br />
<br />
'''Proof''' Suppose one row is X. Another row is <math>D_{4,j}</math>.<br />
<br />
Suppose X is in the first row. Label the other rows with letters from the alphabet.<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
def ghi jkl<br />
<br />
Reslice the array into a left nine, middle nine and right nine. One of these squares<br />
contains 52 points, and it can only be the left nine. One of its three columns contains<br />
18 points, and it can only be its left-hand column, xmd. So m=y and d=z. But none of the <math>D_{4,j}</math> begins with y or z, which is a contradiction. So X is not in the first row.<br />
<br />
So X is in the second or third row. By symmetry, suppose it is in the second row<br />
<br />
def ghi jkl<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
Again, the left-hand nine must contain 52 points, so it is <math>D_{4,2}</math>.<br />
So either the first row is <math>D_{4,2}</math> or the third row is <math>D_{4,0}</math>.<br />
If the first row is <math>D_{4,2}</math> then the only way to have 50 points in the middle or right-hand nine is if the middle nine is X<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz ybw zwc<br />
<br />
yzx' zwc stu<br />
<br />
In the seventh column, s contains 5 points and in the eighth column, t contains 4 points.<br />
The final row can now contain at most 48 points, and the whole array contains only 52+50+48 = 150 points.<br />
<br />
If the third row is <math>D_{4,0}</math>, then neither the middle nine nor the right-hand nine contains 50 points, by the classification of Lemma 4 and the formulas at the start of Lemma 5.<br />
Again, only 52+49+49 = 150 points are possible.<br />
<br />
A similar argument is possible if X is in the third row; or if X is replaced by Y or Z.<br />
<br />
So when a 151-point set is sliced into three, one slice is <math>D_{4,j}</math> and another slice is 50 points contained in <math>D_{4,k}</math>. <math>\Box</math><br />
<br />
'''Lemma 5.2''' There is no 151-point solution<br />
<br />
'''Proof''' Assume by symmetry that the first row contains 52 points and the second row contains 50.<br />
<br />
If <math>D_{4,1}</math> is in the first row, then the second row must be contained in <math>D_{4,0}</math>. <br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine can contain 52 points, which contradicts the corollary to Lemma 5.<br />
<br />
Suppose the first row contains <math>D_{4,0}</math>. Then the second row is contained in <math>D_{4,2}</math>, otherwise the cubes formed from the nine columns of the diagram would need to remove too many points.<br />
<br />
y'zx zx'y xyz<br />
<br />
z'xy xyz yzx'<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine contains 52 points.<br />
<br />
So the first row contains <math>D_{4,2}</math>, and the second row is contained in <math>D_{4,1}</math>. Two points may be removed from the second row of this diagram.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
def ghi jkl<br />
<br />
Slice it into the left nine, middle nine and right nine. Two of them are contained in <math>D_{4,j}</math><br />
so at least two of def, ghi, and jkl are contained in the corresponding slice of <math>D_{4,0}</math>.<br />
Slice along a different axis, and at least two of dgj,ehk,fil are contained in the corresponding slice of <br />
<math>D_{4,0}</math>. <br />
So eight of the nine squares in the bottom row are contained in the corresponding square of <math>D_{4,0}</math>.<br />
Indeed, slice along other axes, and all points except one are contained within <math>D_{4,0}</math>. <br />
This point is the intersection of all the 49-point slices. <br />
<br />
So, if there is a 151-point solution, then after removal of the specified point, <br />
there is a 150-point solution, within <math>D_{5,j}</math>, whose slices in each direction are 52+50+48.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
One point must be lost from columns 3, 6, 7 and 8, and four more from the major diagonal z'z'z. That leaves 148 points instead of 150.<br />
<br />
So the 150-point solution does not exist with 52+50+48 slices; so the 151 point solution does not exist.<math>\Box</math><br />
<br />
<br />
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.<br />
<br />
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extremisers, with one point per line and different extremisers separated by a line containing “—”.<br />
<br />
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by GNU’s glpsol linear programming solver, which also quickly proves that 150 is the optimum.<br />
<br />
Each variable corresponds to a point in the cube, with the points numbered in lexicographic order. If a variable is 1 then the point is in the set; if it is 0 then it is not in the set.<br />
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.<br />
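The model described above can be written down mechanically. The Python sketch below emits such a 0/1 programme as text in CPLEX LP style; the variable names x1, x2, ..., the row names, and the function build_lp are assumptions made for illustration, and the output is not claimed to match the linked file or to have been run through glpsol.

```python
from itertools import product

def build_lp(n, k=3):
    """Return a 0/1 programme for the largest line-free subset of [k]^n as LP-style text."""
    alphabet = list(range(1, k + 1))
    points = list(product(alphabet, repeat=n))         # lexicographic order
    index = {p: i + 1 for i, p in enumerate(points)}   # point -> variable number

    rows = []
    for template in product(alphabet + ["*"], repeat=n):
        if "*" not in template:
            continue
        line = [tuple(v if t == "*" else t for t in template) for v in alphabet]
        # At least one point of each line is missing: its variables sum to at most k-1.
        lhs = " + ".join(f"x{index[p]}" for p in line)
        rows.append(f" line{len(rows) + 1}: {lhs} <= {k - 1}")

    names = [f"x{i}" for i in range(1, len(points) + 1)]
    # Wrap long expressions ten variables per text line (an arbitrary choice).
    chunks = [names[i:i + 10] for i in range(0, len(names), 10)]
    objective = " size: " + "\n   + ".join(" + ".join(c) for c in chunks)
    binaries = "\n".join(" " + " ".join(c) for c in chunks)
    return "\n".join(["Maximize", objective, "Subject To", *rows, "Binary", binaries, "End"])

print(build_lp(2))
```

For n=5 this yields 243 variables and <math>4^5-3^5=781</math> inequalities, one per combinatorial line.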
<br />
== n=6 ==<br />
<br />
:<math>c_6=450</math>:<br />
<br />
The upper bound follows since <math>c_6 \leq 3 c_5</math>. The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).<br />
<br />
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.<br />
<br />
== n=7 ==<br />
<br />
:<math>1302 \leq c_7 \leq 1348</math>:<br />
<br />
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>. Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.<br />
<br />
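The lower bound can be formed by removing 016,106,052,502,151,511,160,610,070,700 from <math>D_7</math>, where abc is shorthand for <math>\Gamma_{a,b,c}</math>. These ten slices contain 156 points in total, leaving 1458-156=1302 points.<br />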
<br />
'''Lemma 6''' Any line-free subset of <math>D_7</math> has at most 1302 points.<br />
<br />
'''Proof''' Start with the 1458 points of <math>D_7</math>. You must lose:<br />
<br />
* 42 points from (1,2,4),(1,5,1),(4,2,1)<br />
* 42 points from (2,1,4),(2,4,1),(5,1,1)<br />
* 21 points from (0,2,5),(0,5,2),(3,2,2)<br />
* 21 points from (2,0,5),(2,3,2),(5,0,2)<br />
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)<br />
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)<br />
<br />
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.<br />
<math>\Box</math><br />
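The count in Lemma 6 and the matching lower bound can be checked on the triangular grid rather than on the cube. The Python sketch below assumes the removed slices are the eight listed for the lower bound together with <math>\Gamma_{0,7,0}</math> and <math>\Gamma_{7,0,0}</math>, the two single-point slices that also appear in the lists of Lemma 6; the helper names are hypothetical.

```python
from math import factorial

def gamma_size(a, b, c):
    """Number of strings with a 1s, b 2s and c 3s."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

def grid(n):
    """All triples (a, b, c) of non-negative integers with a + b + c = n."""
    return [(a, b, n - a - b) for a in range(n + 1) for b in range(n + 1 - a)]

def triangle_free(B):
    """True if B contains no triangle (a+r,b,c), (a,b+r,c), (a,b,c+r) with r > 0."""
    B = set(B)
    for (x, b, c) in B:              # treat (x, b, c) as the corner (a+r, b, c)
        for r in range(1, x + 1):
            a = x - r
            if (a, b + r, c) in B and (a, b, c + r) in B:
                return False
    return True

n = 7
B7 = {t for t in grid(n) if (t[0] - t[1]) % 3 != 0}   # D_7 written as a union of slices
eight = {(0, 1, 6), (1, 0, 6), (0, 5, 2), (5, 0, 2),
         (1, 5, 1), (5, 1, 1), (1, 6, 0), (6, 1, 0)}
removed = eight | {(0, 7, 0), (7, 0, 0)}
B = B7 - removed

print(sum(gamma_size(*t) for t in B7), sum(gamma_size(*t) for t in B), triangle_free(B))
```

In this sketch, removing only the eight slices leaves the triangle (7,0,0), (4,3,0), (4,0,3) in place, which is why the two corner slices are removed as well.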
<br />
== Larger n ==<br />
<br />
The following construction gives lower bounds for the number of triangle-free points.<br />
Numerically it yields of the order of <math>2.7 \sqrt{\log(N)/N}\,3^N</math> points for large N (N ~ 5000).<br />
<br />
It applies when N is a multiple of 3. <br />
* For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M. <br />
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth as many points for N=3M-2 as for N=3M.<br />
<br />
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:<br />
<br />
* <math>c_3</math> from (012) and permutations<br />
* <math>c_6</math> from (123,024) and perms<br />
* <math>c_9</math> from (234,135,045) and perms<br />
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)<br />
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)<br />
<br />
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.<br />
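The sizes of the sets listed above follow from the multinomial formula for <math>|\Gamma_{a,b,c}|</math>. A minimal Python sketch, with the dictionary of generators transcribed from the list (A=10, B=11) and with hypothetical function names:

```python
from itertools import permutations
from math import factorial

def gamma_size(a, b, c):
    """Number of strings with a 1s, b 2s and c 3s."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

def lower_bound(generators):
    """Total size of the slices Gamma_{a,b,c} over all permutations of the generators."""
    B = {perm for g in generators for perm in permutations(g)}
    return sum(gamma_size(*t) for t in B)

constructions = {
    3: [(0, 1, 2)],
    6: [(1, 2, 3), (0, 2, 4)],
    9: [(2, 3, 4), (1, 3, 5), (0, 4, 5)],
    12: [(3, 4, 5), (2, 4, 6), (1, 5, 6), (0, 2, 10), (0, 5, 7)],
    15: [(4, 5, 6), (3, 5, 7), (2, 6, 7), (1, 3, 11), (1, 6, 8), (0, 4, 11), (0, 7, 8)],
}

for n, gens in constructions.items():
    print(n, lower_bound(gens))
```

This only counts points; it does not check that the union of slices is triangle-free, which would need a separate test on the triangular grid.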
<br />
A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)<br />
<br />
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:<br />
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1) <br />
* There are also eight strings of points, stretching to the edges of the (abc) triangle:<br />
** For N=6K = 3M<br />
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)<br />
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)<br />
<br />
<br />
An alternate construction:<br />
<br />
First define a sequence consisting of all non-negative integers which, in base 3, do not contain a 1. Add 1 to all multiples of 3 in this sequence. The resulting sequence does not contain a length-3 arithmetic progression.<br />
<br />
It starts 1,2,7,8,19,20,25,26,55, …<br />
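The sequence can be generated directly from its definition. The Python sketch below starts the enumeration at 0, which is an assumption needed to reproduce the leading term 1, and checks the absence of 3-term arithmetic progressions by brute force on an initial segment; the function names are illustrative.

```python
def has_no_one_in_base3(x):
    """True if the base-3 expansion of x uses only the digits 0 and 2."""
    while x:
        if x % 3 == 1:
            return False
        x //= 3
    return True

def sequence(count):
    """First `count` terms: base-3 digits in {0,2}, with 1 added to multiples of 3."""
    out = []
    x = 0
    while len(out) < count:
        if has_no_one_in_base3(x):
            out.append(x + 1 if x % 3 == 0 else x)
        x += 1
    return out

def has_three_term_ap(values):
    """True if some a < b < c in `values` satisfy a + c = 2b."""
    s = set(values)
    return any(2 * b - a in s for a in s for b in s if a < b)

terms = sequence(64)
print(terms[:9], has_three_term_ap(terms))
```

Equivalently, the terms are the integers whose base-3 digits lie in {0,2} except for the last digit, which lies in {1,2}.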
<br />
Second, list all the (abc) triples for which the larger two differ by a number<br />
from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1)<br />
<br />
== Asymptotics ==<br />
<br />
DHJ(3) is equivalent to the upper bound<br />
<br />
:<math>c_n = o(3^n).</math><br />
<br />
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free. From this and the Behrend construction it appears that we have the lower bound<br />
<br />
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}.</math><br />
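The reason the set is line-free is that the three points of a line with r wildcards have values of a+2b equal to v+r, v+2r and v, which form a 3-term progression. A small Python sketch for n=4 illustrates this; the greedy choice of S and the decision to let S contain 0 are conveniences of the sketch, not part of the argument above.

```python
from itertools import product

def greedy_ap_free(limit):
    """Greedily pick integers in [0, limit] avoiding 3-term arithmetic progressions."""
    S = []
    for x in range(limit + 1):
        # x is rejected if it would complete a progression z, y, x with z = 2y - x in S
        if not any(2 * y - x in S for y in S):
            S.append(x)
    return S

def lines(n):
    """Yield each combinatorial line of [3]^n as a tuple of three points."""
    for template in product((1, 2, 3, "*"), repeat=n):
        if "*" in template:
            yield tuple(tuple(v if t == "*" else t for t in template) for v in (1, 2, 3))

def is_line_free(points, n):
    pts = set(points)
    return not any(all(p in pts for p in line) for line in lines(n))

n = 4
S = set(greedy_ap_free(2 * n))    # a + 2b never exceeds 2n
# a = number of 1s, b = number of 2s; keep the points with a + 2b in S
A = [p for p in product((1, 2, 3), repeat=n) if p.count(1) + 2 * p.count(2) in S]

print(sorted(S), len(A), is_line_free(A, n))
```

For such small n the greedy set is far from the Behrend construction, so the sketch only illustrates the mechanism and not the asymptotic bound.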
<br />
More precisely, we have<br />
<br />
:<math>c_n > C 3^{n - 4\sqrt{\log 2}\sqrt{\log n}+\frac 12 \log \log n}</math><br />
for some absolute constant C, and where all logarithms are base-3.<br />
<br />
'''Proof''' For convenience, let n be a multiple of 3. Elkin’s bound gives <math>r_3(\sqrt{n}) > C \sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n})</math>, and let <math>R</math> be a subset of <math>(-3\sqrt{n}/2,3\sqrt{n}/2)</math> without 3-term APs and with size <math>r_3(\sqrt{n})</math>, and with all elements being integer multiples of 3 (again as a matter of convenience). For each <math>r,s\in R</math>, let <math>a = (n-r-s)/3</math>. The set <math>A</math> is the union of all <math>\Gamma_{a,a+r,a+s}</math>. Since all of <math>a, a+r,a+s</math> are between <math>n/3-2\sqrt{n}</math> and <math>n/3+2\sqrt{n}</math>, the size of <math>\Gamma_{a,a+r,a+s}</math> is at least <math>C 3^n / n</math>. Since there are <math>r_3(\sqrt{n})^2</math> choices for r and s, we have a set with size at least<br />
<br />
:<math>C (\sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n}))^2 3^n / n</math>.<br />
<br />
This simplifies to <math>C \sqrt{\log n} \exp_3(n-\alpha \sqrt{\log_3(n)})</math>, where <math>\alpha=4 \sqrt{\log_3(2)}</math>.<br />
<br />
Now suppose that <math>x_i\in \Gamma_{a_i,a_i+r_i,a_i+s_i}</math> is a combinatorial line in the set A. Then <math>(a_i+s_i)-(a_i)=s_i</math> is a 3-term AP contained in R, so the <math>s_i</math> are all the same. Similarly, all of the <math>r_i</math> are the same, and therefore all of the <math>a_i</math> are the same, too. But this implies that the <math>x_i</math> sequence is constant, which means the line is degenerate. <math>\Box</math><br />
<br />
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above gives a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.<br />
<br />
The second large n construction had numerical asymptotics for <math>\log(c_n/3^n)</math> close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.<br />
<br />
== Other k values ==<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_n</math> values for different values of dimension n and edgelength k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{|<br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
If k is prime and <math>k \ge n</math>, then one can remove all combinatorial lines by deleting all points whose coordinates sum to a multiple of k. This leaves <math>(k-1)k^{n-1}</math> points, so such pairs attain the upper bound.<br />
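This deletion rule is easy to test for small cases. The Python sketch below reads the hypothesis as "k is prime and at least the dimension n" and uses the alphabet 1,...,k; the function names are hypothetical. It also shows a non-prime case where the rule fails.

```python
from itertools import product

def sum_construction(n, k):
    """Points of [k]^n whose coordinate sum is not a multiple of k."""
    return [p for p in product(range(1, k + 1), repeat=n) if sum(p) % k != 0]

def is_line_free(points, n, k):
    """Brute-force test over all (k+1)^n - k^n combinatorial lines of [k]^n."""
    pts = set(points)
    alphabet = list(range(1, k + 1))
    for template in product(alphabet + ["*"], repeat=n):
        if "*" not in template:
            continue
        line = [tuple(v if t == "*" else t for t in template) for v in alphabet]
        if all(p in pts for p in line):
            return False
    return True

for n, k in [(2, 3), (3, 5), (4, 5), (2, 7)]:
    A = sum_construction(n, k)
    print(n, k, len(A), is_line_free(A, n, k))

# k = 4 is not prime: the line 1**, with points 111, 122, 133, 144, survives the deletion.
print(is_line_free(sum_construction(3, 4), 3, 4))
```

The sizes printed for the prime cases agree with the corresponding entries 6, 100, 500 and 42 of the table.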
<br />
== Numerical methods ==<br />
<br />
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here]. The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Upper_and_lower_boundsUpper and lower bounds2009-03-23T20:20:11Z<p>121.220.134.232: /* Other k values */</p>
<hr />
<div><center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center><br />
<br />
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762]). A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here]. In this page we record the proofs justifying these bounds.<br />
<br />
<br />
{|<br />
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]<br />
|}<br />
<br />
== Basic constructions ==<br />
<br />
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is<br />
<br />
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)<br />
<br />
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>. The only lines in <math>D_n</math> are those with<br />
<br />
# A number of wildcards equal to a multiple of three;<br />
# The number of 1s unequal to the number of 2s modulo 3.<br />
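The two conditions can be confirmed by enumeration for a small case. The Python sketch below lists every combinatorial line contained in <math>D_4</math> and compares it with the prediction, counting the 1s and 2s among the fixed (non-wildcard) coordinates; the names points_on and predicted are illustrative.

```python
from itertools import product

n = 4
D = {p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != 0}

def points_on(template):
    """The three points of the line described by a template over {1, 2, 3, '*'}."""
    return [tuple(v if t == "*" else t for t in template) for v in (1, 2, 3)]

def predicted(template):
    """Both conditions: wildcards a multiple of 3, and #1s != #2s mod 3 among fixed digits."""
    wild = template.count("*")
    return wild % 3 == 0 and (template.count(1) - template.count(2)) % 3 != 0

templates = [t for t in product((1, 2, 3, "*"), repeat=n) if "*" in t]
lines_in_D = [t for t in templates if all(p in D for p in points_on(t))]

print(len(lines_in_D), lines_in_D == [t for t in templates if predicted(t)])
```

For n=4 the surviving lines are exactly those with three wildcards and a single fixed coordinate equal to 1 or 2.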
<br />
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points. We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as<br />
<br />
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')<br />
<br />
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).<br />
<br />
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid<br />
<br />
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \}</math>, (2)<br />
<br />
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s. Note that<br />
<br />
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)<br />
<br />
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set<br />
<br />
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)<br />
<br />
is line-free and has cardinality<br />
<br />
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)<br />
<br />
and thus provides a lower bound for <math>c_n</math>:<br />
<br />
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)<br />
<br />
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set of B and applying (6). Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.<br />
<br />
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)<br />
<br />
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal). But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.<br />
<br />
== The basic upper bound ==<br />
<br />
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound<br />
<br />
:<math>c_{n+1} \leq 3 c_n.</math> (7)<br />
<br />
Note that equality only occurs if one can find an <math>n+1</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.<br />
<br />
== n=0 ==<br />
<br />
:<math>c_0=1</math>:<br />
<br />
This is clear.<br />
<br />
== n=1 ==<br />
<br />
:<math>c_1=2</math>:<br />
<br />
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.<br />
<br />
== n=2 ==<br />
<br />
:<math>c_2=6</math>:<br />
<br />
There are four six-element sets in <math>[3]^2</math> which are line-free. We denote them <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math>, and display them graphically as follows.<br />
<br />
13 .. 33 .. 23 33 13 23 .. 13 23 ..<br />
x = 12 22 .. y = 12 .. 32 z = .. 22 32 w = 12 .. 32<br />
.. 21 31 11 21 .. 11 .. 31 .. 21 31<br />
<br />
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.<br />
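The case n=2 is small enough to enumerate completely. The following Python sketch lists all line-free subsets of <math>[3]^2</math> with six or seven points; the variable names are illustrative only.

```python
from itertools import combinations, product

points = list(product((1, 2, 3), repeat=2))

# The seven combinatorial lines of [3]^2: templates over {1, 2, 3, '*'} with a wildcard.
lines = [
    [tuple(v if t == "*" else t for t in template) for v in (1, 2, 3)]
    for template in product((1, 2, 3, "*"), repeat=2)
    if "*" in template
]

def line_free(S):
    S = set(S)
    return not any(all(p in S for p in line) for line in lines)

six = [set(S) for S in combinations(points, 6) if line_free(S)]
seven = [set(S) for S in combinations(points, 7) if line_free(S)]

# The sets named on this page: x = D_{2,2}, y = D_{2,1}, z = D_2, and w.
x, y, z = ({p for p in points if sum(p) % 3 != j} for j in (2, 1, 0))
w = {p for p in points if p[0] != p[1]}

print(len(six), len(seven), all(s in six for s in (x, y, z, w)))
```

The complement of each six-point example is a set of three points meeting every row, every column and the diagonal {11,22,33}, which explains why there are exactly four of them.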
<br />
== n=3 ==<br />
<br />
:<math>c_3=18</math>:<br />
<br />
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.<br />
<br />
'''Lemma 1.'''<br />
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.<br />
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,1}=zxy</math>.<br />
<br />
'''Proof'''. We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, contradiction. By symmetry we may now assume that two of the slices are x and y, which force the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed. <br />
<br />
The first claim follows by a similar argument to the second.<br />
<math>\Box</math><br />
<br />
== n=4 ==<br />
<br />
:<math>c_4=52</math>:<br />
<br />
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is <math>18+17+17=52</math> which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ zxy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yzx</math>. In fact,<br />
<br />
'''Lemma 2.'''<br />
<br />
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 50-element line-free sets in <math>[3]^4</math> are either formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2, or are equal to one of the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.<br />
<br />
'''Proof''' It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. Similarly when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math><br />
<br />
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.<br />
<br />
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.<br />
<br />
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.<br />
<br />
Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1311} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3311} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{4,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=12. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.<br />
<br />
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when all slicings of the 50-point set are of the form 18+16+16 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+16+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices. In particular, exactly 6 points of the 50-point set lie in the remaining portion <math>\{2,3\}^4</math> of the cube.<br />
<br />
Suppose that 3333 were in the set. Since all permutations of 3311 and 3331 are known to lie in the set, all permutations of 3322 and 3332 must lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lies outside the set. This leaves only 5 points in <math>\{2,3\}^4</math>, a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.<br />
<br />
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>. If a=0 then the set lies in <math>D_{4,1}</math> and we are done. If a=6 then the set is exactly X and we are done. Now suppose a=1,2,3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233, 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<br />
The remaining case is when a=4,5. Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lies in the set. By symmetry we may assume that {2233, 3322} lies in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<math>\Box</math><br />
<br />
== n=5 ==<br />
<br />
:<math>c_5=150</math>:<br />
<br />
'''Lemma 3'''. Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.<br />
<br />
'''Proof'''. By rotation we may work with <math>D_5</math>. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math><br />
<br />
Equality can be attained by removing <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>. Thus <math>c_5 \geq 150</math>.<br />
<br />
Another pattern of 150 points is this: Take the 450 points<br />
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,<br />
then select the 150 whose final coordinate is 1. That gives<br />
this many points in each cube:<br />
<br />
17 18 17<br />
<br />
17 17 18<br />
<br />
12 17 17<br />
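<br />
The table above can be reproduced by direct enumeration. The sketch below is illustrative only and rests on two assumptions not stated in the text: the alphabet is {1,2,3}, and each "cube" is obtained by fixing the first two coordinates, printed with the second coordinate decreasing down the page.<br />

```python
from collections import Counter
from itertools import product

# Sorted digit-count patterns of the slices Gamma_{a,b,c} making up the 450-point set.
PATTERNS = {(1, 2, 3), (0, 2, 4)}


def in_450_set(point):
    """True if the digit counts of `point` are a permutation of (1,2,3) or (0,2,4)."""
    counts = tuple(sorted(point.count(d) for d in (1, 2, 3)))
    return counts in PATTERNS


# All points of [3]^6 lying in the chosen slices.
points6 = [p for p in product((1, 2, 3), repeat=6) if in_450_set(p)]

# Keep those whose final coordinate is 1 and drop that coordinate.
selected = [p[:5] for p in points6 if p[5] == 1]

# Group by the first two coordinates (assumed meaning of "cube").
cube_counts = Counter(p[:2] for p in selected)

for second in (3, 2, 1):
    print(" ".join(str(cube_counts[(first, second)]) for first in (1, 2, 3)))
```

Under these assumptions the nine counts sum to 150, with the smallest cube being the one whose first two coordinates are both 1.<br />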
<br />
'''Lemma 4'''. A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contains at least 51 points.<br />
<br />
'''Proof'''. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which forces the third slice to have at least <math>151-2c_4 = 47</math> points.<br />
<br />
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.<br />
<br />
Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.<br />
<br />
Let's suppose first that jk=10. The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.<br />
<br />
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.<br />
<br />
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. <br />
<br />
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math><br />
<br />
'''Corollary'''. <math>c_5 \leq 152</math><br />
<br />
'''Proof'''. By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math><br />
<br />
<br />
'''Lemma 5''' Any solution with 151 or more points has a slice with at most 49 points.<br />
<br />
'''Proof''' Suppose we have 151 points without a line, and each of three slices has at least 50 points.<br />
<br />
Using earlier notation, we split subsets of <math>[3]^4</math> into nine subsets of <math>[3]^2</math>. <br />
So we think of x,y,z,a,b and c as subsets of a square. Each slice is one of the following.<br />
*<math>D_4 = y'zx,zx'y,xyz</math> (with one or two points removed)<br />
*<math>D_{4,2} = z'xy,xyz,yzx'</math> (with one or two points removed)<br />
*<math>D_{4,1} = xyz,yz'x,zxy'</math> (with one or two points removed)<br />
*<math>X = xyz, ybw, zwc</math><br />
*<math>Y = axw, xyz, wzc</math><br />
*<math>Z = awx, wby, xyz</math><br />
<br />
where a, b and c have four points each.<br />
<br />
.. 32 33 31 .. 33 .. .. ..<br />
a = .. 22 23 b = .. .. .. c = 21 22 ..<br />
.. .. .. 11 .. 13 11 12 ..<br />
<br />
x', y' and z' are subsets of x, y and z respectively, and have five points each.<br />
<br />
Suppose all three slices are subsets of <math>D_{4,j}</math>. <br />
We can remove at most five points from the full set of three <math>D_{4,j}</math>. <br />
Consider columns 2,3,4,6,7,8. At most two of these columns contain xyz, so one point must be removed from the other four.<br />
This uses up all but one of the removals.<br />
So the slices must be <math>D_{4,2},D_{4,1},D_{4,0}</math> or a cyclic permutation of that.<br />
Then the cube, which contains the first square of slice 1; the fifth square of slice 2; <br />
and the ninth square of slice 3, contains three copies of the same square. <br />
It takes more than one point removed to remove all lines from that cube.<br />
So we can't have all three slices subsets of <math>D_{4,j}</math>.<br />
<br />
Suppose one slice is X,Y or Z, and two others are subsets of <math>D_{4,j}</math>. <br />
We can remove at most three points from the full <math>D_{4,j}</math>.<br />
By symmetry, suppose one slice is X. Consider columns 2,3,4 and 7. They must be cyclic permutations of x,y,z,<br />
and two of them are not xyz, so must lose a point. <br />
Columns 6 and 8 must both lose a point, and we only have 150 points left.<br />
So if one slice is X,Y or Z, the full set contains a line.<br />
<br />
Suppose two slices are from X,Y and Z, and the other is a subset of <math>D_{4,j}</math>. <br />
By symmetry, suppose two slices are X and Y. Columns 3,6,7 and 8 all contain w, and therefore at most 16 points each.<br />
Columns 1,5 and 9 contain a,b, or c, and therefore at most 16 points. <br />
So the total number of points is at most 7*16+2*18 = 148. This contradicts the assumption of 151 points.<br />
<math>\Box</math><br />
<br />
'''Corollary''' <math>c_5 \leq 151 </math><br />
<br />
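'''Proof''' By Lemma 4, at most one slice has more than 50 points, and by Lemma 2 that slice has at most 52 points; by Lemma 5, some slice has at most 49 points. So the maximum number of points is 52+50+49=151. <math>\Box</math><br />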
<br />
'''Lemma 5.1''' No solution with 151 points contains as a slice the X defined in Lemma 2<br />
<br />
'''Proof''' Suppose one row is X. Another row is <math>D_{4,j}</math>.<br />
<br />
Suppose X is in the first row. Label the other rows with letters from the alphabet.<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
def ghi jkl<br />
<br />
Reslice the array into a left nine, middle nine and right nine. One of these squares<br />
contains 52 points, and it can only be the left nine. One of its three columns contains<br />
18 points, and it can only be its left-hand column, xmd. So m=y and d=z. But none of the <math>D_{4,j}</math> begins with y or z, which is a contradiction. So X is not in the first row.<br />
<br />
So X is in the second or third row. By symmetry, suppose it is in the second row<br />
<br />
def ghi jkl<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
Again, the left-hand nine must contain 52 points, so it is <math>D_{4,2}</math>.<br />
So either the first row is <math>D_{4,2}</math> or the third row is <math>D_{4,0}</math>.<br />
If the first row is <math>D_{4,2}</math> then the only way to have 50 points in the middle or right-hand nine is if the middle nine is X<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz ybw zwc<br />
<br />
yzx' zwc stu<br />
<br />
In the seventh column, s contains 5 points and in the eighth column, t contains 4 points.<br />
The final row can now contain at most 48 points, and the whole array contains only 52+50+48 = 150 points.<br />
<br />
If the third row is <math>D_{4,0}</math>, then neither the middle nine nor the right-hand nine contains 50 points, by the classification of Lemma 4 and the formulas at the start of Lemma 5.<br />
Again, only 52+49+49 = 150 points are possible.<br />
<br />
A similar argument is possible if X is in the third row; or if X is replaced by Y or Z.<br />
<br />
So when a 151-point set is sliced into three, one slice is <math>D_{4,j}</math> and another slice is 50 points contained in <math>D_{4,k}</math>. <math>\Box</math><br />
<br />
'''Lemma 5.2''' There is no 151-point solution<br />
<br />
'''Proof''' Assume by symmetry that the first row contains 52 points and the second row contains 50.<br />
<br />
If <math>D_{4,1}</math> is in the first row, then the second row must be contained in <math>D_{4,0}</math>. <br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine can contain 52 points, which contradicts the corollary to Lemma 5.<br />
<br />
Suppose the first row contains <math>D_{4,0}</math>. Then the second row is contained in <math>D_{4,2}</math>, otherwise the cubes formed from the nine columns of the diagram would need to remove too many points.<br />
<br />
y'zx zx'y xyz<br />
<br />
z'xy xyz yzx'<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine contains 52 points.<br />
<br />
So the first row contains <math>D_{4,2}</math>, and the second row is contained in <math>D_{4,1}</math>. Two points may be removed from the second row of this diagram.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
def ghi jkl<br />
<br />
Slice it into the left nine, middle nine and right nine. Two of them are contained in <math>D_{4,j}</math><br />
so at least two of def, ghi, and jkl are contained in the corresponding slice of <math>D_{4,0}</math>.<br />
Slice along a different axis, and at least two of dgj,ehk,fil are contained in the corresponding slice of <br />
<math>D_{4,0}</math>. <br />
So eight of the nine squares in the bottom row are contained in the corresponding square of <math>D_{4,0}</math>.<br />
Indeed, slice along other axes, and all points except one are contained within <math>D_{4,0}</math>. <br />
This point is the intersection of all the 49-point slices. <br />
<br />
So, if there is a 151-point solution, then after removal of the specified point, <br />
there is a 150-point solution, within <math>D_{5,j}</math>, whose slices in each direction are 52+50+48.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
One point must be lost from columns 3, 6, 7 and 8, and four more from the major diagonal z'z'z. That leaves 148 points instead of 150.<br />
<br />
So the 150-point solution does not exist with 52+50+48 slices; so the 151 point solution does not exist.<math>\Box</math><br />
<br />
<br />
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.<br />
<br />
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extremisers, one point per line, with different extremisers separated by a line containing “—”.<br />
<br />
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by GNU’s glpsol linear programming solver, which also quickly proves that 150 is the optimum.<br />
<br />
Each variable corresponds to a point in the cube, numbered according to the lexicographic ordering of the points. If a variable is 1 then the point is in the set; if it is 0 then it is not in the set.<br />
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.<br />
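<br />
The shape of such a program can be sketched as follows. This is an illustration, not the format of the linked file: the helper names, the 1-based lexicographic index, and the textual form of each inequality are assumptions made here.<br />

```python
from itertools import product


def combinatorial_lines(n, k=3):
    """Yield each combinatorial line of [k]^n as a tuple of k points."""
    for template in product(list(range(1, k + 1)) + ["x"], repeat=n):
        if "x" not in template:
            continue  # no wildcard: this is a single point, not a line
        yield tuple(
            tuple(v if t == "x" else t for t in template)
            for v in range(1, k + 1)
        )


def lex_index(point, k=3):
    """1-based position of `point` in the lexicographic ordering of [k]^n (assumed numbering)."""
    idx = 0
    for digit in point:
        idx = idx * k + (digit - 1)
    return idx + 1


def line_constraints(n, k=3):
    """One inequality per line: the 0/1 variables on a line sum to at most k-1."""
    rows = []
    for line in combinatorial_lines(n, k):
        terms = " + ".join(f"x_{lex_index(p, k)}" for p in line)
        rows.append(f"{terms} <= {k - 1}")
    return rows


constraints = line_constraints(5)
print(len(constraints))
print(constraints[0])
```

Maximising the sum of all variables subject to these inequalities, with each variable restricted to 0 or 1, is then the integer program described above.<br />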
<br />
== n=6 ==<br />
<br />
:<math>c_6=450</math>:<br />
<br />
The upper bound follows since <math>c_6 \leq 3 c_5</math>. The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).<br />
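<br />
As a worked check of the lower bound, each pattern has six permutations and <math>|\Gamma_{a,b,c}| = \frac{6!}{a!\,b!\,c!}</math>, so<br />
<br />
:<math>6 \cdot \frac{6!}{0!\,2!\,4!} + 6 \cdot \frac{6!}{1!\,2!\,3!} = 6 \cdot 15 + 6 \cdot 60 = 450,</math><br />
<br />
which matches the upper bound <math>3 c_5 = 3 \cdot 150 = 450</math>.<br />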
<br />
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.<br />
<br />
== n=7 ==<br />
<br />
:<math>1302 \leq c_7 \leq 1348</math>:<br />
<br />
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>. Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.<br />
<br />
The lower bound can be formed by removing 016,106,052,502,151,511,160,610,070,700 from <math>D_7</math>; the removed slices contain 7+7+21+21+42+42+7+7+1+1=156 points, leaving 1458-156=1302.<br />
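<br />
A quick machine check of this construction is sketched below, working with triples (a,b,c) rather than individual strings. It assumes the removal list also includes <math>\Gamma_{0,7,0}</math> and <math>\Gamma_{7,0,0}</math>, which the count 1302 and line-freeness appear to require; the function names are illustrative.<br />

```python
from math import factorial


def gamma_size(a, b, c):
    """Number of strings with a 1s, b 2s and c 3s."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))


def has_triangle(B, n):
    """True if B contains (a+r,b,c), (a,b+r,c), (a,b,c+r) for some r > 0."""
    B = set(B)
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                if {(a + r, b, c), (a, b + r, c), (a, b, c + r)} <= B:
                    return True
    return False


n = 7
# D_7 as a union of slices: triples with a != b mod 3.
B7 = {(a, b, n - a - b) for a in range(n + 1) for b in range(n + 1 - a)
      if (a - b) % 3 != 0}

# Removal list (assumption: includes 070 and 700, see the lead-in).
removed = {(0, 1, 6), (1, 0, 6), (0, 5, 2), (5, 0, 2), (1, 5, 1),
           (5, 1, 1), (1, 6, 0), (6, 1, 0), (0, 7, 0), (7, 0, 0)}

kept = B7 - removed
size_D7 = sum(gamma_size(*t) for t in B7)
size_kept = sum(gamma_size(*t) for t in kept)
print(size_D7, size_kept, has_triangle(kept, n))
```

Since a union of slices is line-free exactly when its set of triples avoids such triangles, the triangle test is enough here and no enumeration of the <math>3^7</math> strings is needed.<br />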
<br />
'''Lemma 6''' Any line-free subset of <math>D_7</math> has at most 1302 points.<br />
<br />
'''Proof''' Start with the 1458 points of <math>D_7</math>. You must lose:<br />
<br />
* 42 points from (1,2,4),(1,5,1),(4,2,1)<br />
* 42 points from (2,1,4),(2,4,1),(5,1,1)<br />
* 21 points from (0,2,5),(0,5,2),(3,2,2)<br />
* 21 points from (2,0,5),(2,3,2),(5,0,2)<br />
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)<br />
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)<br />
<br />
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.<br />
<math>\Box</math><br />
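<br />
Summing the six losses listed in the proof gives the stated bound:<br />
<br />
:<math>1458 - (42+42+21+21+15+15) = 1458 - 156 = 1302.</math><br />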
<br />
== Larger n ==<br />
<br />
The following construction gives lower bounds for the number of triangle-free points.<br />
It yields of the order of <math>2.7 \sqrt{\log(N)/N}\,3^N</math> points for large N (N ~ 5000).<br />
<br />
It applies when N is a multiple of 3. <br />
* For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M. <br />
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth of the points for N=3M-2 as for N=3M.<br />
<br />
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:<br />
<br />
* <math>c_3</math> from (012) and permutations<br />
* <math>c_6</math> from (123,024) and perms<br />
* <math>c_9</math> from (234,135,045) and perms<br />
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)<br />
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)<br />
<br />
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.<br />
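<br />
The rows above and the "add 1" rule can be tabulated with a short script. This is a sketch: the dictionary simply transcribes the listed triples (with A=10, B=11), and the helper names are chosen here for illustration.<br />

```python
from itertools import permutations
from math import factorial

# Triples listed for c_3, c_6, c_9, c_12, c_15 (A=10, B=11 written out).
ROWS = {
    3: [(0, 1, 2)],
    6: [(1, 2, 3), (0, 2, 4)],
    9: [(2, 3, 4), (1, 3, 5), (0, 4, 5)],
    12: [(3, 4, 5), (2, 4, 6), (1, 5, 6), (0, 2, 10), (0, 5, 7)],
    15: [(4, 5, 6), (3, 5, 7), (2, 6, 7), (1, 3, 11), (1, 6, 8),
         (0, 4, 11), (0, 7, 8)],
}


def gamma_size(a, b, c):
    """Number of strings with a 1s, b 2s and c 3s."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))


def lower_bound(triples):
    """Total size of the slices for the given triples and all their permutations."""
    B = {p for t in triples for p in permutations(t)}
    return sum(gamma_size(*t) for t in B)


def shifted(triples):
    """Add 1 to every entry of every triple."""
    return {tuple(x + 1 for x in t) for t in triples}


for n in sorted(ROWS):
    print(n, lower_bound(ROWS[n]))
```

Each row contains the shift of the previous row, and every triple that is not such a shift has a zero entry, as the rule above describes.<br />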
<br />
A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)<br />
<br />
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:<br />
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1) <br />
* There are also eight strings of points, stretching to the edges of the (abc) triangle:<br />
** For N=6K = 3M<br />
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)<br />
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)<br />
<br />
<br />
An alternate construction:<br />
<br />
First define a sequence consisting of all non-negative integers which, in base 3, do not contain the digit 1. Add 1 to all multiples of 3 in this sequence (including 0). The resulting sequence does not contain a length-3 arithmetic progression.<br />
<br />
It starts 1,2,7,8,19,20,25,26,55, …<br />
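<br />
The first step can be sketched in a few lines. The code assumes that 0 counts as a starting value (it becomes the leading 1 after the shift); the function names are illustrative.<br />

```python
def has_no_digit_one(m):
    """True if the base-3 expansion of m uses only the digits 0 and 2."""
    while m:
        if m % 3 == 1:
            return False
        m //= 3
    return True


def shifted_sequence(limit):
    """Terms arising from 0 <= m < limit, adding 1 to the multiples of 3."""
    return [m + 1 if m % 3 == 0 else m
            for m in range(limit) if has_no_digit_one(m)]


def contains_3ap(values):
    """Brute-force test for a 3-term arithmetic progression a < b < 2b-a."""
    s = set(values)
    return any(2 * b - a in s for a in values for b in values if a < b)


seq = shifted_sequence(56)
print(seq)
print(contains_3ap(shifted_sequence(3 ** 7)))
```

The progression test is quadratic in the number of terms, which is fine for checking the first few hundred terms.<br />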
<br />
Second, list all the (abc) triples for which the larger two differ by a number<br />
from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1)<br />
<br />
== Asymptotics ==<br />
<br />
DHJ(3) is equivalent to the upper bound<br />
<br />
:<math>c_n = o(3^n).</math><br />
<br />
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free. From this and the Behrend construction it appears that we have the lower bound<br />
<br />
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}.</math><br />
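<br />
The reason the union is line-free is that a triangle (a+r,b,c), (a,b+r,c), (a,b,c+r) has a+2b values forming the progression a+2b, a+2b+r, a+2b+2r. A small check of this is sketched below; it uses a greedy progression-free set rather than Behrend's construction, and takes S inside {0,...,3n}, both of which are choices made here for illustration.<br />

```python
def greedy_ap_free(limit):
    """Greedy subset of {0,...,limit} with no 3-term arithmetic progression."""
    chosen = []
    for v in range(limit + 1):
        s = set(chosen)
        # v would be the largest term of a progression a < b < v with a = 2b - v.
        if not any((2 * b - v) in s for b in chosen):
            chosen.append(v)
    return chosen


def triples_from(S, n):
    """Triples (a,b,c) with a+b+c = n and a+2b in S."""
    S = set(S)
    return {(a, b, n - a - b) for a in range(n + 1) for b in range(n + 1 - a)
            if a + 2 * b in S}


def has_triangle(B, n):
    """True if B contains (a+r,b,c), (a,b+r,c), (a,b,c+r) for some r > 0."""
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                if {(a + r, b, c), (a, b + r, c), (a, b, c + r)} <= B:
                    return True
    return False


for n in range(1, 10):
    B = triples_from(greedy_ap_free(3 * n), n)
    print(n, len(B), has_triangle(B, n))
```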
<br />
More precisely, we have<br />
<br />
:<math>c_n > C 3^{n - 4\sqrt{\log 2}\sqrt{\log n}+\frac 12 \log \log n}</math><br />
for some absolute constant C, and where all logarithms are base-3.<br />
<br />
'''Proof''' For convenience, let n be a multiple of 3. Elkin’s bound gives <math>r_3(\sqrt{n}) > C \sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n})</math>. Let <math>R</math> be a subset of <math>(-3\sqrt{n}/2,3\sqrt{n}/2)</math> without 3-term APs, of size <math>r_3(\sqrt{n})</math>, and with all elements being integer multiples of 3 (again as a matter of convenience). For each <math>r,s\in R</math>, let <math>a = (n-r-s)/3</math>, and let the set <math>A</math> be the union of all the resulting <math>\Gamma_{a,a+r,a+s}</math>. Since all of <math>a, a+r,a+s</math> are between <math>n/3-2\sqrt{n}</math> and <math>n/3+2\sqrt{n}</math>, the size of <math>\Gamma_{a,a+r,a+s}</math> is at least <math>C 3^n / n</math>. Since there are <math>r_3(\sqrt{n})^2</math> choices for r and s, we have a set with size at least<br />
<br />
:<math>C (\sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n}))^2 3^n / n</math>.<br />
<br />
This simplifies to <math>C \sqrt{\log n} \exp_3(n-\alpha \sqrt{\log_3(n)})</math>, where <math>\alpha=4 \sqrt{\log_3(2)}</math>.<br />
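<br />
In more detail, squaring the bracket and cancelling the factor of n gives<br />
<br />
:<math>\left(\sqrt{n}\,(\log n)^{1/4}\, 2^{-2\sqrt{\log_2 n}}\right)^2 \frac{3^n}{n} = \sqrt{\log n}\; 2^{-4\sqrt{\log_2 n}}\; 3^n,</math><br />
<br />
and changing base via <math>2 = 3^{\log_3 2}</math> and <math>\log_2 n = \log_3 n / \log_3 2</math> gives<br />
<br />
:<math>2^{-4\sqrt{\log_2 n}} = 3^{-4 \log_3(2) \sqrt{\log_3(n)/\log_3(2)}} = 3^{-4\sqrt{\log_3 2}\,\sqrt{\log_3 n}}.</math><br />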
<br />
Now suppose that <math>x_i\in \Gamma_{a_i,a_i+r_i,a_i+s_i}</math> is a combinatorial line in the set A. Then <math>(a_i+s_i)-(a_i)=s_i</math> is a 3-term AP contained in R, so the <math>s_i</math> are all the same. Similarly, all of the <math>r_i</math> are the same, and therefore all of the <math>a_i</math> are the same, too. But this implies that the <math>x_i</math> sequence is constant, which means the line is degenerate. <math>\Box</math><br />
<br />
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above gives a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.<br />
<br />
The second large n construction had numerical asymptotics for <math>\log(c_n/3^n)</math> close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.<br />
<br />
== Other k values ==<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_n</math> values for different values of dimension n and edgelength k. Several of these values reach the upper bound of <math>(k-1)k^{n-1}</math>.<br />
<br />
{|<br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
== Numerical methods ==<br />
<br />
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here]. The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Upper_and_lower_boundsUpper and lower bounds2009-03-23T20:17:40Z<p>121.220.134.232: </p>
<hr />
<div><center>'''Upper and lower bounds for <math>c_n</math> for small values of n.'''</center><br />
<br />
<math>c_n</math> is the size of the largest subset of <math>[3]^n</math> that does not contain a combinatorial line (OEIS [http://www.research.att.com/~njas/sequences/A156762 A156762]). A spreadsheet for all the latest bounds on <math>c_n</math> [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg can be found here]. In this page we record the proofs justifying these bounds.<br />
<br />
<br />
{|<br />
| n || 0 || 1 || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| <math>c_n</math> || 1 || 2 || 6 || 18 || 52 || 150 || 450 || [1302,1348]<br />
|}<br />
<br />
== Basic constructions ==<br />
<br />
For all <math>n \geq 1</math>, a basic example of a mostly line-free set is<br />
<br />
:<math>D_n := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq 0 \ \operatorname{mod}\ 3 \}</math>. (1)<br />
<br />
This has cardinality <math>|D_n| = 2 \times 3^{n-1}</math>. The only lines in <math>D_n</math> are those with<br />
<br />
# A number of wildcards equal to a multiple of three;<br />
# The number of 1s unequal to the number of 2s modulo 3.<br />
<br />
One way to construct line-free sets is to start with <math>D_n</math> and remove some additional points. We also have the variants <math>D_{n,0}=D_n, D_{n,1}, D_{n,2}</math> defined as<br />
<br />
:<math>D_{n,j} := \{ (x_1,\ldots,x_n) \in [3]^n: \sum_{i=1}^n x_i \neq j \ \operatorname{mod}\ 3 \}</math>. (1')<br />
<br />
When n is not a multiple of 3, then <math>D_{n,0}, D_{n,1}, D_{n,2}</math> are all cyclic permutations of each other; but when n is a multiple of 3, then <math>D_{n,0}</math> plays a special role (though <math>D_{n,1}, D_{n,2}</math> are still interchangeable).<br />
<br />
Another useful construction proceeds by using the slices <math>\Gamma_{a,b,c} \subset [3]^n</math> for <math>(a,b,c)</math> in the triangular grid<br />
<br />
:<math>\Delta_n := \{ (a,b,c) \in {\Bbb Z}_+^3: a+b+c = n \},</math> (2)<br />
<br />
where <math>\Gamma_{a,b,c}</math> is defined as the strings in <math>[3]^n</math> with <math>a</math> 1s, <math>b</math> 2s, and <math>c</math> 3s. Note that<br />
<br />
:<math>|\Gamma_{a,b,c}| = \frac{n!}{a! b! c!}.</math> (3)<br />
<br />
Given any set <math>B \subset \Delta_n</math> that avoids equilateral triangles <math> (a+r,b,c), (a,b+r,c), (a,b,c+r)</math>, the set<br />
<br />
:<math>\Gamma_B := \bigcup_{(a,b,c) \in B} \Gamma_{a,b,c}</math> (4)<br />
<br />
is line-free and has cardinality<br />
<br />
:<math>|\Gamma_B| = \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!},</math> (5)<br />
<br />
and thus provides a lower bound for <math>c_n</math>:<br />
<br />
:<math>c_n \geq \sum_{(a,b,c) \in B} \frac{n!}{a! b! c!}.</math> (6)<br />
<br />
All lower bounds on <math>c_n</math> have proceeded so far by choosing a good set of B and applying (6). Note that <math>D_n</math> is the same as <math>\Gamma_{B_n}</math>, where <math>B_n</math> consists of those triples <math>(a,b,c) \in \Delta_n</math> in which <math>a \neq b\ \operatorname{mod}\ 3</math>.<br />
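<br />
The identification of <math>D_n</math> with <math>\Gamma_{B_n}</math> and the cardinality <math>2 \times 3^{n-1}</math> can be confirmed by brute force for small n. The sketch below assumes the alphabet {1,2,3}; the function names are illustrative.<br />

```python
from itertools import product


def D(n, j=0):
    """Points of [3]^n whose coordinate sum is not congruent to j mod 3."""
    return {p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != j}


def gamma_union_Bn(n):
    """Union of the slices Gamma_{a,b,c} with a != b mod 3."""
    return {p for p in product((1, 2, 3), repeat=n)
            if (p.count(1) - p.count(2)) % 3 != 0}


for n in range(1, 7):
    print(n, len(D(n)), D(n) == gamma_union_Bn(n))
```

The agreement reflects the congruence a + 2b + 3c = a - b mod 3 for a string with a 1s, b 2s and c 3s.<br />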
<br />
Note that if one takes a line-free set and permutes the alphabet <math>\{1,2,3\}</math> in any fashion (e.g. replacing all 1s by 2s and vice versa), one also gets a line-free set. This potentially gives six examples from any given starting example of a line-free set, though in practice there is enough symmetry that the total number of examples produced this way is less than six. (These six examples also correspond to the six symmetries of the triangular grid <math>\Delta_n</math> formed by rotation and reflection.)<br />
<br />
Another symmetry comes from permuting the <math>n</math> indices in the strings of <math>[3]^n</math> (e.g. replacing every string by its reversal). But the sets <math>\Gamma_B</math> are automatically invariant under such permutations and thus do not produce new line-free sets via this symmetry.<br />
<br />
== The basic upper bound ==<br />
<br />
Because <math>[3]^{n+1}</math> can be expressed as the union of three copies of <math>[3]^n</math>, we have the basic upper bound<br />
<br />
:<math>c_{n+1} \leq 3 c_n.</math> (7)<br />
<br />
Note that equality only occurs if one can find an <math>n+1</math>-dimensional line-free set such that every n-dimensional slice has the maximum possible cardinality of <math>c_n</math>.<br />
<br />
== n=0 ==<br />
<br />
:<math>c_0=1</math>:<br />
<br />
This is clear.<br />
<br />
== n=1 ==<br />
<br />
:<math>c_1=2</math>:<br />
<br />
The three sets <math>D_1 = \{1,2\}</math>, <math>D_{1,1} = \{2,3\}</math>, and <math>D_{1,2} = \{1,3\}</math> are the only two-element sets which are line-free in <math>[3]^1</math>, and there are no three-element sets.<br />
<br />
== n=2 ==<br />
<br />
:<math>c_2=6</math>:<br />
<br />
There are four six-element sets in <math>[3]^2</math> which are line-free, which we denote <math>x = D_{2,2}</math>, <math>y=D_{2,1}</math>, <math>z=D_2</math>, and <math>w</math> and are displayed graphically as follows.<br />
<br />
13 .. 33 .. 23 33 13 23 .. 13 23 ..<br />
x = 12 22 .. y = 12 .. 32 z = .. 22 32 w = 12 .. 32<br />
.. 21 31 11 21 .. 11 .. 31 .. 21 31<br />
<br />
Combining this with the basic upper bound (7) we see that <math>c_2=6</math>.<br />
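<br />
For n=2 the claim is small enough to confirm exhaustively. The following sketch enumerates the seven combinatorial lines of <math>[3]^2</math> and all 6-element and 7-element subsets; the names are illustrative.<br />

```python
from itertools import combinations, product

POINTS = list(product((1, 2, 3), repeat=2))

# Every template over {1,2,3,x} with at least one wildcard x gives one line.
LINES = [
    [tuple(v if t == "x" else t for t in template) for v in (1, 2, 3)]
    for template in product((1, 2, 3, "x"), repeat=2)
    if "x" in template
]


def is_line_free(subset):
    """True if no combinatorial line lies entirely inside `subset`."""
    s = set(subset)
    return not any(all(p in s for p in line) for line in LINES)


six_sets = [set(c) for c in combinations(POINTS, 6) if is_line_free(c)]
any_seven = any(is_line_free(c) for c in combinations(POINTS, 7))

print(len(LINES), len(six_sets), any_seven)
for s in six_sets:
    print(sorted(set(POINTS) - s))  # the three omitted points
```

Each solution omits one point from every row and column and at least one point of the diagonal 11, 22, 33.<br />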
<br />
== n=3 ==<br />
<br />
:<math>c_3=18</math>:<br />
<br />
We describe a subset <math>A</math> of <math>[3]^3</math> as a string <math>abc</math>, where <math>a, b, c \subset [3]^2</math> correspond to strings of the form <math>1**</math>, <math>2**</math>, <math>3**</math> in <math>[3]^3</math> respectively. Thus for instance <math>D_3 = xyz</math>, and so from (7) we have <math>c_3=18</math>.<br />
<br />
'''Lemma 1.'''<br />
* The only 18-element line-free subset of <math>[3]^3</math> is <math>D_3 = xyz</math>.<br />
* The only 17-element line-free subsets of <math>[3]^3</math> are formed by removing a point from <math>D_3=xyz</math>, or by removing either 111, 222, or 333 from <math>D_{3,2} = yzx</math> or <math>D_{3,1}=zxy</math>.<br />
<br />
'''Proof'''. We prove the second claim. As <math>17=6+6+5</math>, and <math>c_2=6</math>, at least two of the slices of a 17-element line-free set must be from x, y, z, w, with the third slice having 5 points. If two of the slices are identical, the last slice can have only 3 points, a contradiction. If one of the slices is a w, then the 5-point slice will contain a diagonal, contradiction. By symmetry we may now assume that two of the slices are x and y, which force the last slice to be z with one point removed. Now one sees that the slices must be in the order xyz, yzx, or zxy, because any other combination has too many lines that need to be removed. The sets yzx, zxy contain the diagonal {111,222,333} and so one additional point needs to be removed. <br />
<br />
The first claim follows by a similar argument to the second.<br />
<math>\Box</math><br />
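<br />
The sets appearing in Lemma 1 can be checked mechanically. The sketch below assumes that <math>D_{n,j}</math> denotes the points of <math>[3]^n</math> whose coordinate sum is not congruent to j mod 3 (with <math>D_n = D_{n,0}</math>); this matches the pictures of x, y, z above, but the definition is not restated in this section, so treat it as an assumption. It only verifies that the listed sets are line-free, not that they are the only ones.<br />

```python
from itertools import product

def combinatorial_lines(n, k=3):
    """Yield every combinatorial line of [k]^n as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["*"]
    for template in product(symbols, repeat=n):
        if "*" in template:
            yield tuple(
                tuple(v if t == "*" else t for t in template)
                for v in range(1, k + 1)
            )

def is_line_free(points, n):
    pts = set(points)
    return not any(all(p in pts for p in line) for line in combinatorial_lines(n))

def D(n, j=0):
    """Assumed definition: points of [3]^n with coordinate sum not congruent to j mod 3."""
    return {p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != j % 3}

diagonal = [(v, v, v) for v in (1, 2, 3)]
print(len(D(3)), is_line_free(D(3), 3))
for j in (1, 2):
    # D_{3,j} contains the diagonal, so one diagonal point has to go.
    print(j, [is_line_free(D(3, j) - {d}, 3) for d in diagonal])
```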
<br />
== n=4 ==<br />
<br />
:<math>c_4=52</math>:<br />
<br />
Indeed, divide a line-free set in <math>[3]^4</math> into three blocks <math>1***, 2***, 3***</math> of <math>[3]^3</math>. If two of them are of size 18, then they must both be xyz, and the third block can have at most 6 elements, leading to an inferior bound of 42. So the best one can do is <math>18+17+17=52</math> which can be attained by deleting the diagonal {1111,2222,3333} from <math>D_{4,1} = xyz\ yzx\ zxy</math>, <math>D_4 = yzx\ zxy\ xyz</math>, or <math>D_{4,2} = zxy\ xyz\ yzx</math>. In fact,<br />
<br />
'''Lemma 2.'''<br />
<br />
* The only 52-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal {1111,2222,3333} from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 51-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and one further point from <math>D_{4,j}</math> for some j=0,1,2.<br />
* The only 50-element line-free sets in <math>[3]^4</math> are formed by removing the diagonal and two further points from <math>D_{4,j}</math> for some j=0,1,2, or are equal to one of the three permutations of the set <math>X := \Gamma_{3,1,0} \cup \Gamma_{3,0,1} \cup \Gamma_{2,2,0} \cup \Gamma_{2,0,2} \cup \Gamma_{1,1,2} \cup \Gamma_{1,2,1} \cup \Gamma_{0,2,2}</math>.<br />
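<br />
The existence half of Lemma 2 is easy to confirm by computer. The sketch below assumes that <math>D_{4,j}</math> is the set of points of <math>[3]^4</math> whose coordinate sum is not congruent to j mod 3, and that <math>\Gamma_{a,b,c}</math> is the set of strings with a 1s, b 2s and c 3s; both readings are assumptions, as neither is defined in this section. It checks sizes and line-freeness only, not uniqueness.<br />

```python
from itertools import product

def combinatorial_lines(n, k=3):
    """Yield every combinatorial line of [k]^n as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["*"]
    for template in product(symbols, repeat=n):
        if "*" in template:
            yield tuple(
                tuple(v if t == "*" else t for t in template)
                for v in range(1, k + 1)
            )

def is_line_free(points, n):
    pts = set(points)
    return not any(all(p in pts for p in line) for line in combinatorial_lines(n))

def D(n, j=0):
    """Assumed definition: coordinate sum not congruent to j mod 3."""
    return {p for p in product((1, 2, 3), repeat=n) if sum(p) % 3 != j % 3}

def gamma(a, b, c):
    """Assumed definition: strings with a 1s, b 2s and c 3s."""
    return {p for p in product((1, 2, 3), repeat=a + b + c)
            if (p.count(1), p.count(2), p.count(3)) == (a, b, c)}

diagonal = {(v,) * 4 for v in (1, 2, 3)}
fifty_two = {j: D(4, j) - diagonal for j in (0, 1, 2)}

X = set().union(*(gamma(*t) for t in
                  [(3, 1, 0), (3, 0, 1), (2, 2, 0), (2, 0, 2),
                   (1, 1, 2), (1, 2, 1), (0, 2, 2)]))

print({j: len(s) for j, s in fifty_two.items()}, len(X))
```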
<br />
'''Proof''' It suffices to prove the third claim. In fact it suffices to show that every 50-point line-free set is either contained in the 54-point set <math>D_{4,j}</math> for some j=0,1,2, or is some permutation of the set X. Indeed, if a 50-point line-free set is contained in, say, <math>D_4</math>, then it cannot contain 2222, since otherwise it must omit one point from each of the four pairs formed from {2333, 2111} by permuting the indices, and must also omit one of {1111, 1222, 1333}, leading to at most 49 points in all; similarly, it cannot contain 1111, and so omits the entire diagonal {1111,2222,3333}, with two more points to be omitted. Similarly when <math>D_4</math> is replaced by one of the other <math>D_{4,j}</math>.<br />
<br />
Next, observe that every three-dimensional slice of a line-free set can have at most <math>c_3=18</math> points; thus when one partitions a 50-point line-free set into three such slices, it must divide either as 18+16+16, 17+17+16, or some permutation of these.<br />
<br />
Suppose that we can slice the set into two slices of 17 points and one slice of 16 points. By the various symmetries, we may assume that the 1*** slice and 2*** slices have 17 points, and the 3*** slice has 16 points. By Lemma 1, the 1-slice is <math>\{1\} \times D_{3,j}</math> with one point removed, and the 2-slice is <math>\{2\} \times D_{3,k}</math> with one point removed, for some <math>j,k \in \{0,1,2\}</math>.<br />
<br />
If j=k, then the 1-slice and 2-slice have at least 15 points in common, so the 3-slice can have at most <math>27-15=12</math> points, a contradiction. If jk = 01, 12, or 20, then observe that from Lemma 1 the *1**, *2**, *3** slices cannot equal a 17-point or 18-point line-free set, so each have at most 16 points, leading to only 48 points in all, a contradiction. Thus we must have jk = 10, 21, or 02.<br />
<br />
Let's first suppose that jk=02. Then by Lemma 1, the 2*** slice contains the nine points formed from {2211, 2322, 2331} and permuting the last three indices, while the 1*** slice contains at least eight of the nine points formed from {1211, 1322, 1311} and permuting the last three indices. Thus the 3*** slice can contain at most one of the nine points formed from {3211, 3322, 3311} and permuting the last three indices. If it does contain one of these points, say 3211, then it must omit one point from each of the four pairs {3222, 3233}, {3212, 3213}, {3221, 3231}, {3111, 3311}, leading to at most 15 points on this slice, a contradiction. So the 3*** slice must omit all nine points, and is therefore contained in <math>\{3\} \times D_{4,1}</math>, and so the 50-point set is contained in <math>D_{4,1}</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
The case jk=10 is similar to the jk=02 case (indeed one can get from one case to the other by swapping the 1 and 2 indices). Now suppose instead that jk=12. Then by Lemma 1, the 1*** slice contains the six points from permuting the last three indices of 1123, and similarly the 2*** slice contains the six points from permuting the last three indices of 2123. Thus the 3*** slice must avoid all six points formed by permuting the last three indices of 3123. Similarly, as 1133 lies in the 1*** slice and 2233 lies in the 2*** slice, 3333 must be avoided in the 3*** slice.<br />
<br />
Now we claim that 3111 must be avoided also; for if 3111 was in the set, then one point from each of the six pairs formed from {3311, 3211}, {3331, 3221} and permuting the last three indices must lie outside the 3*** slice, which reduces the size of that slice to at most <math>27-6-1-6=14</math>, which is too small. Similarly, 3222 must be avoided, which puts the 3*** slice inside <math>\{3\} \times D_3</math> and then places the 50-point set inside <math>D_4</math>, and we are done by the discussion at the beginning of the proof.<br />
<br />
We have handled the case in which at least one of the slicings of the 50-point set is of the form 50=17+17+16. The only remaining case is when all slicings of the 50-point set are of the form 18+16+16 (or a permutation thereof). By the symmetries of the situation, we may assume that the 1*** slice has 18 points, and thus by Lemma 1 takes the form <math>\{1\} \times D_3</math>. Inspecting the *1**, *2**, *3** slices, we then see (from Lemma 1) that only the *1** slice can have 18 points; since we are assuming that this slicing is some permutation of 50=18+16+16, we conclude that the *1** slice must have exactly 18 points, and is thus described precisely by Lemma 1. Similarly for the **1* and ***1 slices. Indeed, by Lemma 1, we see that the 50-point set must agree exactly with <math>D_{4,1}</math> on any of these slices. In particular, on the remaining portion <math>\{2,3\}^4</math> of the cube, there are exactly 6 points of the 50-point set in <math>\{2,3\}^4</math>.<br />
<br />
Suppose that 3333 was in the set; then since all permutations of 3311, 3331 are known to lie in the set, 3322, 3332 must lie outside the set. Also, as 1222 lies in the set, at least one of 2222, 3222 lies outside the set. This leaves only 5 points in <math>\{2,3\}^4</math>, a contradiction. Thus 3333 lies outside the set; similarly 2222 lies outside the set.<br />
<br />
Let a be the number of points in the 50-point set which are some permutation of 2233, thus <math>0 \leq a \leq 6</math>. If a=0 then the set lies in <math>D_{4,1}</math> and we are done. If a=6 then the set is exactly X and we are done. Now suppose a=1,2,3. By symmetry we may assume that 2233 lies in the set. Then (since 2133, 1233, 2231, 2213 are known to lie in the set) 2333, 3233, 2223, 2232 lie outside the set, which leaves at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<br />
The remaining case is when a=4,5. Then one of the three pairs {2233, 3322}, {2323, 3232}, {2332, 3223} lies in the set. By symmetry we may assume that {2233, 3322} lies in the set. Then by arguing as before we see that all eight points formed by permuting 2333 or 3222 lie outside the set, leading to at most 5 points inside <math>\{2,3\}^4</math>, a contradiction.<br />
<math>\Box</math><br />
<br />
== n=5 ==<br />
<br />
:<math>c_5=150</math>:<br />
<br />
'''Lemma 3'''. Any line-free subset of <math>D_{5,j}</math> can have at most 150 points.<br />
<br />
'''Proof'''. By rotation we may work with <math>D_5</math>. This set has 162 points. By looking at the triplets {10000, 11110, 12220} and cyclic permutations we must lose 5 points; similarly from the triplets {20000,22220, 21110} and cyclic permutations. Finally from {11000,11111,11222} and {22000,22222,22111} we lose two more points. <math>\Box</math><br />
<br />
Equality can be attained by removing <math>\Gamma_{0,4,1}, \Gamma_{0,5,0}, \Gamma_{4,0,1}, \Gamma_{5,0,0}</math> from <math>D_5</math>. Thus <math>c_5 \geq 150</math>.<br />
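<br />
The 150-point example can be confirmed directly. This sketch assumes that <math>D_5</math> is the set of points of <math>[3]^5</math> with coordinate sum not divisible by 3, and that <math>\Gamma_{a,b,c}</math> counts 1s, 2s and 3s in that order; both are assumptions about notation defined elsewhere.<br />

```python
from itertools import product

def combinatorial_lines(n, k=3):
    """Yield every combinatorial line of [k]^n as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["*"]
    for template in product(symbols, repeat=n):
        if "*" in template:
            yield tuple(
                tuple(v if t == "*" else t for t in template)
                for v in range(1, k + 1)
            )

def is_line_free(points, n):
    pts = set(points)
    return not any(all(p in pts for p in line) for line in combinatorial_lines(n))

def counts(p):
    """(number of 1s, number of 2s, number of 3s) of a point."""
    return (p.count(1), p.count(2), p.count(3))

# Assumed definition of D_5: coordinate sum not divisible by 3.
D5 = {p for p in product((1, 2, 3), repeat=5) if sum(p) % 3 != 0}
removed = {(0, 4, 1), (0, 5, 0), (4, 0, 1), (5, 0, 0)}
candidate = {p for p in D5 if counts(p) not in removed}
print(len(D5), len(candidate), is_line_free(candidate, 5))
```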
<br />
Another pattern of 150 points is this: Take the 450 points<br />
in <math>{}[3]^6</math> which are (1,2,3), (0,2,4) and permutations,<br />
then select the 150 whose final coordinate is 1. That gives<br />
this many points in each cube:<br />
<br />
17 18 17<br />
<br />
17 17 18<br />
<br />
12 17 17<br />
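<br />
The second 150-point pattern can be reproduced as follows. The sketch reads (1,2,3) and (0,2,4) as the numbers of 1s, 2s and 3s in a string, and indexes the nine cubes by the first two coordinates; the second choice is an assumption, so only the multiset of cube sizes is compared with the grid above, not the layout.<br />

```python
from collections import Counter
from itertools import permutations, product

def combinatorial_lines(n, k=3):
    """Yield every combinatorial line of [k]^n as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["*"]
    for template in product(symbols, repeat=n):
        if "*" in template:
            yield tuple(
                tuple(v if t == "*" else t for t in template)
                for v in range(1, k + 1)
            )

def is_line_free(points, n):
    pts = set(points)
    return not any(all(p in pts for p in line) for line in combinatorial_lines(n))

# Allowed (number of 1s, 2s, 3s) in the 450-point subset of [3]^6.
allowed = set(permutations((1, 2, 3))) | set(permutations((0, 2, 4)))
full = {p for p in product((1, 2, 3), repeat=6)
        if (p.count(1), p.count(2), p.count(3)) in allowed}

# Keep the points whose final coordinate is 1, then drop that coordinate.
pattern = {p[:-1] for p in full if p[-1] == 1}

# Points in each of the nine [3]^3 cubes, indexed by the first two coordinates.
cube_counts = Counter(p[:2] for p in pattern)
print(len(full), len(pattern), sorted(cube_counts.values()))
```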
<br />
'''Lemma 4'''. A line-free subset of <math>[3]^5</math> with over 150 points cannot have two parallel <math>[3]^4</math> slices, each of which contain at least 51 points.<br />
<br />
'''Proof'''. Suppose not. By symmetry, we may assume that the 1**** and 2**** slices have at least 51 points, and that the whole set has at least 151 points, which forces the third slice to have at least <math>151-2c_4 = 47</math> points.<br />
<br />
By Lemma 2, the 1**** slice takes the form <math>\{1\} \times D_{4,j}</math> for some <math>j=0,1,2</math> with the diagonal {11111,12222,13333} and possibly one more point removed, and similarly the 2**** slice takes the form <math>\{2\} \times D_{4,k}</math> for some <math>k=0,1,2</math> with the diagonal {21111,22222,23333} and possibly one more point removed.<br />
<br />
Suppose first that j=k. Then the 1-slice and 2-slice have at least 50 points in common, leaving at most 31 points for the 3-slice, a contradiction. Next, suppose that jk=01. Then observe that the *i*** slice cannot look like any of the configurations in Lemma 2 and so must have at most 50 points for i=1,2,3, leading to 150 points in all, a contradiction. Similarly if jk=12 or 20. Thus we must have jk equal to 10, 21, or 02.<br />
<br />
Let's suppose first that jk=10. The first slice then is equal to <math>\{1\} \times D_{4,1}</math> with the diagonal and possibly one more point removed, while the second slice is equal to <math>\{2\} \times D_{4,0}</math> with the diagonal and possibly one more point removed. Superimposing these slices, we thus see that the third slice is contained in <math>\{3\} \times D_{4,2}</math> except possibly for two additional points, together with the one point 32222 of the diagonal that lies outside of <math>\{3\} \times D_{4,2}</math>.<br />
<br />
The lines x12xx, x13xx (plus permutations of the last four digits) must each contain one point outside the set. The first two slices can only absorb two of these, and so at least 14 of the 16 points formed by permuting the last four digits of 31233, 31333 must lie outside the set. These points all lie in <math>\{3\} \times D_{4,2}</math>, and so the 3**** slice can have at most <math>|D_{4,2}|-14+3=43</math> points, a contradiction.<br />
<br />
The case jk=02 is similar to the case jk=10 (indeed one can obtain one from the other by swapping 1 and 2). Now we turn to the case jk=21. Arguing as before we see that the third slice is contained in <math>\{3\} \times D_4</math> except possibly for two points, together with 33333. <br />
<br />
If 33333 was in the set, then each of the lines xx333, xxx33 (and permutations of the last four digits) must have a point missing from the first two slices, which cannot be absorbed by the two points we are permitted to remove; thus 33333 is not in the set. For similar reasons, 33331 is not in the set, as can be seen by looking at xxx31 and permutations of the last four digits. Indeed, any string containing four threes does not lie in the set; this means that at least 8 points are missing from <math>\{3\} \times D_4</math>, leaving only at most 46 points inside that set. Furthermore, any point in the 3**** slice outside of <math>\{3\} \times D_4</math> can only be created by removing a point from the first two slices, so the total cardinality is at most <math>46+52+52 = 150</math>, a contradiction.<math>\Box</math><br />
<br />
'''Corollary'''. <math>c_5 \leq 152</math><br />
<br />
'''Proof'''. By Lemma 4 and the bound <math>c_4=52</math>, any line-free set with over 150 points can have one slice of cardinality 52, but then the other two slices can have at most 50 points. <math>\Box</math><br />
<br />
<br />
'''Lemma 5''' Any solution with 151 or more points has a slice with at most 49 points.<br />
<br />
'''Proof''' Suppose we have 151 points without a line, and each of three slices has at least 50 points.<br />
<br />
Using earlier notation, we split subsets of <math>[3]^4</math> into nine subsets of <math>[3]^2</math>. <br />
So we think of x,y,z,a,b and c as subsets of a square. Each slice is one of the following.<br />
*<math>D_4 = y'zx,zx'y,xyz</math> (with one or two points removed)<br />
*<math>D_{4,2} = z'xy,xyz,yzx'</math> (with one or two points removed)<br />
*<math>D_{4,1} = xyz,yz'x,zxy'</math> (with one or two points removed)<br />
*<math>X = xyz, ybw, zwc</math><br />
*<math>Y = axw, xyz, wzc</math><br />
*<math>Z = awx, wby, xyz</math><br />
<br />
where a, b and c have four points each.<br />
<br />
      .. 32 33       31 .. 33       .. .. ..
  a = .. 22 23   b = .. .. ..   c = 21 22 ..
      .. .. ..       11 .. 13       11 12 ..
<br />
x', y' and z' are subsets of x, y and z respectively, and have five points each.<br />
<br />
Suppose all three slices are subsets of <math>D_{4,j}</math>. <br />
We can remove at most five points from the full set of three <math>D_{4,j}</math>. <br />
Consider columns 2,3,4,6,7,8. At most two of these columns contain xyz, so one point must be removed from the other four.<br />
This uses up all but one of the removals.<br />
So the slices must be <math>D_{4,2},D_{4,1},D_{4,0}</math> or a cyclic permutation of that.<br />
Then the cube, which contains the first square of slice 1; the fifth square of slice 2; <br />
and the ninth square of slice 3, contains three copies of the same square. <br />
It takes more than one point removed to remove all lines from that cube.<br />
So we can't have all three slices subsets of <math>D_{4,j}</math>.<br />
<br />
Suppose one slice is X,Y or Z, and two others are subsets of <math>D_{4,j}</math>. <br />
We can remove at most three points from the full <math>D_{4,j}</math>.<br />
By symmetry, suppose one slice is X. Consider columns 2,3,4 and 7. They must be cyclic permutations of x,y,z,<br />
and two of them are not xyz, so must lose a point. <br />
Columns 6 and 8 must both lose a point, and we only have 150 points left.<br />
So if one slice is X,Y or Z, the full set contains a line.<br />
<br />
Suppose two slices are from X,Y and Z, and the other is a subset of <math>D_{4,j}</math>. <br />
By symmetry, suppose two slices are X and Y. Columns 3,6,7 and 8 all contain w, and therefore at most 16 points each.<br />
Columns 1,5 and 9 contain a,b, or c, and therefore at most 16 points. <br />
So the total number of points is at most 7*16+2*18 = 148. This contradicts the assumption of 151 points.<br />
<math>\Box</math><br />
<br />
'''Corollary''' <math>c_5 \leq 151 </math><br />
<br />
'''Proof''' By Lemmas 2 and 4, the maximum number of points is 52+50+49=151. <math>\Box</math><br />
<br />
'''Lemma 5.1''' No solution with 151 points contains as a slice the X defined in Lemma 2<br />
<br />
'''Proof''' Suppose one row is X. Another row is <math>D_{4,j}</math>.<br />
<br />
Suppose X is in the first row. Label the other rows with letters from the alphabet.<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
def ghi jkl<br />
<br />
Reslice the array into a left nine, middle nine and right nine. One of these squares<br />
contains 52 points, and it can only be the left nine. One of its three columns contains<br />
18 points, and it can only be its left-hand column, xmd. So m=y and d=z. But none of the <math>D_{4,j}</math> begins with y or z, which is a contradiction. So X is not in the first row.<br />
<br />
So X is in the second or third row. By symmetry, suppose it is in the second row<br />
<br />
def ghi jkl<br />
<br />
xyz ybw zwc<br />
<br />
mno pqr stu<br />
<br />
Again, the left-hand nine must contain 52 points, so it is <math>D_{4,2}</math>.<br />
So either the first row is <math>D_{4,2}</math> or the third row is <math>D_{4,0}</math>.<br />
If the first row is <math>D_{4,2}</math> then the only way to have 50 points in the middle or right-hand nine is if the middle nine is X<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz ybw zwc<br />
<br />
yzx' zwc stu<br />
<br />
In the seventh column, s contains 5 points and in the eighth column, t contains 4 points.<br />
The final row can now contain at most 48 points, and the whole array contains only 52+50+48 = 150 points.<br />
<br />
If the third row is <math>D_{4,0}</math>, then neither the middle nine nor the right-hand nine contains 50 points, by the classification of Lemma 4 and the formulas at the start of Lemma 5.<br />
Again, only 52+49+49 = 150 points are possible.<br />
<br />
A similar argument is possible if X is in the third row; or if X is replaced by Y or Z.<br />
<br />
So when a 151-point set is sliced into three, one slice is <math>D_{4,j}</math> and another slice is 50 points contained in <math>D_{4,k}</math>. <math>\Box</math><br />
<br />
'''Lemma 5.2''' There is no 151-point solution<br />
<br />
'''Proof''' Assume by symmetry that the first row contains 52 points and the second row contains 50.<br />
<br />
If <math>D_{4,1}</math> is in the first row, then the second row must be contained in <math>D_{4,0}</math>. <br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine can contain 52 points, which contradicts the corollary to Lemma 5.<br />
<br />
Suppose the first row contains <math>D_{4,0}</math>. Then the second row is contained in <math>D_{4,2}</math>, otherwise the cubes formed from the nine columns of the diagram would need to remove too many points.<br />
<br />
y'zx zx'y xyz<br />
<br />
z'xy xyz yzx'<br />
<br />
def ghi jkl<br />
<br />
But then none of the left nine, middle nine or right nine contains 52 points.<br />
<br />
So the first row contains <math>D_{4,2}</math>, and the second row is contained in <math>D_{4,1}</math>. Two points may be removed from the second row of this diagram.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
def ghi jkl<br />
<br />
Slice it into the left nine, middle nine and right nine. Two of them are contained in <math>D_{4,j}</math><br />
so at least two of def, ghi, and jkl are contained in the corresponding slice of <math>D_{4,0}</math>.<br />
Slice along a different axis, and at least two of dgj,ehk,fil are contained in the corresponding slice of <br />
<math>D_{4,0}</math>. <br />
So eight of the nine squares in the bottom row are contained in the corresponding square of <math>D_{4,0}</math>.<br />
Indeed, slice along other axes, and all points except one are contained within <math>D_{4,0}</math>. <br />
This point is the intersection of all the 49-point slices. <br />
<br />
So, if there is a 151-point solution, then after removal of the specified point, <br />
there is a 150-point solution, within <math>D_{5,j}</math>, whose slices in each direction are 52+50+48.<br />
<br />
z'xy xyz yzx'<br />
<br />
xyz yz'x zxy'<br />
<br />
y'zx zx'y xyz<br />
<br />
One point must be lost from columns 3, 6, 7 and 8, and four more from the major diagonal z'z'z. That leaves 148 points instead of 150.<br />
<br />
So the 150-point solution does not exist with 52+50+48 slices; so the 151 point solution does not exist.<math>\Box</math><br />
<br />
<br />
An integer programming method has established the upper bound <math>c_5\leq 150</math>, with 12 extremal solutions.<br />
<br />
[http://abel.math.umu.se/~klasm/extremal-c5 This file] contains the extremisers. It lists one point per line, with different extremisers separated by a line containing “—”.<br />
<br />
[http://abel.math.umu.se/~klasm/linprog-d=5-t=3.lpt This is the linear program], readable by GNU's glpsol linear programming solver, which also quickly proves that 150 is the optimum.<br />
<br />
Each variable corresponds to a point in the cube, numbered according to the lexicographic ordering of the points. If a variable is 1 then the point is in the set; if it is 0 then it is not in the set.<br />
There is one linear inequality for each combinatorial line, stating that at least one point must be missing from the line.<br />
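<br />
The structure of such a program can be sketched as follows. This is not the linked file: the 1-based lexicographic numbering, the variable names <code>x1, x2, ...</code> and the CPLEX-LP-style text layout are all assumptions made for illustration, and whether a particular solver accepts the output unchanged has not been checked here.<br />

```python
from itertools import product

def line_constraints(n, k=3):
    """One index triple per combinatorial line of [k]^n.

    Indices are 1-based positions in the lexicographic ordering of the
    points (an assumed numbering, chosen for illustration).
    """
    index = {p: i for i, p in
             enumerate(product(range(1, k + 1), repeat=n), start=1)}
    symbols = list(range(1, k + 1)) + ["*"]
    rows = []
    for template in product(symbols, repeat=n):
        if "*" in template:
            rows.append(tuple(
                index[tuple(v if t == "*" else t for t in template)]
                for v in range(1, k + 1)
            ))
    return rows

def to_lp_text(n, k=3):
    """Render 'maximise the set size subject to one missing point per line'."""
    names = [f"x{i}" for i in range(1, k ** n + 1)]
    out = ["Maximize", " size: " + "\n  + ".join(names), "Subject To"]
    for j, row in enumerate(line_constraints(n, k), start=1):
        # At most k-1 of the k points on a line may be chosen.
        out.append(f" line{j}: " + " + ".join(f"x{i}" for i in row)
                   + f" <= {k - 1}")
    out += ["Binary"] + [" " + name for name in names] + ["End"]
    return "\n".join(out)

print(len(line_constraints(5)))
```

For n=5 this gives 243 binary variables and one inequality for each of the 781 combinatorial lines.<br />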
<br />
== n=6 ==<br />
<br />
:<math>c_6=450</math>:<br />
<br />
The upper bound follows since <math>c_6 \leq 3 c_5</math>. The lower bound can be formed by gluing together all the [[slice]]s <math>\Gamma_{a,b,c}</math> where (a,b,c) is a permutation of (0,2,4) or (1,2,3).<br />
<br />
Computer verification, using the <math>c_5=150</math> extremals, has shown that there is exactly one extremiser for <math>c_6=450</math>.<br />
<br />
== n=7 ==<br />
<br />
:<math>1302 \leq c_7 \leq 1348</math>:<br />
<br />
To see the upper bound <math>c_7 \leq 3c_6-2</math>, observe that if two parallel six-dimensional slices had <math>c_6</math> points, then by uniqueness they are identical, and the third slice can have at most <math>3^6-c_6=279</math> points, far too few to get anywhere close to <math>1348</math>. Thus there can be at most one slice with <math>c_6</math> points, and the other two have at most <math>c_6-1</math>, giving the claim.<br />
<br />
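The lower bound can be formed by removing 016,106,052,502,151,511,160,610,070,700 from <math>D_7</math>; the last two slices are needed to break lines such as 1111xxx, and the 156 removed points leave <math>1458-156=1302</math>.<br />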
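<br />
A brute-force check of this construction is sketched below. It removes the slices 016, 106, 052, 502, 151, 511, 160, 610 together with 070 and 700; including the last two is my own reading, needed for the count 1302 and to break lines such as 1111xxx. It also assumes that <math>D_7</math> is the set of points with coordinate sum not divisible by 3 and that abc means a 1s, b 2s and c 3s.<br />

```python
from itertools import product

def combinatorial_lines(n, k=3):
    """Yield every combinatorial line of [k]^n as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["*"]
    for template in product(symbols, repeat=n):
        if "*" in template:
            yield tuple(
                tuple(v if t == "*" else t for t in template)
                for v in range(1, k + 1)
            )

def is_line_free(points, n):
    pts = set(points)
    return not any(all(p in pts for p in line) for line in combinatorial_lines(n))

def counts(p):
    return (p.count(1), p.count(2), p.count(3))

# Assumed definition of D_7: coordinate sum not divisible by 3.
D7 = {p for p in product((1, 2, 3), repeat=7) if sum(p) % 3 != 0}
listed = {(0, 1, 6), (1, 0, 6), (0, 5, 2), (5, 0, 2),
          (1, 5, 1), (5, 1, 1), (1, 6, 0), (6, 1, 0)}
extra = {(0, 7, 0), (7, 0, 0)}  # assumption: also removed, see lead-in

candidate = {p for p in D7 if counts(p) not in listed | extra}
print(len(D7), len(candidate), is_line_free(candidate, 7))
```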
<br />
'''Lemma 6''' Any line-free subset of <math>D_7</math> has at most 1302 points.<br />
<br />
'''Proof''' Start with the 1458 points of <math>D_7</math>. You must lose:<br />
<br />
* 42 points from (1,2,4),(1,5,1),(4,2,1)<br />
* 42 points from (2,1,4),(2,4,1),(5,1,1)<br />
* 21 points from (0,2,5),(0,5,2),(3,2,2)<br />
* 21 points from (2,0,5),(2,3,2),(5,0,2)<br />
* 15 points from (0,1,6),(0,4,3),(3,1,3),(0,7,0),(3,4,0),(6,1,0)<br />
* 15 points from (1,0,6),(1,3,3),(4,0,3),(7,0,0),(4,3,0),(1,6,0)<br />
<br />
where (a,b,c) is shorthand for the [[slice]] <math>\Gamma_{a,b,c}</math>.<br />
<math>\Box</math><br />
<br />
== Larger n ==<br />
<br />
The following construction gives lower bounds for the number of triangle-free points. <br />
Numerically, it yields of the order of <math>2.7 \sqrt{\log(N)/N}\,3^N</math> points for large N (N ~ 5000).<br />
<br />
It applies when N is a multiple of 3. <br />
* For N=3M-1, restrict the first digit of a 3M sequence to be 1. So this construction has exactly one-third as many points for N=3M-1 as it has for N=3M. <br />
* For N=3M-2, restrict the first two digits of a 3M sequence to be 12. This leaves roughly one ninth of the points for N=3M-2 as for N=3M.<br />
<br />
The current lower bounds for <math>c_{3m}</math> are built like this, with abc being shorthand for <math>\Gamma_{a,b,c}</math>:<br />
<br />
* <math>c_3</math> from (012) and permutations<br />
* <math>c_6</math> from (123,024) and perms<br />
* <math>c_9</math> from (234,135,045) and perms<br />
* <math>c_{12}</math> from (345,246,156,02A,057) and perms (A=10)<br />
* <math>c_{15}</math> from (456,357,267,13B,168,04B,078) and perms (B=11)<br />
<br />
To get the triples in each row, add 1 to the triples in the previous row; then include new triples that have a zero.<br />
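<br />
The rows above can be checked at the level of triples. A combinatorial line with r wildcards whose fixed coordinates contain a 1s, b 2s and c 3s passes through the slices (a+r,b,c), (a,b+r,c), (a,b,c+r), so a union of slices is line-free exactly when no such "triangle" of triples is present. The sketch below uses this criterion; the function names are illustrative.<br />

```python
from itertools import permutations
from math import factorial

ROWS = {
    3: [(0, 1, 2)],
    6: [(1, 2, 3), (0, 2, 4)],
    9: [(2, 3, 4), (1, 3, 5), (0, 4, 5)],
    12: [(3, 4, 5), (2, 4, 6), (1, 5, 6), (0, 2, 10), (0, 5, 7)],
    15: [(4, 5, 6), (3, 5, 7), (2, 6, 7), (1, 3, 11), (1, 6, 8),
         (0, 4, 11), (0, 7, 8)],
}

def with_permutations(triples):
    return {p for t in triples for p in permutations(t)}

def slice_size(a, b, c):
    """Number of strings with a 1s, b 2s and c 3s."""
    return factorial(a + b + c) // (factorial(a) * factorial(b) * factorial(c))

def triangle_free(triples, n):
    """True if no (a,b,c), r>=1 has all of (a+r,b,c), (a,b+r,c), (a,b,c+r)."""
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                if ((a + r, b, c) in triples and (a, b + r, c) in triples
                        and (a, b, c + r) in triples):
                    return False
    return True

sizes = {}
for n, base in ROWS.items():
    triples = with_permutations(base)
    sizes[n] = sum(slice_size(*t) for t in triples)
    print(n, sizes[n], triangle_free(triples, n))
```

For n=3 and n=6 the sizes are 18 and 450, matching <math>c_3</math> and <math>c_6</math>.<br />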
<br />
A general formula for these points is given below. I think that they are triangle-free. (For N<21, ignore any triple with a negative entry.)<br />
<br />
* There are thirteen groups of points in the centre, formed from adding one of the following points, or its permutation, to (M,M,M), when N=3M:<br />
** (-7,-3,+10), (-7, 0,+7),(-7,+3,+4),(-6,-4,+10),(-6,-1,+7),(-6,+2,+4),(-5,-1,+6),(-5,+2,+3),(-4,-2,+6),(-4,+1,+3),(-3,+1,+2),(-2,0,+2),(-1,0,+1) <br />
* There are also eight strings of points, stretching to the edges of the (abc) triangle:<br />
** For N=6K = 3M<br />
*** M+(-8-2x,-6-2x,14+4x),M+(-8-2x,-3-2x,11+4x),M+(-8-2x,x,8+x),M+(-8-2x,3+x,5+x) and permutations (x>=0, M-8-2x>=0)<br />
*** M+(-9-2x,-5-2x,14+4x),M+(-9-2x,-2-2x,11+4x),M+(-9-2x,1+x,8+x),M+(-9-2x,4+x,5+x) and permutations (x>=0, M-9-2x>=0)<br />
<br />
<br />
An alternate construction:<br />
<br />
First, define the sequence of all non-negative integers which, in base 3, do not contain the digit 1. Add 1 to all multiples of 3 in this sequence. The resulting sequence does not contain a length-3 arithmetic progression.<br />
<br />
It starts 1,2,7,8,19,20,25,26,55, …<br />
<br />
Second, list all the (abc) triples for which the larger two differ by a number
from the sequence, excluding the case when the smaller two differ by 1, but then including the case when (a,b,c) is a permutation of N/3+(-1,0,1).
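<br />
The sequence in the first step can be generated as follows. The sketch starts from 0 (so that the first term 1 arises as 0+1), which is an assumption made to match the listed terms; the function names are illustrative.<br />

```python
def has_no_one_in_base3(m):
    """True if the base-3 expansion of m uses only the digits 0 and 2."""
    while m:
        if m % 3 == 1:
            return False
        m //= 3
    return True

def sequence(count):
    """First `count` terms: base-3 digits in {0,2}, with 1 added to multiples of 3."""
    terms, m = [], 0
    while len(terms) < count:
        if has_no_one_in_base3(m):
            terms.append(m + 1 if m % 3 == 0 else m)
        m += 1
    return terms

def has_three_term_ap(values):
    """True if some x < y < z in `values` satisfy z - y == y - x."""
    s = set(values)
    vals = sorted(s)
    return any(2 * y - x in s for i, x in enumerate(vals) for y in vals[i + 1:])

print(sequence(9))
print(has_three_term_ap(sequence(64)))
```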
<br />
== Asymptotics ==<br />
<br />
DHJ(3) is equivalent to the upper bound<br />
<br />
:<math>c_n \leq o(3^n)</math><br />
<br />
In the opposite direction, observe that if we take a set <math>S \subset [3n]</math> that contains no 3-term arithmetic progressions, then the set <math>\bigcup_{(a,b,c) \in \Delta_n: a+2b \in S} \Gamma_{a,b,c}</math> is line-free. From this and the Behrend construction it appears that we have the lower bound<br />
<br />
:<math>c_n \geq 3^{n-O(\sqrt{\log n})}.</math><br />
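<br />
The reason the construction works is that a line with r wildcards meets the slices (a+r,b,c), (a,b+r,c), (a,b,c+r), whose values of a+2b are a+2b+r, a+2b+2r and a+2b, a 3-term progression. The sketch below illustrates this with a greedily chosen progression-free set S; the greedy choice and the range {0,...,2n} for S are assumptions made for illustration, not the Behrend construction.<br />

```python
def greedy_ap_free(limit):
    """Greedy subset of {0,...,limit} with no 3-term arithmetic progression."""
    chosen = []
    for v in range(limit + 1):
        # v would be the largest term of a progression x < y < v iff v == 2y - x.
        if not any(2 * y - x == v for x in chosen for y in chosen if x < y):
            chosen.append(v)
    return set(chosen)

def triples_from(S, n):
    """All (a,b,c) with a+b+c = n and a+2b in S."""
    return {(a, b, n - a - b)
            for a in range(n + 1) for b in range(n - a + 1)
            if a + 2 * b in S}

def triangle_free(triples, n):
    """True if no (a,b,c), r>=1 has all of (a+r,b,c), (a,b+r,c), (a,b,c+r)."""
    for r in range(1, n + 1):
        for a in range(n - r + 1):
            for b in range(n - r - a + 1):
                c = n - r - a - b
                if ((a + r, b, c) in triples and (a, b + r, c) in triples
                        and (a, b, c + r) in triples):
                    return False
    return True

for n in range(1, 13):
    S = greedy_ap_free(2 * n)
    print(n, triangle_free(triples_from(S, n), n))
```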
<br />
More precisely, we have<br />
<br />
:<math>c_n > C 3^{n - 4\sqrt{\log 2}\sqrt{\log n}+\frac 12 \log \log n}</math><br />
for some absolute constant C, and where all logarithms are base-3.<br />
<br />
'''Proof''' For convenience, let n be a multiple of 3. Elkin’s bound gives <math>r_3(\sqrt{n}) > C \sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n})</math>, and let <math>R</math> be a subset of <math>(-3\sqrt{n}/2,3\sqrt{n}/2)</math> without 3-term APs and with size <math>r_3(\sqrt{n})</math>, and with all elements being integer multiples of 3 (again as a matter of convenience). For each <math>r,s\in R</math>, let <math>a = (n-r-s)/3</math>. The set <math>A</math> is the union of all <math>\Gamma_{a,a+r,a+s}</math>. Since all of <math>a, a+r,a+s</math> are between <math>n/3-2\sqrt{n}</math> and <math>n/3+2\sqrt{n}</math>, the size of <math>\Gamma_{a,a+r,a+s}</math> is at least <math>C 3^n / n</math>. Since there are <math>r_3(\sqrt{n})^2</math> choices for r and s, we have a set with size at least<br />
<br />
:<math>C (\sqrt{n} (\log n)^{1/4} \exp_2(-2 \sqrt{\log_2 n}))^2 3^n / n</math>.<br />
<br />
This simplifies to <math>C \sqrt{\log n} \exp_3(n-\alpha \sqrt{\log_3(n)})</math>, where <math>\alpha=4 \sqrt{\log_3(2)}</math>.<br />
<br />
Now suppose that <math>x_i\in \Gamma_{a_i,a_i+r_i,a_i+s_i}</math> is a combinatorial line in the set A. Then <math>(a_i+s_i)-(a_i)=s_i</math> is a 3-term AP contained in R, so the <math>s_i</math> are all the same. Similarly, all of the <math>r_i</math> are the same, and therefore all of the <math>a_i</math> are the same, too. But this implies that the <math>x_i</math> sequence is constant, which means the line is degenerate. <math>\Box</math><br />
<br />
[http://terrytao.wordpress.com/2009/02/05/upper-and-lower-bounds-for-the-density-hales-jewett-problem/#comment-35652 Numerics suggest] that the first large n construction given above gives a lower bound of roughly <math>2.7 \sqrt{\log(n)/n} \times 3^n</math>, which would asymptotically be inferior to the Behrend bound.<br />
<br />
The second large n construction had numerical asymptotics for <math>\log(c_n/3^n)</math> close to <math>1.2-\sqrt{\log(n)}</math> between n=1000 and n=10000, consistent with the Behrend bound.<br />
<br />
== Other k values ==<br />
<br />
A [http://abel.math.umu.se/~klasm/Data/HJ/ computer search] has found the following <math>c_n</math> values for different values of dimension n and edgelength k.<br />
<br />
{|<br />
|n\k || 2 || 3 || 4 || 5 || 6 || 7<br />
|-<br />
| 2 || 2 || 6 || 12 || 20 || 30 || 42<br />
|-<br />
| 3 || 3 || 18 || 48 || 100 || 180 || 294<br />
|-<br />
| 4 || 6 || 52 || 183 || 500 || 1029-1079 || 2058<br />
|-<br />
| 5 || 10 || 150 || 683-732 || 2500<br />
|}<br />
<br />
== Numerical methods ==<br />
<br />
A greedy algorithm [http://thetangentspace.com/wiki/Hales-Jewett_Theorem was implemented here]. The results were sharp for <math>n \leq 3</math> but were slightly inferior to the constructions above for larger n.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-22T12:22:32Z<p>121.220.134.232: /* n=4 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
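<br />
The bound and the underlying set can be checked for small n. In this sketch a geometric line is modelled by letting each coordinate be fixed, rising (1,2,3) or falling (3,2,1), with at least one non-fixed coordinate; the function names are illustrative, and "nearest integer to n/3" is computed with integer arithmetic.<br />

```python
from itertools import product
from math import comb

def nearest_third(n):
    """Nearest integer to n/3 (n/3 is never exactly half-way)."""
    return (2 * n + 3) // 6

def moser_lower_bound(n):
    q = nearest_third(n)
    return comb(n + 1, q) * 2 ** (n - q)

def construction(n):
    """Strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    q = nearest_third(n)
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(2) == q
            or (p.count(2) == q - 1 and p.count(1) % 2 == 1)}

def geometric_lines(n):
    """Yield geometric lines; 'x' rises 1,2,3 and 'y' falls 3,2,1."""
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" in template or "y" in template:
            yield tuple(
                tuple(v if t == "x" else 4 - v if t == "y" else t
                      for t in template)
                for v in (1, 2, 3)
            )

def is_moser(points, n):
    pts = set(points)
    return not any(all(p in pts for p in line) for line in geometric_lines(n))

for n in range(1, 6):
    A = construction(n)
    print(n, len(A), moser_lower_bound(n), is_moser(A, n))
```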
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
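<br />
The example above can be reproduced with a few lines of code. Points are written as strings of digits, and the helper names <code>statistics</code> and <code>slice_points</code> are illustrative.<br />

```python
def statistics(points, n):
    """(a, b, c, ...): number of points with 0, 1, 2, ... coordinates equal to 2."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_points(points, pattern):
    """Points matching `pattern` ('*' = free), with the fixed coordinates dropped."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    return ["".join(p[i] for i in free)
            for p in points
            if all(ch == "*" or ch == q for ch, q in zip(pattern, p))]

A = ["11", "12", "22", "32"]
print(statistics(A, 2))                       # statistics of A
print(statistics(slice_points(A, "1*"), 1))   # statistics of the slice 1*
```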
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points in <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
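<br />
The grouping can be made concrete as follows; points are written as digit strings and the helper names <code>group_of</code> and <code>c_statistics</code> are illustrative.<br />

```python
from collections import Counter
from itertools import product

def group_of(point):
    """Replace every 1 or 3 by 'x' and keep the 2s, e.g. '1322' -> 'xx22'."""
    return "".join("2" if ch == "2" else "x" for ch in point)

def c_statistics(points):
    """c(w) for every group w that meets the given set of points."""
    return Counter(group_of(p) for p in points)

# Sizes of the groups in the full cube [3]^4.
group_sizes = c_statistics("".join(p) for p in product("123", repeat=4))
print(len(group_sizes), sorted(group_sizes.values()))
```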
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. by {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
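<br />
For n=2 the case analysis above is small enough to confirm by exhaustive search over all <math>2^9</math> subsets of <math>[3]^2</math>. The following is a minimal sketch under our own conventions (points are tuples, and a geometric line is generated from a template in which the symbol "x" runs through 1,2,3 while "y" runs through 3,2,1); it is not the code used by the project.<br />
<br />
```python
from itertools import product

N = 2
POINTS = list(product((1, 2, 3), repeat=N))


def geometric_lines(n):
    """All geometric lines in [3]^n, each as a frozenset of three points."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue  # no wildcard: a single point, not a line
        lines.add(frozenset(
            tuple(t if s == "x" else 4 - t if s == "y" else s for s in template)
            for t in (1, 2, 3)
        ))
    return lines


LINES = geometric_lines(N)


def stats(s):
    """Statistics (a, b, c) of a set of points."""
    counts = [0] * (N + 1)
    for p in s:
        counts[p.count(2)] += 1
    return tuple(counts)


# Collect the statistics of every Moser set (every line-free subset).
attainable = set()
for mask in range(1 << len(POINTS)):
    s = frozenset(p for i, p in enumerate(POINTS) if mask >> i & 1)
    if not any(line <= s for line in LINES):
        attainable.add(stats(s))


def dominated(u):
    return any(v != u and all(x >= y for x, y in zip(v, u)) for v in attainable)


pareto = sorted(u for u in attainable if not dominated(u))
largest = max(sum(u) for u in attainable)
print(largest, pareto)
```
<br />
The list pareto contains the Pareto-optimal statistics; discarding those that are convex combinations of the others leaves the extremal ones.<br />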
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c(22),c(2x),c(x2),c(xx)) = (1112),(0222),(0004), which are covered by these linear inequalities: <br />
*<math>c(22)+c(2x)+c(xx) \le 4</math>, <br />
*<math>c(22)+c(x2)+c(xx) \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
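<br />
The exhaustion over the <math>2^{13}</math> possibilities is easy to reproduce. The sketch below is our own reconstruction of such a search, not the original program: it picks one point from each antipodal pair, adds 222, and counts how many of the resulting 14-point sets are free of geometric lines.<br />
<br />
```python
from itertools import product


def geometric_lines(n):
    """All geometric lines in [3]^n, each as a frozenset of three points."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        lines.add(frozenset(
            tuple(t if s == "x" else 4 - t if s == "y" else s for s in template)
            for t in (1, 2, 3)
        ))
    return lines


CENTER = (2, 2, 2)

# The 13 antipodal pairs {p, 4 - p}; the centre is its own antipode and is skipped.
pairs = []
for p in product((1, 2, 3), repeat=3):
    q = tuple(4 - t for t in p)
    if p < q:
        pairs.append((p, q))

# Lines through the centre are avoided automatically by taking one point per
# pair, so only the remaining lines need to be tested.
other_lines = [line for line in geometric_lines(3) if CENTER not in line]

moser_14 = 0
for choice in product((0, 1), repeat=len(pairs)):
    chosen = {pair[c] for pair, c in zip(pairs, choice)}
    if not any(line <= chosen for line in other_lines):
        moser_14 += 1

print(len(pairs), moser_14)
```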
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
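<br />
The conversion between the two notations is mechanical: a coefficient of the count of points with i 2s is multiplied by the number <math>\binom{n}{i} 2^{n-i}</math> of such points in the cube, and common factors are then removed. A minimal sketch (the helper normalize is a hypothetical name of ours):<br />
<br />
```python
from functools import reduce
from math import comb, gcd


def normalize(coeffs, bound, n):
    """Rewrite sum(coeffs[i] * count_i) <= bound in terms of the alpha_i.

    Returns the integer coefficients of alpha_0, alpha_1, ... and the new
    right-hand side, with common factors divided out.
    """
    # There are comb(n, i) * 2**(n - i) points of [3]^n with exactly i 2s.
    sizes = [comb(n, i) * 2 ** (n - i) for i in range(n + 1)]
    scaled = [c * s for c, s in zip(coeffs, sizes)]
    g = reduce(gcd, scaled + [bound])
    return tuple(v // g for v in scaled), bound // g


# The first n=3 inequality, 2a + b + 2c + 4d <= 22:
print(normalize((2, 1, 2, 4), 22, 3))
```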
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet].<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
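<br />
Since the score is linear, its maximum over attainable statistics is attained at an extremal statistic, so the claim about the largest score can be checked directly against the n=3 extremal list. A minimal sketch using exact rational arithmetic (the variable names are ours):<br />
<br />
```python
from fractions import Fraction

# Extremal statistics (a, b, c, d) for n=3, as listed in the n=3 section.
EXTREMALS = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]


def score(stat):
    """Score a/4 + b/3 + c/2 + d of a 3D slice."""
    a, b, c, d = stat
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d


best = max(EXTREMALS, key=score)
maximisers = [s for s in EXTREMALS if score(s) == score(best)]
print(best, score(best), 8 * score(best))
```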
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), all solutions have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), all solutions have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
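<br />
The line counts used in these double-counting arguments can be checked by enumeration. The following sketch (our own, with hypothetical helper names) lists every geometric line of <math>[3]^n</math> and classifies it by the type of its endpoints and of its midpoint, so that "ab" means two "a" points joined through a "b" point.<br />
<br />
```python
from collections import Counter
from itertools import product


def geometric_lines(n):
    """All geometric lines in [3]^n as ordered triples (end, middle, end)."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        pts = tuple(
            tuple(t if s == "x" else 4 - t if s == "y" else s for s in template)
            for t in (1, 2, 3)
        )
        lines.add(min(pts, pts[::-1]))  # a line and its reversal coincide
    return lines


def line_types(n):
    """Count lines by (type of endpoints, type of midpoint)."""
    letters = "abcdefgh"  # a = no 2s, b = one 2, c = two 2s, ...
    types = Counter()
    for first, middle, _last in geometric_lines(n):
        types[letters[first.count(2)] + letters[middle.count(2)]] += 1
    return types


types4 = line_types(4)
print(types4["ab"], types4["ac"], types4["bc"])
```
<br />
The two endpoints of a geometric line always have the same number of 2s, which is why only the first endpoint is inspected.<br />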
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
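<br />
The passage from a three-dimensional inequality to its four-dimensional consequence follows the rule displayed above and can be automated. A minimal sketch, in which lift_to_4d is a name of our own choosing:<br />
<br />
```python
from functools import reduce
from math import gcd


def lift_to_4d(coeffs, bound):
    """Turn alpha*a + beta*b + gamma*c + delta*d <= M (n=3) into its n=4 form.

    Each "a" point of [3]^4 lies in four side slices, each "b" point in
    three, each "c" point in two and each "d" point in one; there are
    eight side slices in total. Common factors are divided out.
    """
    alpha, beta, gamma, delta = coeffs
    lifted = [4 * alpha, 3 * beta, 2 * gamma, delta]
    g = reduce(gcd, lifted + [8 * bound])
    return tuple(v // g for v in lifted), 8 * bound // g


# The first n=3 inequality, 2a + b + 2c + 4d <= 22:
print(lift_to_4d((2, 1, 2, 4), 22))
```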
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <br />
* <math>2a+b+2c+2d+4e \le 24</math> and <br />
* <math>4a+b+2c+2d+4e \le 32</math> <br />
within the 3D cube, which when averaged becomes <br />
* <math>4a+b+2c+4d+16e \le 96</math> and <br />
* <math>8a+b+2c+4d+16e \le 128</math> <br />
for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the (a,b,c,...) statistics for the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
The c-statistics for the 3D cube are bounded by a set of 20 inequalities. By considering the fourteen ways a 3D cube sits in the 4D cube, the result is a set of 239 inequalities for the 4D cube's c-statistics. However, all 239 inequalities are satisfied by a 44-point solution given by c(xxxx) = c(2xxx) = c(22xx) = 4, and permutations.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point Moser set. We will prove that <math>{\mathcal A}</math> cannot exist.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. Suppose for contradiction that this slice has at least 42 points. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+40+43=124 points in all, a contradiction.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Hence the *1*** and *3*** slices each have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if it has 43 points, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
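<br />
The base case m=5 can also be confirmed by brute force over all <math>\binom{32}{5}</math> choices of five A-points. The sketch below is our own check (not part of the original argument); it encodes a point of <math>\{1,3\}^5</math> as a 5-bit integer.<br />
<br />
```python
from itertools import combinations

# Bit i of the integer is set when coordinate i equals 3. Two points are at
# Hamming distance 2 exactly when their XOR has two set bits.
CLOSE = [[bin(p ^ q).count("1") == 2 for q in range(32)] for p in range(32)]


def pairs_at_distance_two(points):
    """Number of pairs in the given collection at Hamming distance exactly 2."""
    return sum(CLOSE[p][q] for p, q in combinations(points, 2))


def fewest_pairs(m):
    """Smallest number of distance-2 pairs over all m-point subsets."""
    return min(pairs_at_distance_two(s) for s in combinations(range(32), m))


print(fewest_pairs(4), fewest_pairs(5))
```
<br />
A value of 0 for four points and a positive value for five points is what the lemma predicts; the search takes a few seconds at most.<br />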
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
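'''Proof''': Suppose for contradiction that C=80; then D=0 and <math>B \leq 40</math>. From Lemma 4 we also see that A cannot be 5 or more (the midpoint of two A-points at Hamming distance 2 is a C-point). Since E=F=0, the set then has at most 4+40+80=124 points, a contradiction.<br />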
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have score adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
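<br />
The scores can be recomputed from the distributions of the 42-point and 43-point solutions listed in the n=4 section. The sketch below is our own and covers only those two lists; the 41-point statistics live in the linked spreadsheet and are not reproduced here.<br />
<br />
```python
from fractions import Fraction
from itertools import combinations_with_replacement

# (a, b, c, d, e) for the 43-point and 42-point solutions of the n=4 section.
STATS = [
    (5, 20, 18, 0, 0), (4, 16, 23, 0, 0), (3, 16, 24, 0, 0), (4, 15, 24, 0, 0),
    (6, 24, 12, 0, 0), (5, 20, 17, 0, 0), (5, 19, 18, 0, 0), (6, 16, 18, 2, 0),
    (4, 20, 18, 0, 0), (5, 17, 20, 0, 0), (5, 16, 21, 0, 0), (4, 17, 21, 0, 0),
    (4, 16, 22, 0, 0), (5, 15, 22, 0, 0), (4, 15, 23, 0, 0), (3, 16, 23, 0, 0),
    (4, 14, 24, 0, 0), (3, 15, 24, 0, 0), (2, 16, 24, 0, 0),
]


def score(stat):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a 4D slice."""
    a, b, c, d, e = stat
    return a + Fraction(5 * b, 4) + Fraction(5 * c, 3) + Fraction(5 * d, 2) + 5 * e


# Slices whose score reaches 62, in increasing order of score.
high = sorted((s for s in STATS if score(s) >= 62), key=score)

# Unordered pairs of such slices whose scores add up to at least 125.
pairs = [
    (s, t)
    for s, t in combinations_with_replacement(high, 2)
    if score(s) + score(t) >= 125
]

for s in high:
    print(s, sum(s), score(s))
```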
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the two side slices have total score exactly equal to 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score from each of those two cuts is at most 63 + 59 7/12 = 122 7/12. On the other hand, each of the other three cuts has a total score of at most 63+63 = 126. This averages out to at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following tuples:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math>. For any q, the union of the following sets is a Moser set. The size of this Moser set is maximized when q is near N/3, in which case it is of order <math>3^N/\sqrt{N}</math>. Most of the points are in the layers with q 2s and q-1 2s.<br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the maximum size of A(m,d) in general. However, the size of A(m,d) can be bounded by sphere-packing arguments. For example, points in A(m,3) are surrounded by non-intersecting spheres of Hamming radius 1, and points in A(m,5) are surrounded by non-intersecting spheres of Hamming radius 2.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of ones.<br />
* <math>|A(m,3)| \le 2^m/(m+1)</math> because the size of a Hamming sphere is m+1.<br />
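<br />
The sphere-packing bound extends to any odd d by using balls of Hamming radius (d-1)/2. A minimal sketch (the function name is ours), which for d=3 reduces to the bound <math>2^m/(m+1)</math> above:<br />
<br />
```python
from math import comb


def sphere_packing_bound(m, d):
    """Upper bound on |A(m, d)| from disjoint Hamming balls of radius (d-1)//2."""
    radius = (d - 1) // 2
    # Number of points of {1,3}^m within Hamming distance `radius` of a point.
    ball = sum(comb(m, i) for i in range(radius + 1))
    return 2 ** m // ball


for m in range(3, 8):
    print(m, sphere_packing_bound(m, 3), sphere_packing_bound(m, 5))
```
<br />
For even d the radius (d-1)//2 loses information (for d=2 the bound is only the trivial <math>2^m</math>), so the parity argument above remains the better estimate there.<br />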
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, such as those described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5): if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-22T05:19:40Z<p>121.220.134.232: /* General n */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
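<br />
For small n this construction can be built explicitly and tested. The sketch below is our own (the names lower_bound, construction and is_moser are hypothetical); it compares the size of the constructed set with the formula and checks by enumeration that the set contains no geometric line.<br />
<br />
```python
from itertools import product
from math import comb


def lower_bound(n):
    """The bound binom(n+1, q) * 2^(n-q), with q the nearest integer to n/3."""
    q = round(n / 3)
    return comb(n + 1, q) * 2 ** (n - q)


def construction(n):
    """All strings with q 2s, plus those with q-1 2s and an odd number of 1s."""
    q = round(n / 3)
    return {
        p for p in product((1, 2, 3), repeat=n)
        if p.count(2) == q or (p.count(2) == q - 1 and p.count(1) % 2 == 1)
    }


def geometric_lines(n):
    """All geometric lines in [3]^n, each as a frozenset of three points."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        lines.add(frozenset(
            tuple(t if s == "x" else 4 - t if s == "y" else s for s in template)
            for t in (1, 2, 3)
        ))
    return lines


def is_moser(points, n):
    return not any(line <= points for line in geometric_lines(n))


for n in range(1, 6):
    s = construction(n)
    print(n, len(s), lower_bound(n), is_moser(s, n))
```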
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c be the number of points in A with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 23}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
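<br />
As a small illustration of these definitions, the sketch below computes the statistics of a set and of one of its slices. The function names and the string representation of points are assumptions of this sketch; the example set is chosen only for illustration.<br />
<br />
```python
from collections import Counter

def statistics(points, n):
    """(a, b, c, ...): number of points with 0, 1, 2, ... coordinates equal to 2."""
    counts = Counter(p.count('2') for p in points)
    return tuple(counts[i] for i in range(n + 1))

def slice_points(points, position, value):
    """Points of the slice with one coordinate fixed; the fixed coordinate is dropped."""
    return {p[:position] + p[position + 1:] for p in points if p[position] == value}

A = {'11', '12', '22', '23'}
print(statistics(A, 2))                        # statistics (a, b, c) of A
print(statistics(slice_points(A, 0, '1'), 1))  # statistics of the slice 1*
```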
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (i.e. one with <math>a' \geq a</math>, <math>b' \geq b</math>, etc.). We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube <math>[3]^n</math>. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3; then the 81 points in <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points of the set inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
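<br />
The grouping can be sketched as follows; the names <code>group_of</code> and <code>c_statistics</code> are illustrative, and points are written as strings over the letters 1, 2, 3.<br />
<br />
```python
from itertools import product
from collections import Counter

def group_of(point):
    """Replace every 1 or 3 by 'x' and keep the 2s, e.g. '1322' -> 'xx22'."""
    return ''.join('2' if ch == '2' else 'x' for ch in point)

def c_statistics(points):
    """Number of points of the given set inside each group."""
    return Counter(group_of(p) for p in points)

# Applied to the whole cube [3]^4, this gives the size of each group.
cube = [''.join(p) for p in product('123', repeat=4)]
group_sizes = c_statistics(cube)
print(len(group_sizes), group_sizes['xx22'], group_sizes['xxxx'])
```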
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. by {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
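<br />
Since <math>[3]^2</math> has only 512 subsets, the n=2 claims can be checked exhaustively. The sketch below (helper names are illustrative, not from the original analysis) enumerates all Moser sets, collects the attainable statistics, and extracts the Pareto-optimal ones.<br />
<br />
```python
from itertools import product

def geometric_lines(n):
    """Return every geometric line of [3]^n once, as a tuple of three points."""
    lines = set()
    for template in product((1, 2, 3, 'x', 'y'), repeat=n):
        if 'x' not in template and 'y' not in template:
            continue
        line = tuple(
            tuple(t if s == 'x' else 4 - t if s == 'y' else s for s in template)
            for t in (1, 2, 3)
        )
        lines.add(min(line, line[::-1]))
    return lines

points = list(product((1, 2, 3), repeat=2))
lines = geometric_lines(2)

attainable = set()
for mask in range(1 << len(points)):
    subset = {p for i, p in enumerate(points) if (mask >> i) & 1}
    if any(all(p in subset for p in line) for line in lines):
        continue  # contains a geometric line, so not a Moser set
    attainable.add(tuple(sum(1 for p in subset if p.count(2) == j) for j in range(3)))

def dominated(s):
    """True if some other attainable statistic is at least as large in every entry."""
    return any(t != s and all(x >= y for x, y in zip(t, s)) for t in attainable)

pareto = sorted(s for s in attainable if not dominated(s))
largest = max(sum(s) for s in attainable)
print(pareto, largest)
```
<br />
The same loop can be used to confirm that every attainable (a,b,c) obeys the four sharp inequalities listed above.<br />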
<br />
The Pareto optimizers for c-statistics are (c(22),c(2x),c(x2),c(xx)) = (1,1,1,2), (0,2,2,2), (0,0,0,4), which are covered by these linear inequalities: <br />
*<math>c(22)+c(2x)+c(xx) \le 4</math>, <br />
*<math>c(22)+c(x2)+c(xx) \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
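<br />
The exhaustion over the <math>2^{13}</math> possibilities can be sketched as below. This is a minimal illustration under the stated setup (a set containing 222 and exactly one point from each antipodal pair), with illustrative helper names; it counts how many of the 8192 candidate 14-point sets are free of geometric lines.<br />
<br />
```python
from itertools import product

def geometric_lines(n):
    """Return every geometric line of [3]^n once, as a tuple of three points."""
    lines = set()
    for template in product((1, 2, 3, 'x', 'y'), repeat=n):
        if 'x' not in template and 'y' not in template:
            continue
        line = tuple(
            tuple(t if s == 'x' else 4 - t if s == 'y' else s for s in template)
            for t in (1, 2, 3)
        )
        lines.add(min(line, line[::-1]))
    return lines

center = (2, 2, 2)
# Each antipodal pair {p, 4-p} is listed once; the center is its own antipode.
pairs = [(p, tuple(4 - c for c in p))
         for p in product((1, 2, 3), repeat=3)
         if p < tuple(4 - c for c in p)]
lines = geometric_lines(3)

fourteen_point_moser_sets = 0
for choice in product((0, 1), repeat=len(pairs)):
    candidate = {center} | {pair[c] for pair, c in zip(pairs, choice)}
    if not any(all(p in candidate for p in line) for line in lines):
        fourteen_point_moser_sets += 1

print(len(pairs), fourteen_point_moser_sets)
```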
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
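<br />
Passing from an inequality in (a,b,c,d) to its normalized form is mechanical: substitute <math>a = 2^n \alpha_0</math>, <math>b = n 2^{n-1} \alpha_1</math>, and so on, then divide by the common factor. A minimal sketch (the function name <code>normalize</code> is illustrative):<br />
<br />
```python
from functools import reduce
from math import comb, gcd

def normalize(coeffs, bound):
    """Rewrite sum(coeffs[i] * count_i) <= bound in terms of the densities alpha_i."""
    n = len(coeffs) - 1
    # Number of points of [3]^n with exactly i coordinates equal to 2.
    layer_sizes = [comb(n, i) * 2 ** (n - i) for i in range(n + 1)]
    scaled = [c * size for c, size in zip(coeffs, layer_sizes)]
    g = reduce(gcd, scaled + [bound])
    return tuple(x // g for x in scaled), bound // g

print(normalize((2, 1, 2, 4), 22))   # first n=3 inequality
print(normalize((6, 2, 3, 6), 48))   # fourth n=3 inequality
print(normalize((2, 1, 2), 8))       # first n=2 inequality
```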
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that no two of the four "a" points can be at Hamming distance 2 (their midpoint would be a "c" point, completing a geometric line). Since two "a" points with the same parity of 1s are at Hamming distance 2 or 4, we conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of each of the forms xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
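<br />
The claim about the largest score can be checked directly against the Pareto-optimal statistics listed in the n=3 section (the score has positive coefficients, so its maximum over attainable statistics is reached at a Pareto-optimal one). A short sketch using exact fractions; the variable names are illustrative:<br />
<br />
```python
from fractions import Fraction

# Pareto-optimal statistics (a, b, c, d) for n=3, copied from the n=3 section.
pareto_n3 = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0),
    (4, 4, 5, 0), (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0),
    (5, 4, 3, 0), (4, 9, 2, 0), (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0),
    (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0), (5, 9, 0, 0), (6, 6, 0, 0),
    (7, 3, 0, 0), (8, 0, 0, 0),
]

def slice_score(stat):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice."""
    a, b, c, d = stat
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

best_score = max(slice_score(s) for s in pareto_n3)
best_stats = [s for s in pareto_n3 if slice_score(s) == best_score]
print(best_score, best_stats, 8 * best_score)
```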
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and is at most 2 except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0); here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and is at least 11 except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0)).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
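<br />
The line counts used in these double-counting arguments can be confirmed by enumerating all geometric lines of <math>[3]^4</math> and classifying each by the number of 2s in its endpoints and in its midpoint. A minimal sketch with illustrative names:<br />
<br />
```python
from itertools import product
from collections import Counter

def geometric_lines(n):
    """Return every geometric line of [3]^n once, as a tuple of three points."""
    lines = set()
    for template in product((1, 2, 3, 'x', 'y'), repeat=n):
        if 'x' not in template and 'y' not in template:
            continue
        line = tuple(
            tuple(t if s == 'x' else 4 - t if s == 'y' else s for s in template)
            for t in (1, 2, 3)
        )
        lines.add(min(line, line[::-1]))
    return lines

# Key: (number of 2s in an endpoint, number of 2s in the midpoint).
line_types = Counter()
for line in geometric_lines(4):
    line_types[(line[0].count(2), line[1].count(2))] += 1

print(line_types[(0, 1)])  # lines joining two "a" points through a "b" point
print(line_types[(0, 2)])  # lines joining two "a" points through a "c" point
print(line_types[(1, 2)])  # lines joining two "b" points through a "c" point
```
<br />
Each line holds at most two points of a Moser set, so a family of L such lines contributes an upper bound of 2L to the corresponding weighted count.<br />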
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
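<br />
The lifting rule can be applied mechanically to the n=3 bounds. The sketch below encodes each three-dimensional inequality as a tuple of coefficients followed by its right-hand side (an encoding chosen for this illustration), applies the weights 4, 3, 2, 1 and the factor 8, and divides out the common factor.<br />
<br />
```python
from functools import reduce
from math import gcd

# (alpha, beta, gamma, delta, M) for alpha*a + beta*b + gamma*c + delta*d <= M at n=3.
n3_inequalities = [
    (2, 1, 2, 4, 22), (3, 2, 3, 6, 36), (7, 2, 4, 8, 56), (6, 2, 3, 6, 48),
    (0, 1, 1, 3, 12), (1, 0, 2, 4, 14), (5, 0, 4, 8, 40), (1, 0, 0, 4, 8),
    (0, 1, 0, 6, 12), (0, 0, 1, 3, 6),
]

def lift_to_4d(inequality):
    """Sum the n=3 inequality over the eight side slices of [3]^4 and simplify."""
    alpha, beta, gamma, delta, bound = inequality
    lifted = (4 * alpha, 3 * beta, 2 * gamma, delta, 8 * bound)
    g = reduce(gcd, lifted)
    return tuple(x // g for x in lifted)

lifted = [lift_to_4d(i) for i in n3_inequalities]
for row in lifted:
    print(row)
```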
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the (a,b,c,...) statistics for the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
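<br />
The three distributions of 2s quoted above can be reproduced by listing the 27 points of each kind of embedded cube. This is a small sketch; the particular slices and diagonals chosen (1***, 2***, and the first two coordinates tied together) are representative examples.<br />
<br />
```python
from itertools import product
from collections import Counter

def stats(points):
    """(a, b, c, d, e): number of points with 0, 1, 2, 3, 4 coordinates equal to 2."""
    counts = Counter(p.count(2) for p in points)
    return tuple(counts[i] for i in range(5))

values = (1, 2, 3)
side_slice = [(1, x, y, z) for x, y, z in product(values, repeat=3)]
middle_slice = [(2, x, y, z) for x, y, z in product(values, repeat=3)]
diagonal = [(t, t, y, z) for t, y, z in product(values, repeat=3)]           # xx = 11, 22, 33
anti_diagonal = [(t, 4 - t, y, z) for t, y, z in product(values, repeat=3)]  # xx = 13, 22, 31

print(stats(side_slice), stats(middle_slice), stats(diagonal), stats(anti_diagonal))
```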
<br />
The c-statistics for the 3D cube are bounded by a set of 20 inequalities. By considering the fourteen ways a 3D cube sits in the 4D cube, the result is a set of 239 inequalities for the 4D cube's c-statistics. However, all 239 inequalities are satisfied by a 44-point solution given by c(xxxx) = c(2xxx) = c(22xx) = 4, and permutations.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(11333). Any two of these four points differ in at least three places, except for one pair of points that differ in one place.<br />
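<br />
A construction of this kind can be verified by checking all 1441 geometric lines of <math>[3]^5</math>. In the sketch below the four points with no 2s are this sketch's own choice: no two of them are at Hamming distance two, and the single pair at distance one has its midpoint (a point with one 2 and an odd number of 1s) outside the set. The helper names are illustrative.<br />
<br />
```python
from itertools import product

def geometric_lines(n):
    """Return every geometric line of [3]^n once, as a tuple of three points."""
    lines = set()
    for template in product((1, 2, 3, 'x', 'y'), repeat=n):
        if 'x' not in template and 'y' not in template:
            continue
        line = tuple(
            tuple(t if s == 'x' else 4 - t if s == 'y' else s for s in template)
            for t in (1, 2, 3)
        )
        lines.add(min(line, line[::-1]))
    return lines

# Assumed choice of four points with no 2s (see the lead-in above).
corners = {(1, 3, 1, 1, 1), (1, 3, 1, 1, 3), (3, 1, 3, 1, 1), (1, 1, 3, 3, 3)}

moser_124 = set(corners)
for p in product((1, 2, 3), repeat=5):
    twos = p.count(2)
    # all points with two 2s, and points with one 2 and an even number of 1s
    if twos == 2 or (twos == 1 and p.count(1) % 2 == 0):
        moser_124.add(p)

is_line_free = not any(all(p in moser_124 for p in line) for line in geometric_lines(5))
statistics = tuple(sum(1 for p in moser_124 if p.count(2) == j) for j in range(6))
print(len(moser_124), statistics, is_line_free)
```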
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point Moser set. We will prove that <math>{\mathcal A}</math> cannot exist.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, and so each of the 121 antipodal pairs can have at most one point in the set, leading to only 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. Suppose for contradiction that this slice has at least 42 points. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+40+43=124 points in all, a contradiction.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. In particular, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points, again a contradiction; but if it has 43 points, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. By symmetry we may suppose that 11111 is one of these points; suppose for contradiction that no pair has Hamming distance 2. All points with two 3s are excluded, so the only other points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed, giving at most two points with an odd number of 1s, a contradiction. <math>\Box</math><br />
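<br />
The m=5 case of Lemma 4 is small enough to check exhaustively over all 5-element subsets of the 32 points with no 2s. In the sketch below each such point is encoded as a 5-bit integer (bit set for a 3, clear for a 1), an encoding chosen for this illustration.<br />
<br />
```python
from itertools import combinations

# adjacency[x] is a bitmask of the points at Hamming distance exactly 2 from x.
adjacency = [
    sum(1 << y for y in range(32) if bin(x ^ y).count('1') == 2)
    for x in range(32)
]

def has_distance_two_pair(subset):
    """True if some two points of the subset differ in exactly two places."""
    seen = 0
    for x in subset:
        if adjacency[x] & seen:
            return True
        seen |= 1 << x
    return False

counterexamples = [s for s in combinations(range(32), 5) if not has_distance_two_pair(s)]
print(len(counterexamples))

# Four points can avoid distance two, e.g. 11111, 11113, 33331, 33333.
print(has_distance_two_pair((0b00000, 0b00001, 0b11110, 0b11111)))
```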
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
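'''Proof''': Suppose for contradiction that C=80; then D=0, and <math>B \leq 40</math> by Lemma 3. From Lemma 4 we also see that A cannot be 5 or more (two A-points at Hamming distance two would have a C-point as their midpoint), so the set has at most 4+40+80=124 points, a contradiction.<br />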
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1, which ensures that every point of the set lies in at least one side slice) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have score adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
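<br />
For the 42-point and 43-point slices, the list of exceptions can be recomputed from the distributions given in the n=4 section; the 41-point statistics are only available in the linked spreadsheet and are not covered by this sketch. The names used are illustrative.<br />
<br />
```python
from fractions import Fraction

# Distributions (a, b, c, d, e) copied from the n=4 section.
stats_43 = [(5, 20, 18, 0, 0), (4, 16, 23, 0, 0), (3, 16, 24, 0, 0), (4, 15, 24, 0, 0)]
stats_42 = [
    (6, 24, 12, 0, 0), (5, 20, 17, 0, 0), (5, 19, 18, 0, 0), (6, 16, 18, 2, 0),
    (4, 20, 18, 0, 0), (5, 17, 20, 0, 0), (5, 16, 21, 0, 0), (4, 17, 21, 0, 0),
    (4, 16, 22, 0, 0), (5, 15, 22, 0, 0), (4, 15, 23, 0, 0), (3, 16, 23, 0, 0),
    (4, 14, 24, 0, 0), (3, 15, 24, 0, 0), (2, 16, 24, 0, 0),
]

def score(stat):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a four-dimensional slice."""
    a, b, c, d, e = stat
    return a + Fraction(5 * b, 4) + Fraction(5 * c, 3) + Fraction(5 * d, 2) + 5 * e

high_scoring = sorted((s for s in stats_42 + stats_43 if score(s) >= 62), key=score)
for s in high_scoring:
    print(s, sum(s), score(s))
```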
<br />
:'''Lemma 6''' There exists a cut in which the side slices have total score strictly greater than 125 (i.e. the two side slices are of Type 2, Type 3, or Type 4, with at least one of them not of Type 2, and the cut is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the scores of the two side slices add up to exactly 125 (by Lemma 5, the five totals average 125), which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score of the side slices in each of those two cuts is at most 63 + 59 7/12 = 122 7/12. On the other hand, in each of the other three cuts the side slices have a total score of at most 63+63 = 126. The five totals thus average out to at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
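<br />
The averaging step can be reproduced with exact fractions; this is only a check of the arithmetic, taking the two score bounds 63 and 59 7/12 from the text as given.<br />
<br />
```python
from fractions import Fraction

best_slice_score = Fraction(63)                  # largest score of any side slice
score_with_d_point = Fraction(59) + Fraction(7, 12)  # bound for a side slice with d=1

# Two cuts contain a side slice with d=1; the other three cuts do not.
cut_totals = [best_slice_score + score_with_d_point] * 2 + [2 * best_slice_score] * 3
average = sum(cut_totals) / len(cut_totals)
print(average, float(average), average < 125)
```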
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3, and Lemma 7 the center slice has c=d=e=0, thus the center slice has <math>a+b \geq 39</math>. On the other hand, from the n=4 theory we have <math>4a+b \leq 64</math>, which forces <math>b \geq 31</math>.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus <math>C \geq 5 \times 31/2 = 77.5</math> and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following tuples:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
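<br />
The short list of candidate statistics follows from the constraints collected so far: <math>6 \leq A \leq 8</math>, <math>B \leq 40</math>, C equal to 78 or 79, C=78 whenever <math>A \geq 7</math>, D=E=F=0, and 125 points in total. A minimal enumeration (variable names are illustrative):<br />
<br />
```python
candidates = [
    (A, B, C, 0, 0, 0)
    for A in range(6, 9)          # Corollary 6
    for C in (78, 79)             # Lemma 10
    for B in [125 - A - C]        # the set has 125 points and D = E = F = 0
    if B <= 40                    # Lemma 9
    and (A < 7 or C == 78)        # at least seven A-points force C = 78
]
print(candidates)
```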
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set (the other case is similar); then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). Counting the points of the set on each such line and summing over all 160 lines gives 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). Counting the points of the set on each such line and summing over all 160 lines gives 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
A general construction for <math>c'_N</math>: for any q, the union of the following sets is a Moser set. The size of this Moser set is maximized when q is near N/3, in which case it is comparable to <math>3^N/\sqrt{N}</math>. Most of the points are in the layers with q 2s and q-1 2s.<br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> (the strings of length m using only the letters 1 and 3) for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the maximum size of A(m,d) in general. However, the size of A(m,d) can be bounded by sphere-packing arguments. For example, points in A(m,3) are surrounded by non-intersecting spheres of Hamming radius 1, and points in A(m,5) are surrounded by non-intersecting spheres of Hamming radius 2.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of ones.<br />
* <math>|A(m,3)| \le 2^m/(m+1)</math> because the size of a Hamming sphere is m+1.<br />
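<br />
The sphere-packing bound generalizes to any odd d: Hamming balls of radius (d-1)/2 around the points of A(m,d) are disjoint, so |A(m,d)| is at most <math>2^m</math> divided by the size of such a ball. A minimal sketch (the function name is illustrative; for even d this radius gives a weaker bound than the parity construction above):<br />
<br />
```python
from math import comb

def sphere_packing_bound(m, d):
    """Upper bound on |A(m,d)| from disjoint Hamming balls of radius (d-1)//2."""
    radius = (d - 1) // 2
    ball_size = sum(comb(m, i) for i in range(radius + 1))
    return 2 ** m // ball_size

for m in range(3, 9):
    print(m, sphere_packing_bound(m, 1), sphere_packing_bound(m, 3), sphere_packing_bound(m, 5))
```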
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, such as those described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
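<br />
For small n the construction can be checked by brute force. The sketch below builds the set for given n and q, and searches all wildcard templates for a geometric line; the helper names are ours, and the search is only practical for small n.<br />
<br />
```python
from itertools import product
from math import comb

def moser4_set(n, q):
    """Points of [4]^n with q entries in {2,3}, or with q-1 such entries and an odd number of 1s."""
    chosen = set()
    for p in product((1, 2, 3, 4), repeat=n):
        inner = sum(1 for v in p if v in (2, 3))
        if inner == q or (inner == q - 1 and p.count(1) % 2 == 1):
            chosen.add(p)
    return chosen

def has_geometric_line(points, n, k=4):
    """Brute force over templates; 'x' runs 1..k and 'y' runs k..1 along the line."""
    symbols = tuple(range(1, k + 1)) + ("x", "y")
    for t in product(symbols, repeat=n):
        if "x" not in t:
            continue  # no wildcard, or only reversed ones (the mirror of an all-'x' line)
        line = [tuple(v if s == "x" else (k + 1 - v if s == "y" else s) for s in t)
                for v in range(1, k + 1)]
        if all(p in points for p in line):
            return True
    return False

for n, q in ((2, 1), (3, 1), (4, 2)):
    s = moser4_set(n, q)
    formula = comb(n, q) * 2 ** n + comb(n, q - 1) * 2 ** (n - 1)
    print(n, q, len(s), formula, has_geometric_line(s, n))
```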
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So for k=5 we have a lower bound for Moser’s problem similar to the one for DHJ with k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-22T05:02:42Z<p>121.220.134.232: /* n=5 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
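<br />
The construction and the count can be checked directly for small n. The following is a sketch only; the function names are ours, and the line test uses the fact that the midpoint of a geometric line has a 2 in every wildcard position.<br />
<br />
```python
from itertools import product
from math import comb

def lower_bound(n, q):
    return comb(n + 1, q) * 2 ** (n - q)

def construction(n, q):
    """Strings with q 2s, together with strings with q-1 2s and an odd number of 1s."""
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(2) == q or (p.count(2) == q - 1 and p.count(1) % 2 == 1)}

def is_moser(points):
    """True if no geometric line of [3]^n lies inside `points` (a set of tuples)."""
    for mid in points:
        twos = [i for i, v in enumerate(mid) if v == 2]
        # a line with midpoint `mid` moves a non-empty set of its 2s to 1 and 3
        for signs in product((-1, 0, 1), repeat=len(twos)):
            if not any(signs):
                continue
            lo, hi = list(mid), list(mid)
            for i, s in zip(twos, signs):
                lo[i] -= s
                hi[i] += s
            if tuple(lo) in points and tuple(hi) in points:
                return False
    return True

for n in range(1, 6):
    q = round(n / 3)
    s = construction(n, q)
    print(n, q, len(s), lower_bound(n, q), is_moser(s))
```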
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
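<br />
As an illustration, the statistics of a set and of a slice can be computed as follows. This is a sketch with hypothetical helper names; points are written as strings of digits and slices as patterns such as "1*", and the functions do not check the Moser property.<br />
<br />
```python
from collections import Counter

def statistics(points, n):
    """Return (a, b, c, ...): the number of points with 0, 1, 2, ... coordinates equal to 2."""
    counts = Counter(p.count("2") for p in points)
    return tuple(counts[i] for i in range(n + 1))

def slice_statistics(points, pattern):
    """Statistics of the slice described by `pattern`; the fixed coordinates are dropped."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    inside = [p for p in points if all(ch in ("*", v) for ch, v in zip(pattern, p))]
    return statistics(["".join(p[i] for i in free) for p in inside], len(free))

A = ["11", "12", "22", "32"]
print(statistics(A, 2))           # (1, 2, 1)
print(slice_statistics(A, "1*"))  # (1, 1)
```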
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it is not pointwise dominated by any other attainable statistic (a',b',c',...) (i.e. one with <math>a' \geq a</math>, <math>b' \geq b</math>, etc.). We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points in <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
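<br />
Since <math>[3]^2</math> has only <math>2^9 = 512</math> subsets, the n=2 case can be verified exhaustively. The sketch below (helper names are ours) lists the largest Moser sets and the Pareto-optimal statistics; by the discussion above one expects the latter to be (4,0,0), (3,2,0), (2,4,0) and (2,2,1), of which (3,2,0) is not extremal.<br />
<br />
```python
from itertools import product

POINTS = list(product((1, 2, 3), repeat=2))

def is_moser(points):
    """True if `points` (a set of tuples) contains no geometric line."""
    for mid in points:
        twos = [i for i, v in enumerate(mid) if v == 2]
        for signs in product((-1, 0, 1), repeat=len(twos)):
            if not any(signs):
                continue
            lo, hi = list(mid), list(mid)
            for i, s in zip(twos, signs):
                lo[i] -= s
                hi[i] += s
            if tuple(lo) in points and tuple(hi) in points:
                return False
    return True

def statistics(points):
    return tuple(sum(1 for p in points if p.count(2) == j) for j in range(3))

moser_sets = []
for mask in range(1 << 9):
    s = {p for i, p in enumerate(POINTS) if (mask >> i) & 1}
    if is_moser(s):
        moser_sets.append(s)

attainable = {statistics(s) for s in moser_sets}
pareto = {s for s in attainable
          if not any(t != s and all(x >= y for x, y in zip(t, s)) for t in attainable)}
largest = max(len(s) for s in moser_sets)
extremisers = [s for s in moser_sets if len(s) == largest]
print(largest, len(extremisers), sorted(pareto))
```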
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c(22),c(2x),c(x2),c(xx)) = (1112),(0222),(0004), which are covered by these linear inequalities: <br />
*<math>c(22)+c(2x)+c(xx) \le 4</math>, <br />
*<math>c(22)+c(x2)+c(xx) \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
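<br />
The exhaustion over the <math>2^{13}</math> possibilities is small enough to repeat. The following sketch (our own code, not the original search) picks one point from each antipodal pair, adds 222, and counts how many of the resulting 14-point sets are Moser sets; by the claim above the count should be zero.<br />
<br />
```python
from itertools import product

def is_moser(points):
    """True if `points` (a set of tuples) contains no geometric line of [3]^n."""
    for mid in points:
        twos = [i for i, v in enumerate(mid) if v == 2]
        for signs in product((-1, 0, 1), repeat=len(twos)):
            if not any(signs):
                continue
            lo, hi = list(mid), list(mid)
            for i, s in zip(twos, signs):
                lo[i] -= s
                hi[i] += s
            if tuple(lo) in points and tuple(hi) in points:
                return False
    return True

centre = (2, 2, 2)
pairs, seen = [], set()
for p in product((1, 2, 3), repeat=3):
    q = tuple(4 - v for v in p)          # antipodal point
    if p != centre and p not in seen:
        pairs.append((p, q))
        seen.update((p, q))

good = 0
for choice in product((0, 1), repeat=len(pairs)):
    candidate = {centre} | {pair[i] for pair, i in zip(pairs, choice)}
    if is_moser(candidate):
        good += 1
print(len(pairs), good)
```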
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
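<br />
As a consistency check, one can confirm that each of these bounds holds on all the Pareto-optimal statistics listed above and is attained by at least one extremal statistic. This sketch only uses the data from this section and does not repeat the search over subsets.<br />
<br />
```python
PARETO = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0), (4, 4, 5, 0),
    (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0), (5, 4, 3, 0), (4, 9, 2, 0),
    (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0), (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0),
    (5, 9, 0, 0), (6, 6, 0, 0), (7, 3, 0, 0), (8, 0, 0, 0),
]
EXTREMAL = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]
# (coefficients of a, b, c, d) and right-hand side
BOUNDS = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56), ((6, 2, 3, 6), 48),
    ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14), ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8),
    ((0, 1, 0, 6), 12), ((0, 0, 1, 3), 6), ((0, 0, 0, 1), 1),
]

def value(coeffs, stat):
    return sum(c * s for c, s in zip(coeffs, stat))

report = []
for coeffs, bound in BOUNDS:
    holds = all(value(coeffs, s) <= bound for s in PARETO)
    tight = [s for s in EXTREMAL if value(coeffs, s) == bound]
    report.append((coeffs, bound, holds, tight))
    print(coeffs, bound, holds, tight)
```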
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that no two of the four "a" points can be at Hamming distance exactly 2 (their midpoint would complete a geometric line). Two "a" points whose numbers of 1s have the same parity are at Hamming distance 2 or 4, so each parity class contains at most two "a" points, and these must be antipodal. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of each of the forms xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
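<br />
The score computation in this proof can be reproduced from the list of n=3 extremal statistics given earlier. The sketch below uses exact fractions; since the score is linear with non-negative coefficients, it suffices to maximise over the extremal statistics.<br />
<br />
```python
from fractions import Fraction

EXTREMAL = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]

def score(stat):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice."""
    a, b, c, d = stat
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

best = max(EXTREMAL, key=score)
runners_up = [s for s in EXTREMAL if s != best and score(s) == score(best)]
a, b, c, d = best
# statistics of a 4D set all of whose eight side slices have statistics `best`
whole = (8 * a // 4, 8 * b // 3, 8 * c // 2, 8 * d, 0)
print(best, score(best), 8 * score(best), whole)
```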
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
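<br />
The line counts used in these double-counting arguments can be tabulated mechanically. In the sketch below (our own helper), a geometric line is classified by the number of 2s in its endpoints and in its midpoint, so that for n=4 the type (0,1) is an "a"-"b"-"a" line, (0,2) an "a"-"c"-"a" line and (1,2) a "b"-"c"-"b" line.<br />
<br />
```python
from collections import Counter
from itertools import product

def line_types(n):
    """Count geometric lines of [3]^n by (2s in each endpoint, 2s in the midpoint)."""
    types = Counter()
    for mid in product((1, 2, 3), repeat=n):
        twos = sum(1 for v in mid if v == 2)
        # choose which 2s of the midpoint are wildcards, and their orientation
        for signs in product((-1, 0, 1), repeat=twos):
            moved = sum(1 for s in signs if s)
            if moved:
                types[(twos - moved, twos)] += 1
    # each line was generated twice, once for each orientation
    return {t: c // 2 for t, c in types.items()}

counts4 = line_types(4)
print(counts4[(0, 1)], counts4[(0, 2)], counts4[(1, 2)], sum(counts4.values()))
```
<br />
The same function applies in other dimensions; for instance <code>line_types(5)[(1, 2)]</code> counts the lines through two points with one 2 and one point with two 2s in <math>[3]^5</math>.<br />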
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
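<br />
The lifting rule is mechanical, so the list can be regenerated from the n=3 bounds. A sketch, with a hypothetical helper <code>lift</code> that applies the rule and then divides out the common factor:<br />
<br />
```python
from functools import reduce
from math import gcd

# n=3 bounds: (coefficients of a, b, c, d) and right-hand side M
BOUNDS_3D = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56), ((6, 2, 3, 6), 48),
    ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14), ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8),
    ((0, 1, 0, 6), 12), ((0, 0, 1, 3), 6),
]

def lift(coeffs, bound):
    """Sum a 3D inequality over the eight side slices of [3]^4 and reduce it."""
    alpha, beta, gamma, delta = coeffs
    raw = (4 * alpha, 3 * beta, 2 * gamma, delta, 8 * bound)
    g = reduce(gcd, raw)
    return tuple(v // g for v in raw)

lifted = [lift(coeffs, bound) for coeffs, bound in BOUNDS_3D]
for a, b, c, d, rhs in lifted:
    print(f"{a}a + {b}b + {c}c + {d}d <= {rhs}")
```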
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the (a,b,c,...) statistics for the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
The c-statistics for the 3D cube are bounded by a set of 20 inequalities. By considering the fourteen ways a 3D cube sits in the 4D cube, the result is a set of 239 inequalities for the 4D cube's c-statistics. However, all 239 inequalities are satisfied by a 44-point solution given by c(xxxx) = c(2xxx) = c(22xx) = 4, and permutations.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (11111),(11333),(33311),(33331). Any two of these four points differ in at least three places, except for one pair of points that differ in one place (and the midpoint 33321 of that pair has an odd number of 1s, so it is not in the set).<br />
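<br />
A 124-point example of this shape can be verified by machine. The sketch below takes the four points with no 2s to be 11111, 11333, 33311, 33331; this particular choice is an assumption of the sketch (any four such points work provided no two are at Hamming distance exactly two, and any pair at distance one has a midpoint with an odd number of 1s).<br />
<br />
```python
from itertools import product

def is_moser(points):
    """True if `points` (a set of tuples) contains no geometric line of [3]^n."""
    for mid in points:
        twos = [i for i, v in enumerate(mid) if v == 2]
        for signs in product((-1, 0, 1), repeat=len(twos)):
            if not any(signs):
                continue
            lo, hi = list(mid), list(mid)
            for i, s in zip(twos, signs):
                lo[i] -= s
                hi[i] += s
            if tuple(lo) in points and tuple(hi) in points:
                return False
    return True

# assumed choice of the four points with no 2s
corners = {(1, 1, 1, 1, 1), (1, 1, 3, 3, 3), (3, 3, 3, 1, 1), (3, 3, 3, 3, 1)}
example = {p for p in product((1, 2, 3), repeat=5)
           if p.count(2) == 2 or (p.count(2) == 1 and p.count(1) % 2 == 0)}
example |= corners

stats = tuple(sum(1 for p in example if p.count(2) == j) for j in range(6))
print(len(example), stats, is_moser(example))
```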
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point Moser set. We will prove that <math>{\mathcal A}</math> cannot exist.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, and so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice, and suppose for contradiction that it has at least 42 points. There are two cases, depending on the value of c = c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43 points, leading to at most 41+43+40=124 points in all, a contradiction.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets, as these are the only 42-point or 43-point sets with c less than 17. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122 21212 21221 23322 23232 23223. Hence the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points, again a contradiction; but if it has 43 points, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
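<br />
The base case m=5 can also be confirmed by brute force over all <math>\binom{32}{5} = 201376</math> choices of five A-points. In this sketch (our own encoding) an A-point is a 5-bit integer whose i-th bit records whether the i-th coordinate is 3, so the Hamming distance is the number of 1 bits in the XOR.<br />
<br />
```python
from itertools import combinations

def distance_two_pairs(points):
    """Number of pairs of A-points at Hamming distance exactly two."""
    return sum(1 for x, y in combinations(points, 2) if bin(x ^ y).count("1") == 2)

# smallest number of distance-two pairs over all sets of five A-points
worst = min(distance_two_pairs(s) for s in combinations(range(32), 5))
print(worst)

# four A-points with no pair at distance two: 11111, 33333, 31111, 13333
print(distance_two_pairs([0b00000, 0b11111, 0b00001, 0b11110]))
```
<br />
The four-point example shows that the hypothesis <math>m \geq 5</math> cannot be weakened.<br />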
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
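'''Proof''': Suppose for contradiction that C=80; then D=0 and, by Lemma 3, <math>B \leq 40</math>. From Lemma 4 we also see that A cannot be 5 or more, since the midpoint of two A-points at Hamming distance two is a C-point. As E=F=0, this gives at most 4+40+80=124 points, a contradiction.<br />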
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1, which gives F=0) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
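<br />
For the 42-point and 43-point slices, the table of exceptions can be recomputed from the statistics listed in the n=4 section. The sketch below does this with exact fractions; the 41-point statistics are not reproduced on this page, so they are not covered here.<br />
<br />
```python
from fractions import Fraction

STATS_43 = [(5, 20, 18, 0, 0), (4, 16, 23, 0, 0), (3, 16, 24, 0, 0), (4, 15, 24, 0, 0)]
STATS_42 = [
    (6, 24, 12, 0, 0), (5, 20, 17, 0, 0), (5, 19, 18, 0, 0), (6, 16, 18, 2, 0),
    (4, 20, 18, 0, 0), (5, 17, 20, 0, 0), (5, 16, 21, 0, 0), (4, 17, 21, 0, 0),
    (4, 16, 22, 0, 0), (5, 15, 22, 0, 0), (4, 15, 23, 0, 0), (3, 16, 23, 0, 0),
    (4, 14, 24, 0, 0), (3, 15, 24, 0, 0), (2, 16, 24, 0, 0),
]

def score(stat):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a 4D side slice."""
    a, b, c, d, e = stat
    return a + Fraction(5 * b, 4) + Fraction(5 * c, 3) + Fraction(5 * d, 2) + 5 * e

# slices with score at least 62, in increasing order of score (types 1 to 4)
high = sorted((s for s in STATS_42 + STATS_43 if score(s) >= 62), key=score)
# unordered pairs of types whose scores add up to at least 125
pairs = [(i + 1, j + 1) for i in range(len(high)) for j in range(i, len(high))
         if score(high[i]) + score(high[j]) >= 125]
for s in high:
    print(s, sum(s), score(s))
print(pairs)
```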
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then all cuts have side slices exactly equal to 125, which by the above table implies that one is Type 1 and one is Type 4, in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that for each of those two coordinates the total score of the two side slices is at most 63 + 59 7/12 = 122 7/12. On the other hand, for each of the other three coordinates the two side slices have a net score of at most 63+63 = 126. This averages out to at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
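<br />
Written out, the average over the five choices of coordinate in the last step is<br />
<br />
:<math>\frac{2 \cdot \frac{1471}{12} + 3 \cdot 126}{5} = \frac{\frac{1471}{6} + 378}{5} = \frac{3739}{30} \approx 124.633 < 125,</math><br />
<br />
where <math>\frac{1471}{12} = 122 \frac{7}{12}</math>; this is incompatible with the total score <math>5 \times 125</math> required by Lemma 5.<br />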
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
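<br />
The case list can be recovered by enumerating the constraints established so far: A+B+C=125, <math>6 \leq A \leq 8</math>, <math>B \leq 40</math>, C equal to 78 or 79, and C=78 whenever <math>A \geq 7</math>. A short sketch:<br />
<br />
```python
candidates = [
    (A, B, C, 0, 0, 0)
    for A in range(6, 9)          # Corollary 6
    for B in range(0, 41)         # Lemma 9
    for C in (78, 79)             # Lemma 10
    if A + B + C == 125 and (A < 7 or C == 78)
]
print(candidates)
```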
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set (the case of 11332 is similar, using the pair (11132,11332)); then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of each of the forms xyz2w, xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
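<br />
The construction above can be checked by brute force in small dimensions. The following is a minimal sketch, not the method used on this page: the function names are hypothetical, q is passed explicitly, and a geometric line of <math>[4]^n</math> is assumed to be a sequence of four points in which every coordinate is constant, runs 1,2,3,4, or runs 4,3,2,1, with at least one coordinate not constant.<br />
<br />
```python
from itertools import product
from math import comb


def geometric_lines(n, k):
    """Yield each geometric line of [k]^n once, as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["up", "down"]
    for template in product(symbols, repeat=n):
        moving = [s for s in template if s in ("up", "down")]
        # A line needs at least one moving coordinate; requiring the first
        # moving coordinate to be "up" avoids listing each line twice.
        if not moving or moving[0] != "up":
            continue
        yield tuple(
            tuple(t + 1 if s == "up" else k - t if s == "down" else s
                  for s in template)
            for t in range(k)
        )


def moser4_construction(n, q):
    """Points of [4]^n with q entries in {2,3}, or with q-1 such entries
    and an odd number of 1s."""
    chosen = set()
    for p in product(range(1, 5), repeat=n):
        inner = sum(1 for x in p if x in (2, 3))
        if inner == q or (inner == q - 1 and p.count(1) % 2 == 1):
            chosen.add(p)
    return chosen


def is_line_free(points, n, k):
    """True if no geometric line of [k]^n lies entirely inside `points`."""
    return not any(all(p in points for p in line)
                   for line in geometric_lines(n, k))


n, q = 4, 2
example = moser4_construction(n, q)
print(len(example), comb(n, q) * 2 ** n + comb(n, q - 1) * 2 ** (n - 1))
print(is_line_free(example, n, 4))
```
<br />
The first printed line compares the size of the constructed set with the formula; the second reports whether the set is free of geometric lines.<br />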
<br />
For k=5 (values 1,2,3,4,5), let A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point. Then along the first three points of a geometric line, the quantity A+E+2(B+D)+3C forms a 3-term arithmetic progression. So for k=5 we obtain a lower bound for Moser's problem similar to the one for DHJ with k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-22T04:59:43Z<p>121.220.134.232: /* n=4 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
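<br />
As a sanity check, the construction behind this lower bound can be built explicitly and tested for geometric lines in small dimensions. This is a minimal sketch with hypothetical function names; it takes q = (n+1)//3 as the nearest integer to n/3, and treats a geometric line of <math>[3]^n</math> as three points in which each coordinate is constant, runs 1,2,3, or runs 3,2,1 (with at least one coordinate not constant).<br />
<br />
```python
from itertools import product
from math import comb


def geometric_lines(n, k=3):
    """Yield each geometric line of [k]^n once, as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["up", "down"]
    for template in product(symbols, repeat=n):
        moving = [s for s in template if s in ("up", "down")]
        # A line needs at least one moving coordinate; requiring the first
        # moving coordinate to be "up" avoids listing each line twice.
        if not moving or moving[0] != "up":
            continue
        yield tuple(
            tuple(t + 1 if s == "up" else k - t if s == "down" else s
                  for s in template)
            for t in range(k)
        )


def lower_bound(n):
    """The value binom(n+1, q) * 2^(n-q) with q the nearest integer to n/3."""
    q = (n + 1) // 3
    return comb(n + 1, q) * 2 ** (n - q)


def construction(n):
    """All strings with q 2s, plus those with q-1 2s and an odd number of 1s."""
    q = (n + 1) // 3
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(2) == q
            or (p.count(2) == q - 1 and p.count(1) % 2 == 1)}


def is_moser(points, n):
    """True if `points` contains no geometric line of [3]^n."""
    return not any(all(p in points for p in line)
                   for line in geometric_lines(n))


for n in range(1, 6):
    s = construction(n)
    print(n, len(s), lower_bound(n), is_moser(s, n))
```
<br />
Each printed row lists n, the size of the constructed set, the value of the formula, and whether the set is a Moser set.<br />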
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 23}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
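<br />
The statistics and slice statistics defined above are straightforward to compute. The following is a minimal sketch with hypothetical helper names, assuming points are written as strings over the digits 1, 2, 3 and a slice is written as a pattern in which * marks a free coordinate.<br />
<br />
```python
def statistics(points, n):
    """Return (a, b, c, ...): entry i counts the points with exactly i 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)


def slice_statistics(points, pattern):
    """Statistics of the slice given by `pattern` (e.g. '1*'), computed
    after dropping the fixed coordinates."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    inside = ["".join(p[i] for i in free)
              for p in points
              if all(ch == "*" or ch == x for ch, x in zip(pattern, p))]
    return statistics(inside, len(free))


A = {"11", "12", "22", "23"}
print(statistics(A, 2))
print(slice_statistics(A, "1*"))
```
<br />
For the set {11, 12, 22, 23} this prints (1, 2, 1) for the full statistics and (1, 1) for the 1* slice.<br />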
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it is not pointwise dominated by any other attainable statistic (a',b',c',...), i.e. one with <math>a' \geq a</math>, <math>b' \geq b</math>, etc. We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points in <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
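<br />
The grouping into c-statistics can be sketched in a few lines. This is an illustration only, with a hypothetical function name, assuming points are given as strings over the digits 1, 2, 3; applied to the whole cube <math>[3]^4</math> it recovers the group sizes listed above.<br />
<br />
```python
from collections import Counter
from itertools import product


def c_statistics(points):
    """Map each group word (2s kept, 1s and 3s replaced by 'x') to the
    number of given points that fall in that group."""
    return Counter("".join("2" if ch == "2" else "x" for ch in p)
                   for p in points)


# Applied to the full cube [3]^4, the counts are the sizes of the groups.
cube = ["".join(p) for p in product("123", repeat=4)]
sizes = c_statistics(cube)
print(len(sizes))
print(sizes["2222"], sizes["22x2"], sizes["x2x2"], sizes["2xxx"], sizes["xxxx"])
```
<br />
For a subset of the cube, the same function returns c(w) for every group w that the subset meets; groups it does not meet have count 0.<br />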
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
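<br />
For n=2 the case analysis above is small enough to confirm by enumerating all <math>2^9</math> subsets of <math>[3]^2</math>. The following is a minimal sketch with hypothetical function names, not the search used for the spreadsheets on this page; it lists the attainable statistics and filters out the Pareto-optimal ones.<br />
<br />
```python
from itertools import product


def geometric_lines(n, k=3):
    """Yield each geometric line of [k]^n once, as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["up", "down"]
    for template in product(symbols, repeat=n):
        moving = [s for s in template if s in ("up", "down")]
        # A line needs at least one moving coordinate; requiring the first
        # moving coordinate to be "up" avoids listing each line twice.
        if not moving or moving[0] != "up":
            continue
        yield tuple(
            tuple(t + 1 if s == "up" else k - t if s == "down" else s
                  for s in template)
            for t in range(k)
        )


def attainable_statistics(n):
    """Statistics of every Moser set in [3]^n (only feasible for n <= 2)."""
    points = list(product((1, 2, 3), repeat=n))
    lines = [frozenset(line) for line in geometric_lines(n)]
    found = set()
    for mask in range(1 << len(points)):
        chosen = {p for i, p in enumerate(points) if (mask >> i) & 1}
        if any(line <= chosen for line in lines):
            continue  # contains a geometric line, so not a Moser set
        stats = [0] * (n + 1)
        for p in chosen:
            stats[p.count(2)] += 1
        found.add(tuple(stats))
    return found


def pareto_optimal(stats):
    """Statistics that are not pointwise dominated by a different one."""
    return sorted(s for s in stats
                  if not any(t != s and all(x >= y for x, y in zip(t, s))
                             for t in stats))


stats2 = attainable_statistics(2)
print(max(sum(s) for s in stats2))
print(pareto_optimal(stats2))
```
<br />
The first line printed is the largest size of a Moser set in <math>[3]^2</math>, and the second is the list of Pareto-optimal statistics, which includes (3,2,0) in addition to the three extremal statistics.<br />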
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c(22),c(2x),c(x2),c(xx)) = (1112),(0222),(0004), which are covered by these linear inequalities: <br />
*<math>c(22)+c(2x)+c(xx) \le 4</math>, <br />
*<math>c(22)+c(x2)+c(xx) \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
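<br />
The exhaustion over <math>2^{13}</math> possibilities described above can be sketched as follows. This is an illustrative reimplementation with hypothetical names, not the original computation: it takes 222 together with one point from each of the 13 antipodal pairs and counts how many of the resulting 14-point sets contain no geometric line.<br />
<br />
```python
from itertools import product


def geometric_lines(n, k=3):
    """Yield each geometric line of [k]^n once, as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["up", "down"]
    for template in product(symbols, repeat=n):
        moving = [s for s in template if s in ("up", "down")]
        # A line needs at least one moving coordinate; requiring the first
        # moving coordinate to be "up" avoids listing each line twice.
        if not moving or moving[0] != "up":
            continue
        yield tuple(
            tuple(t + 1 if s == "up" else k - t if s == "down" else s
                  for s in template)
            for t in range(k)
        )


def antipodal_pairs():
    """The 13 pairs {p, 4-p} of points of [3]^3 other than the centre 222."""
    centre = (2, 2, 2)
    points = [p for p in product((1, 2, 3), repeat=3) if p != centre]
    return sorted({tuple(sorted((p, tuple(4 - x for x in p))))
                   for p in points})


def count_14_point_moser_sets_with_centre():
    """Number of line-free sets made of 222 plus one point per antipodal pair."""
    centre = (2, 2, 2)
    pairs = antipodal_pairs()
    lines = [frozenset(line) for line in geometric_lines(3)]
    count = 0
    for choice in product((0, 1), repeat=len(pairs)):
        chosen = {centre} | {pair[c] for pair, c in zip(pairs, choice)}
        if not any(line <= chosen for line in lines):
            count += 1
    return count


print(len(antipodal_pairs()), count_14_point_moser_sets_with_centre())
```
<br />
A count of zero agrees with the statement that no 14-point Moser set in <math>[3]^3</math> contains 222.<br />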
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are :<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
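<br />
The sharp linear bounds can be checked against the list of Pareto-optimal statistics given above. The sketch below only tests the data quoted on this page (it does not rederive either list); the names are hypothetical. Since every coefficient is non-negative, checking the Pareto-optimal statistics is enough to cover all attainable ones.<br />
<br />
```python
# Pareto-optimal statistics (a, b, c, d) for n=3, as listed above.
PARETO = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0),
    (4, 4, 5, 0), (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0),
    (5, 4, 3, 0), (4, 9, 2, 0), (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0),
    (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0), (5, 9, 0, 0), (6, 6, 0, 0),
    (7, 3, 0, 0), (8, 0, 0, 0),
]

# Each bound is (coefficients of a, b, c, d; right-hand side).
BOUNDS = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56),
    ((6, 2, 3, 6), 48), ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14),
    ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8), ((0, 1, 0, 6), 12),
    ((0, 0, 1, 3), 6), ((0, 0, 0, 1), 1),
]


def value(coeffs, stat):
    """Left-hand side of a linear bound evaluated at a statistic."""
    return sum(c * s for c, s in zip(coeffs, stat))


all_hold = all(value(c, s) <= m for c, m in BOUNDS for s in PARETO)
all_tight = all(any(value(c, s) == m for s in PARETO) for c, m in BOUNDS)
largest = max(sum(s) for s in PARETO)
print(all_hold, all_tight, largest)
```
<br />
Here all_hold says that every listed statistic obeys every bound, all_tight says that each bound is attained with equality by at least one listed statistic, and largest is the largest total a+b+c+d on the list.<br />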
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math>
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
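<br />
The claim about the largest slice score can be checked with exact arithmetic. The sketch below is illustrative only and uses hypothetical names; because the score a/4+b/3+c/2+d has positive coefficients, its maximum over attainable statistics is reached at a Pareto-optimal statistic, so it is enough to scan the n=3 Pareto-optimal list quoted earlier on this page.<br />
<br />
```python
from fractions import Fraction

# Pareto-optimal statistics (a, b, c, d) for n=3, as listed in the n=3 section.
PARETO = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0),
    (4, 4, 5, 0), (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0),
    (5, 4, 3, 0), (4, 9, 2, 0), (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0),
    (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0), (5, 9, 0, 0), (6, 6, 0, 0),
    (7, 3, 0, 0), (8, 0, 0, 0),
]


def slice_score(stat):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice, as an exact fraction."""
    a, b, c, d = stat
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d


best_score = max(slice_score(s) for s in PARETO)
best_stats = [s for s in PARETO if slice_score(s) == best_score]
print(best_score, best_stats, 8 * best_score)
```
<br />
The last printed value is eight times the best slice score, i.e. the largest size of a 4D set that the double-counting argument allows.<br />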
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3; in fact, apart from [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), every solution has d at most 2 (here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions]).
* c is at least 6; in fact, apart from [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), every solution has c at least 11.
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.
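<br />
The line counts used in these double-counting arguments can be obtained by enumerating the geometric lines and classifying each one by the types of its endpoints and midpoint. This is a minimal sketch with hypothetical names; a key such as "aab" stands for a line with two "a" endpoints and a "b" midpoint.<br />
<br />
```python
from collections import Counter
from itertools import product


def geometric_lines(n, k=3):
    """Yield each geometric line of [k]^n once, as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["up", "down"]
    for template in product(symbols, repeat=n):
        moving = [s for s in template if s in ("up", "down")]
        # A line needs at least one moving coordinate; requiring the first
        # moving coordinate to be "up" avoids listing each line twice.
        if not moving or moving[0] != "up":
            continue
        yield tuple(
            tuple(t + 1 if s == "up" else k - t if s == "down" else s
                  for s in template)
            for t in range(k)
        )


def line_types(n):
    """Count the geometric lines of [3]^n by (endpoint type, midpoint type)."""
    letters = "abcdefgh"
    types = Counter()
    for first, middle, last in geometric_lines(n):
        # Both endpoints contain the same number of 2s.
        types[letters[first.count(2)] * 2 + letters[middle.count(2)]] += 1
    return types


for key, count in sorted(line_types(4).items()):
    print(key, count)
```
<br />
Running line_types(5) in the same way gives the number of lines joining two "B" points to a "C" point in five dimensions, under the key "bbc".<br />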
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math>
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
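<br />
The passage from a three-dimensional inequality to its four-dimensional consequence is mechanical, and can be sketched as follows. The function name is hypothetical; it applies the rule <math>(\alpha,\beta,\gamma,\delta; M) \mapsto (4\alpha,3\beta,2\gamma,\delta; 8M)</math> stated above and then divides out the common factor.<br />
<br />
```python
from functools import reduce
from math import gcd


def lift_to_four_dimensions(coeffs, bound):
    """Lift alpha*a + beta*b + gamma*c + delta*d <= M from n=3 to n=4 by
    summing over the eight side slices, then reduce by the common factor."""
    alpha, beta, gamma, delta = coeffs
    lifted = (4 * alpha, 3 * beta, 2 * gamma, delta)
    rhs = 8 * bound
    g = reduce(gcd, lifted + (rhs,))
    return tuple(x // g for x in lifted), rhs // g


# The n=3 sharp bounds (coefficients of a, b, c, d; right-hand side),
# omitting d <= 1, whose lift is trivial.
BOUNDS_3D = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56),
    ((6, 2, 3, 6), 48), ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14),
    ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8), ((0, 1, 0, 6), 12),
    ((0, 0, 1, 3), 6),
]

for coeffs, bound in BOUNDS_3D:
    print(coeffs, bound, "->", *lift_to_four_dimensions(coeffs, bound))
```
<br />
Each printed row shows a three-dimensional bound followed by the coefficients and right-hand side of the lifted four-dimensional bound.<br />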
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the (a,b,c,...) statistics for the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
The c-statistics for the 3D cube are bounded by a set of 20 inequalities. By considering the fourteen ways a 3D cube sits in the 4D cube, the result is a set of 239 inequalities for the 4D cube's c-statistics. However, all 239 inequalities are satisfied by a 44-point solution given by c(xxxx) = c(2xxx) = c(22xx) = 4, and permutations.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+43+40=124 points in all.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Hence the *1*** and *3*** slices each have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
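<br />
The base case m=5 of Lemma 4 is small enough to confirm by brute force over all 5-element subsets of the 32 A-points. The following is a minimal sketch with hypothetical names; each A-point is encoded as a 5-bit integer (bit i set when coordinate i equals 3), so that Hamming distance becomes the number of set bits in an XOR.<br />
<br />
```python
from functools import lru_cache
from itertools import combinations

N = 5  # dimension; A-points are the 2^N strings of 1s and 3s

# DIST2[p][q] is True when the encoded points p and q differ in exactly
# two coordinates.
DIST2 = [[bin(p ^ q).count("1") == 2 for q in range(2 ** N)]
         for p in range(2 ** N)]


def pairs_at_distance_two(points):
    """Number of pairs in `points` at Hamming distance exactly two."""
    return sum(DIST2[p][q] for p, q in combinations(points, 2))


@lru_cache(maxsize=None)
def fewest_pairs(m):
    """Smallest number of distance-two pairs over all m-subsets of A-points."""
    return min(pairs_at_distance_two(s)
               for s in combinations(range(2 ** N), m))


print(fewest_pairs(4), fewest_pairs(5))
```
<br />
A value of zero for m=4 means that four A-points can avoid Hamming distance two altogether, while a positive value for m=5 is the base case of the lemma.<br />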
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
'''Proof''': Suppose for contradiction that C=80, then D=0 and <math>B \leq 40</math>. From Lemma 4 we also see that A cannot be 5 or more, leading to the contradiction.<br />
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, and 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
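<br />
The scores in the table, and the allowed combinations of types, can be recomputed with exact fractions. The sketch below is illustrative and uses hypothetical names; it only covers the 42-point and 43-point statistics listed in the n=4 section of this page (the 41-point statistics are in the linked spreadsheet and are not reproduced here).<br />
<br />
```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Statistics (a, b, c, d, e) of the 43-point and 42-point solutions,
# as listed in the n=4 section.
STATS_43 = [(5, 20, 18, 0, 0), (4, 16, 23, 0, 0), (3, 16, 24, 0, 0),
            (4, 15, 24, 0, 0)]
STATS_42 = [(6, 24, 12, 0, 0), (5, 20, 17, 0, 0), (5, 19, 18, 0, 0),
            (6, 16, 18, 2, 0), (4, 20, 18, 0, 0), (5, 17, 20, 0, 0),
            (5, 16, 21, 0, 0), (4, 17, 21, 0, 0), (4, 16, 22, 0, 0),
            (5, 15, 22, 0, 0), (4, 15, 23, 0, 0), (3, 16, 23, 0, 0),
            (4, 14, 24, 0, 0), (3, 15, 24, 0, 0), (2, 16, 24, 0, 0)]


def score(stat):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a 4D slice, as an exact fraction."""
    a, b, c, d, e = stat
    return a + Fraction(5 * b, 4) + Fraction(5 * c, 3) + Fraction(5 * d, 2) + 5 * e


# Statistics with score at least 62, in increasing order of score
# (so positions 1..4 correspond to Types 1..4).
high = sorted((s for s in STATS_42 + STATS_43 if score(s) >= 62), key=score)

# Unordered pairs of types whose scores add up to at least 125.
type_pairs = [(high.index(s) + 1, high.index(t) + 1)
              for s, t in combinations_with_replacement(high, 2)
              if score(s) + score(t) >= 125]

for s in high:
    print(s, score(s))
print(type_pairs)
```
<br />
The final line lists which pairs of types can occur as opposite side slices with total score at least 125.<br />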
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then by Lemma 5 every cut has side slices with total score exactly 125, which by the above table implies that one is of Type 1 and the other of Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that for each of those two coordinates the two side slices have total score at most 63 + 59 7/12 = 122 7/12. On the other hand, for each of the other three coordinates the two side slices have total score at most 63+63 = 126. Averaging over the five coordinates gives at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
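<br />
The averaging step in this proof can be reproduced with exact fractions. This is a small worked computation with hypothetical variable names, using only the numbers quoted in the proof.<br />
<br />
```python
from fractions import Fraction

# Upper bound on the total score of a pair of opposite side slices when
# one of them has d=1: 63 + 59 7/12.
with_d_point = Fraction(63) + Fraction(59) + Fraction(7, 12)

# Upper bound on the total score for each of the other three coordinates.
without_d_point = Fraction(63) + Fraction(63)

# Average over the five coordinate directions (two of the first kind,
# three of the second).
average = (2 * with_d_point + 3 * without_d_point) / 5
print(with_d_point, average, float(average), average < 125)
```
<br />
By Lemma 5 the five totals must add up to <math>5 \times 125</math>, so an average strictly below 125 is impossible.<br />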
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
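<br />
The short list of possible middle-slice statistics used to obtain Lemma 9 can be recovered by enumeration. The sketch below uses a hypothetical function name and assumes, as in the text, that c=d=e=0, that the slice has at least 39 points, and that <math>4a+b \leq 64</math> and <math>b \leq 32</math>.<br />
<br />
```python
def middle_slice_statistics(min_points=39):
    """All (a, b) with a <= 16, b <= 32, 4a + b <= 64 and a + b >= min_points."""
    return [(a, b)
            for a in range(17)
            for b in range(33)
            if 4 * a + b <= 64 and a + b >= min_points]


options = middle_slice_statistics()
print(options)
print(5 * max(a for a, b in options))  # bound on B from the five middle slices
```
<br />
Since each B-point is an "a" point of exactly one middle slice, five times the largest admissible a bounds B.<br />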
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
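<br />
The three candidates can be recovered by enumerating the constraints collected so far. This is a minimal sketch with a hypothetical function name, assuming a 125-point set with D=E=F=0, <math>6 \leq A \leq 8</math>, <math>B \leq 40</math>, C equal to 78 or 79, and C=78 whenever <math>A \geq 7</math>.<br />
<br />
```python
def candidate_statistics(total=125):
    """Statistics (A, B, C, 0, 0, 0) compatible with the constraints above."""
    candidates = []
    for A in range(6, 9):              # Corollary 6
        for C in (78, 79):             # Lemma 10
            B = total - A - C
            if B > 40:                 # Lemma 9
                continue
            if A >= 7 and C != 78:     # at least three distance-two pairs
                continue
            candidates.append((A, B, C, 0, 0, 0))
    return candidates


print(candidate_statistics())
```
<br />
The enumeration returns exactly the statistics in the list above.<br />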
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines connecting two B-points to one C-point (cf. Lemma 3). Counting the points of the set on each of these lines and summing over all the lines, every B-point is counted four times and every C-point twice, for a total of 4B+2C = 316. On the other hand, each of the 160 lines can contain at most two points, and two of these lines (namely 1112*, 111*2) contain none, so the total is at most <math>2 \times 158 = 316</math>. Thus all the other lines must contain exactly two points.<br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The number of points of the set on these lines, summed over all 160 lines (so that each B-point is counted four times and each C-point twice), is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
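<br />
As a quick sanity check of the second formula, here is a minimal Python sketch. The helper names <code>hamming</code> and <code>odd_ones_code</code> are introduced here for illustration only. It builds the points of <math>[1,3]^m</math> with an odd number of 1s and confirms that there are <math>2^{m-1}</math> of them and that any two differ in at least two places. It only checks that this particular choice is admissible.<br />
<br />
```python
from itertools import combinations, product


def hamming(p, q):
    """Number of coordinates in which p and q differ."""
    return sum(x != y for x, y in zip(p, q))


def odd_ones_code(m):
    """Points of {1,3}^m containing an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]


for m in range(1, 9):
    code = odd_ones_code(m)
    assert len(code) == 2 ** (m - 1)
    # Two points with the same parity of 1s cannot differ in exactly one place.
    assert all(hamming(p, q) >= 2 for p, q in combinations(code, 2))
```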
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5), let A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point. Then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So for k=5 we have a lower bound for Moser’s problem similar to that for DHJ with k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-22T04:43:07Z<p>121.220.134.232: /* n=2 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
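<br />
The construction and the count can be checked by brute force for small n. The following Python sketch is an illustration only; the helper names <code>geometric_lines</code>, <code>construction</code> and <code>is_moser</code> are assumptions of this sketch, not part of the project's code. It builds the set described above, verifies that it contains no geometric line, and compares its size with <math>\binom{n+1}{q} 2^{n-q}</math>.<br />
<br />
```python
from itertools import product
from math import comb


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a tuple of three points."""
    for tpl in product((1, 2, 3, "+", "-"), repeat=n):
        wild = [s for s in tpl if s in ("+", "-")]
        # A line needs a wildcard; fixing the first wildcard to "+" avoids
        # listing each line a second time in reverse order.
        if wild and wild[0] == "+":
            yield tuple(
                tuple({"+": t, "-": 4 - t}.get(s, s) for s in tpl)
                for t in (1, 2, 3)
            )


def construction(n, q):
    """Strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    return {
        p
        for p in product((1, 2, 3), repeat=n)
        if p.count(2) == q or (p.count(2) == q - 1 and p.count(1) % 2 == 1)
    }


def is_moser(points, n):
    """True if the set contains no geometric line of [3]^n."""
    return not any(all(p in points for p in line) for line in geometric_lines(n))


for n in range(1, 6):
    q = round(n / 3)  # nearest integer to n/3
    s = construction(n, q)
    assert len(s) == comb(n + 1, q) * 2 ** (n - q)
    assert is_moser(s, n)
```
<br />
For n=5 this uses q=2 and produces the 120-point set mentioned above.<br />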
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 23}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it is not pointwise dominated by any other attainable statistic (a',b',c',...), i.e. one with <math>a' \geq a</math>, <math>b' \geq b</math>, etc. We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points in <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
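<br />
The grouping can be made concrete with a few lines of Python. This is only a sketch; the function names <code>group_of</code> and <code>c_statistics</code> are made up for this illustration. It maps each point to its group label, confirms that <math>[3]^4</math> splits into 16 groups of sizes 1, 2, 4, 8, 16, and computes the c-statistics of an arbitrary set of points.<br />
<br />
```python
from collections import Counter
from itertools import product


def group_of(point):
    """Label of a point's group: keep the 2s, replace each 1 or 3 by 'x'."""
    return "".join("2" if t == 2 else "x" for t in point)


def c_statistics(points):
    """Map each group label w to c(w), the number of given points in that group."""
    return Counter(group_of(p) for p in points)


# Sizes of the groups of the full cube [3]^4: a group with j letters x
# contains 2^j points.
group_sizes = c_statistics(product((1, 2, 3), repeat=4))
assert len(group_sizes) == 16
assert all(size == 2 ** label.count("x") for label, size in group_sizes.items())
```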
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
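<br />
For n=2 everything above can be confirmed by exhausting the <math>2^9 = 512</math> subsets of <math>[3]^2</math>. The following Python sketch is an illustration under the definitions of the Notation section, with helper names chosen here. It collects the attainable statistics, extracts the Pareto-optimal ones, and checks the four sharp inequalities.<br />
<br />
```python
from itertools import product


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a tuple of three points."""
    for tpl in product((1, 2, 3, "+", "-"), repeat=n):
        wild = [s for s in tpl if s in ("+", "-")]
        if wild and wild[0] == "+":
            yield tuple(
                tuple({"+": t, "-": 4 - t}.get(s, s) for s in tpl)
                for t in (1, 2, 3)
            )


POINTS = list(product((1, 2, 3), repeat=2))
LINES = list(geometric_lines(2))

attainable = set()
for mask in range(1 << len(POINTS)):
    subset = {p for i, p in enumerate(POINTS) if mask >> i & 1}
    if any(all(p in subset for p in line) for line in LINES):
        continue  # contains a geometric line, so not a Moser set
    # statistics (a, b, c): number of points with 0, 1, 2 coordinates equal to 2
    attainable.add(tuple(sum(p.count(2) == k for p in subset) for k in range(3)))


def dominated(s):
    """True if some other attainable statistic is pointwise >= s."""
    return any(t != s and all(x >= y for x, y in zip(t, s)) for t in attainable)


pareto = sorted(s for s in attainable if not dominated(s))
assert all(
    2 * a + b + 2 * c <= 8 and b + 2 * c <= 4 and a + 2 * c <= 4 and c <= 1
    for a, b, c in attainable
)
```
<br />
The list <code>pareto</code> contains the three extremal statistics together with (3,2,0), in agreement with the discussion above.<br />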
<br />
The Pareto optimizers for c-statistics are (c(22),c(2x),c(x2),c(xx)) = (1112),(0222),(0004), which are covered by these linear inequalities: <br />
*<math>c(22)+c(2x)+c(xx) \le 4</math>, <br />
*<math>c(22)+c(x2)+c(xx) \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
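<br />
The exhaustion over the <math>2^{13}</math> possibilities is small enough to sketch directly. The Python code below is an illustrative reconstruction, not the original search program, and its helper names are assumptions. It picks one point from each antipodal pair in every possible way, adds 222, and counts how many of the resulting 14-point sets are free of geometric lines.<br />
<br />
```python
from itertools import product


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a tuple of three points."""
    for tpl in product((1, 2, 3, "+", "-"), repeat=n):
        wild = [s for s in tpl if s in ("+", "-")]
        if wild and wild[0] == "+":
            yield tuple(
                tuple({"+": t, "-": 4 - t}.get(s, s) for s in tpl)
                for t in (1, 2, 3)
            )


CENTRE = (2, 2, 2)
LINES = list(geometric_lines(3))
# The 26 points other than 222 form 13 antipodal pairs (p, 4-p).
PAIRS = [
    (p, tuple(4 - t for t in p))
    for p in product((1, 2, 3), repeat=3)
    if p < tuple(4 - t for t in p)
]
assert len(PAIRS) == 13


def is_moser(points):
    """True if the set contains no geometric line of [3]^3."""
    return not any(all(p in points for p in line) for line in LINES)


# A 14-point Moser set containing 222 needs exactly one point per pair.
survivors = sum(
    is_moser({CENTRE} | {pair[c] for pair, c in zip(PAIRS, choice)})
    for choice in product((0, 1), repeat=13)
)
print(survivors)
```
<br />
If the claim above is correct, no candidate survives, i.e. the count printed is zero.<br />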
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are :<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
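<br />
Passing from an inequality in (a,b,c,...) to normalized notation is mechanical: the i-th coefficient is multiplied by <math>\binom{n}{i} 2^{n-i}</math>, the number of points of <math>[3]^n</math> with i 2s, and common factors are then cancelled. A small Python sketch of this conversion follows; the function name <code>normalize</code> is chosen here for illustration.<br />
<br />
```python
from functools import reduce
from math import comb, gcd


def normalize(coeffs, bound):
    """Rewrite sum(coeffs[i] * count_i) <= bound in terms of the alpha_i.

    coeffs has n+1 entries, one for each possible number of 2s in [3]^n.
    Returns the integer coefficients of the alpha_i and the new bound,
    with common factors removed.
    """
    n = len(coeffs) - 1
    scaled = [k * comb(n, i) * 2 ** (n - i) for i, k in enumerate(coeffs)]
    g = reduce(gcd, scaled + [bound])
    return [s // g for s in scaled], bound // g


# 2a+b+2c+4d <= 22 becomes 8*alpha_0 + 6*alpha_1 + 6*alpha_2 + 2*alpha_3 <= 11
assert normalize([2, 1, 2, 4], 22) == ([8, 6, 6, 2], 11)
```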
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that no two of the four "a" points can be at Hamming distance exactly 2 (their midpoint would be a "c" point). We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of each of the forms xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
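<br />
The score computation in this proof can be replayed with exact fractions. The sketch below takes the list of n=3 extremal statistics from the previous section as given; the variable and function names are assumptions of this illustration.<br />
<br />
```python
from fractions import Fraction

# Extremal statistics (a, b, c, d) for n=3, copied from the n=3 section.
EXTREMALS = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]


def slice_score(a, b, c, d):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice."""
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d


scores = {s: slice_score(*s) for s in EXTREMALS}
best = max(scores.values())
maximisers = [s for s, value in scores.items() if value == best]

# The maximum 11/2 = 44/8 is attained only at (2,6,6,0), so eight side
# slices give at most 8 * 11/2 = 44 points.
assert best == Fraction(11, 2) and maximisers == [(2, 6, 6, 0)]
```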
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), d is at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), c is at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
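<br />
The line counts used in these double-counting arguments can be obtained by enumerating the geometric lines of <math>[3]^4</math> and sorting them by the number of 2s at an endpoint and at the midpoint. The following Python sketch does this; the helper <code>geometric_lines</code> is an assumption of this illustration.<br />
<br />
```python
from collections import Counter
from itertools import product


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a tuple of three points."""
    for tpl in product((1, 2, 3, "+", "-"), repeat=n):
        wild = [s for s in tpl if s in ("+", "-")]
        if wild and wild[0] == "+":
            yield tuple(
                tuple({"+": t, "-": 4 - t}.get(s, s) for s in tpl)
                for t in (1, 2, 3)
            )


# Key: (number of 2s in an endpoint, number of 2s in the midpoint).
kinds = Counter(
    (line[0].count(2), line[1].count(2)) for line in geometric_lines(4)
)

assert kinds[(0, 1)] == 32  # two "a" points and a "b" point
assert kinds[(0, 2)] == 48  # two "a" points and a "c" point
assert kinds[(1, 2)] == 48  # two "b" points and a "c" point
```
<br />
Since a Moser set meets each line in at most two points, a family of L lines in which every endpoint-type point lies on p lines and every midpoint-type point lies on q lines yields an inequality with coefficients p and q and right-hand side 2L.<br />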
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
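<br />
The lifting rule stated before this list is easy to mechanise. Here is a hedged Python sketch; the function name <code>lift_to_4d</code> is invented for this illustration. It multiplies the coefficients by 4, 3, 2, 1, multiplies the bound by 8, and cancels common factors.<br />
<br />
```python
from functools import reduce
from math import gcd


def lift_to_4d(coeffs, bound):
    """Lift an n=3 inequality coeffs.(a,b,c,d) <= bound to n=4.

    Each 4D point with i 2s (i = 0..3) lies in 4 - i of the eight side
    slices, so summing the 3D inequality over all side slices gives
    4*alpha*a + 3*beta*b + 2*gamma*c + delta*d <= 8*bound.
    """
    lifted = [(4 - i) * k for i, k in enumerate(coeffs)]
    total = 8 * bound
    g = reduce(gcd, lifted + [total])
    return [x // g for x in lifted], total // g


# 2a+b+2c+4d <= 22 in three dimensions gives 8a+3b+4c+4d <= 176 in four.
assert lift_to_4d([2, 1, 2, 4], 22) == ([8, 3, 4, 4], 176)
```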
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
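<br />
The distribution of 2s in a diagonal cube can be counted directly. A minimal Python sketch follows; the function name <code>diagonal_cube</code> and its <code>flip</code> parameter are assumptions made for this illustration.<br />
<br />
```python
from collections import Counter
from itertools import product


def diagonal_cube(flip=False):
    """Points xxyz of [3]^4; with flip=True the pair runs over 13, 22, 31."""
    return [
        (x, 4 - x if flip else x, y, z)
        for x, y, z in product((1, 2, 3), repeat=3)
    ]


def two_profile(points):
    """Tuple (a, b, c, d, e): how many points have 0, 1, 2, 3, 4 coordinates equal to 2."""
    counts = Counter(p.count(2) for p in points)
    return tuple(counts[k] for k in range(5))


assert two_profile(diagonal_cube()) == (8, 8, 6, 4, 1)
assert two_profile(diagonal_cube(flip=True)) == (8, 8, 6, 4, 1)
```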
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(33333). No two of these four points differ in exactly two places, and the only pair differing in one place, (13111),(13113), has midpoint 13112, which has an odd number of 1s and so is not in the set.<br />
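<br />
A set of this shape can be verified by brute force over the 1441 geometric lines of <math>[3]^5</math>. In the Python sketch below, the four corner points are an assumption of the sketch: they are chosen so that no two of them differ in exactly two places and so that the one pair differing in a single place has a midpoint with an odd number of 1s. The helper names are likewise illustrative.<br />
<br />
```python
from itertools import product


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a tuple of three points."""
    for tpl in product((1, 2, 3, "+", "-"), repeat=n):
        wild = [s for s in tpl if s in ("+", "-")]
        if wild and wild[0] == "+":
            yield tuple(
                tuple({"+": t, "-": 4 - t}.get(s, s) for s in tpl)
                for t in (1, 2, 3)
            )


# Assumed choice of four points with no 2s (see the lead-in above).
CORNERS = {(1, 3, 1, 1, 1), (1, 3, 1, 1, 3), (3, 1, 3, 1, 1), (3, 3, 3, 3, 3)}

moser_124 = CORNERS | {
    p
    for p in product((1, 2, 3), repeat=5)
    if p.count(2) == 2 or (p.count(2) == 1 and p.count(1) % 2 == 0)
}

statistics = tuple(sum(p.count(2) == k for p in moser_124) for k in range(6))
line_free = not any(
    all(p in moser_124 for p in line) for line in geometric_lines(5)
)

assert statistics == (4, 40, 80, 0, 0, 0)
assert line_free
```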
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+43+40=124 points in all.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122 21212 21221 23322 23232 23223. In particular, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. Hence at most two of the points have an odd number of 1s, a contradiction. <math>\Box</math><br />
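<br />
The base case m=5 can also be confirmed exhaustively, since there are only <math>\binom{32}{5} = 201376</math> choices of five A-points. In the Python sketch below an A-point is encoded as a 5-bit integer (bit i set when coordinate i equals 3); this encoding and the function name are choices made for the illustration.<br />
<br />
```python
from itertools import combinations

# DIST2[x][y] is True when the A-points x and y differ in exactly two places.
DIST2 = [[bin(x ^ y).count("1") == 2 for y in range(32)] for x in range(32)]


def has_pair_at_distance_two(points):
    """True if some two of the given A-points differ in exactly two places."""
    return any(DIST2[x][y] for x, y in combinations(points, 2))


# m = 5: every choice of five A-points contains such a pair ...
lemma4_base_case = all(
    has_pair_at_distance_two(sub) for sub in combinations(range(32), 5)
)
# ... while four points can avoid one, e.g. 11111, 33331, 11113, 33333.
four_point_example = (0b00000, 0b01111, 0b10000, 0b11111)

assert lemma4_base_case
assert not has_pair_at_distance_two(four_point_example)
```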
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
'''Proof''': Suppose for contradiction that C=80, then D=0 and <math>B \leq 40</math>. From Lemma 4 we also see that A cannot be 5 or more, leading to the contradiction.<br />
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
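<br />
The four scores, and the question of which pairs of types can reach a combined score of 125, can be checked with exact arithmetic. A short Python sketch follows; the dictionary of types simply copies the list above, and the names are illustrative.<br />
<br />
```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Slice types 1-4 from the list above, as statistics (a, b, c, d, e).
SLICE_TYPES = {
    1: (2, 16, 24, 0, 0),
    2: (4, 16, 23, 0, 0),
    3: (4, 15, 24, 0, 0),
    4: (3, 16, 24, 0, 0),
}


def score(a, b, c, d, e):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a four-dimensional slice."""
    return a + Fraction(5, 4) * b + Fraction(5, 3) * c + Fraction(5, 2) * d + 5 * e


scores = {k: score(*v) for k, v in SLICE_TYPES.items()}
# Unordered pairs of types whose scores add up to at least 125.
allowed_pairs = [
    (i, j)
    for i, j in combinations_with_replacement(SLICE_TYPES, 2)
    if scores[i] + scores[j] >= 125
]

assert scores == {1: 62, 2: Fraction(187, 3), 3: Fraction(251, 4), 4: 63}
assert allowed_pairs == [(1, 4), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]
```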
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then (as the ten side-slice scores sum to 625 by Lemma 5) every cut has side slices whose scores add up to exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score from each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, each of the other three coordinates has a net score of at most 63+63 = 126. This averages out to at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
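<br />
The averaging step at the end of this proof is a one-line computation; the following Python sketch spells it out with exact fractions (the variable names are chosen here for illustration).<br />
<br />
```python
from fractions import Fraction

best_slice = Fraction(63)                      # largest score of any side slice
slice_with_d = Fraction(59) + Fraction(7, 12)  # bound for a slice with d = 1

cut_with_d = best_slice + slice_with_d  # two of the five cuts: 122 7/12
cut_without_d = 2 * best_slice          # the other three cuts: 126

average = (2 * cut_with_d + 3 * cut_without_d) / 5

# 3739/30 is about 124.633, below the average of 125 required by Lemma 5.
assert average == Fraction(3739, 30) and average < 125
```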
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set, since each is the midpoint of two of these four A-points.<br />
<br />
At most one of 11312, 11332 lie in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set, then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
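<br />
For small n the construction can be checked by brute force. The following is a sketch under stated assumptions: geometric lines of <math>[4]^n</math> are generated from templates with a forward wildcard "x" and a reversed wildcard "y", and the names <code>geometric_lines</code> and <code>moser4_set</code> are illustrative only.<br />
<br />
```python
from itertools import product
from math import comb


def geometric_lines(n, k):
    """Yield each geometric line of [k]^n once, as a list of k points."""
    symbols = list(range(1, k + 1)) + ["x", "y"]
    for template in product(symbols, repeat=n):
        wild = [s for s in template if s in ("x", "y")]
        # Skip constant templates; swapping x and y reverses a line, so keep
        # only the templates whose first wildcard is "x".
        if not wild or wild[0] == "y":
            continue
        yield [
            tuple(t if s == "x" else k + 1 - t if s == "y" else s for s in template)
            for t in range(1, k + 1)
        ]


def moser4_set(n, q):
    """q entries in {2,3}; or q-1 entries in {2,3} and an odd number of 1s."""
    chosen = set()
    for p in product(range(1, 5), repeat=n):
        middle = sum(v in (2, 3) for v in p)
        if middle == q or (middle == q - 1 and p.count(1) % 2 == 1):
            chosen.add(p)
    return chosen


def formula(n, q):
    return comb(n, q) * 2 ** n + comb(n, q - 1) * 2 ** (n - 1)


results = {}
for n in (2, 3, 4):
    for q in range(1, n + 1):
        pts = moser4_set(n, q)
        line_free = not any(all(p in pts for p in line) for line in geometric_lines(n, 4))
        results[(n, q)] = (len(pts), line_free)
```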
<br />
For k=5 (values 1,2,3,4,5): if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point, then along the first three points of a geometric line the quantity A+E+2(B+D)+3C forms a 3-term arithmetic progression. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-22T04:41:34Z<p>121.220.134.232: /* Notation */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
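<br />
This construction is easy to test by machine for small n. The sketch below assumes points are encoded as strings over the digits 1, 2, 3; the helper names are illustrative. It uses the fact that two points are the endpoints of a geometric line exactly when every coordinate in which they differ pairs a 1 with a 3, the midpoint then having a 2 in those coordinates.<br />
<br />
```python
from itertools import combinations, product
from math import comb


def contains_geometric_line(points):
    """True if the given strings over '123' include all three points of a geometric line."""
    pts = set(points)
    for p, r in combinations(sorted(pts), 2):
        if all(x == y or {x, y} == {"1", "3"} for x, y in zip(p, r)):
            midpoint = "".join(x if x == y else "2" for x, y in zip(p, r))
            if midpoint in pts:
                return True
    return False


def lower_bound_set(n):
    """Strings with q 2s, together with strings with q-1 2s and an odd number of 1s."""
    q = (2 * n + 3) // 6  # nearest integer to n/3
    chosen = []
    for s in map("".join, product("123", repeat=n)):
        twos, ones = s.count("2"), s.count("1")
        if twos == q or (twos == q - 1 and ones % 2 == 1):
            chosen.append(s)
    return q, chosen


sizes = {}
for n in range(1, 6):
    q, chosen = lower_bound_set(n)
    assert not contains_geometric_line(chosen)
    assert len(chosen) == comb(n + 1, q) * 2 ** (n - q)
    sizes[n] = len(chosen)

print(sizes)
```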
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
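<br />
The bookkeeping in this definition can be sketched as follows, assuming points are stored as strings over 1, 2, 3 and a slice is written as a pattern with * for the free coordinates. The function names are illustrative.<br />
<br />
```python
def statistics(points, n):
    """Return (a, b, c, ...): the number of points with 0, 1, 2, ... coordinates equal to 2."""
    counts = [0] * (n + 1)
    for p in points:
        counts[p.count("2")] += 1
    return tuple(counts)


def slice_statistics(points, pattern):
    """Statistics of the slice given by pattern, after dropping the fixed coordinates."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    inside = [p for p in points if all(ch in ("*", x) for ch, x in zip(pattern, p))]
    reduced = ["".join(p[i] for i in free) for p in inside]
    return statistics(reduced, len(free))


A = ["11", "12", "22", "32"]
print(statistics(A, 2), slice_statistics(A, "1*"))
```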
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it is not pointwise dominated by any other attainable statistic (a',b',c',...), i.e. one with <math>a' \geq a</math>, <math>b' \geq b</math>, etc. We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points in <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
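<br />
Since <math>[3]^2</math> has only <math>2^9 = 512</math> subsets, the Pareto-optimal statistics discussed above can be confirmed by exhaustive search. This is a sketch only, with illustrative helper names; it lists the Pareto-optimal statistics and does not test for extremality.<br />
<br />
```python
from itertools import combinations, product


def contains_geometric_line(points):
    """True if the given strings over '123' include all three points of a geometric line."""
    pts = set(points)
    for p, r in combinations(sorted(pts), 2):
        # Endpoints of a line differ only in coordinates that pair a 1 with a 3.
        if all(x == y or {x, y} == {"1", "3"} for x, y in zip(p, r)):
            if "".join(x if x == y else "2" for x, y in zip(p, r)) in pts:
                return True
    return False


square = ["".join(p) for p in product("123", repeat=2)]

attainable = set()
for mask in range(1 << len(square)):
    subset = [p for i, p in enumerate(square) if mask >> i & 1]
    if not contains_geometric_line(subset):
        counts = [0, 0, 0]
        for p in subset:
            counts[p.count("2")] += 1
        attainable.add(tuple(counts))


def dominated(s):
    return any(t != s and all(u >= v for u, v in zip(t, s)) for t in attainable)


pareto_optimal = sorted(s for s in attainable if not dominated(s))
print(pareto_optimal)
```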
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
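<br />
The exhaustion over <math>2^{13}</math> possibilities is small enough to reproduce directly. In the sketch below (helper names are illustrative) a candidate 14-point set consists of 222 together with one point chosen from each of the 13 antipodal pairs, and we count how many candidates are free of geometric lines.<br />
<br />
```python
from itertools import combinations, product


def contains_geometric_line(points):
    """True if the given strings over '123' include all three points of a geometric line."""
    pts = set(points)
    for p, r in combinations(sorted(pts), 2):
        if all(x == y or {x, y} == {"1", "3"} for x, y in zip(p, r)):
            if "".join(x if x == y else "2" for x, y in zip(p, r)) in pts:
                return True
    return False


def antipode(p):
    return "".join(str(4 - int(ch)) for ch in p)


cube = ["".join(p) for p in product("123", repeat=3)]
pairs = sorted({tuple(sorted((p, antipode(p)))) for p in cube if p != "222"})

fourteen_point_moser_sets = 0
for choice in product((0, 1), repeat=len(pairs)):
    candidate = ["222"] + [pair[c] for pair, c in zip(pairs, choice)]
    if not contains_geometric_line(candidate):
        fourteen_point_moser_sets += 1

print(len(pairs), fourteen_point_moser_sets)
```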
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
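<br />
The passage from raw to normalized notation is mechanical: substitute <math>a = 2^n \alpha_0</math>, <math>b = n 2^{n-1} \alpha_1</math>, and so on, then divide out the common factor. A minimal sketch, with an illustrative function name:<br />
<br />
```python
from functools import reduce
from math import comb, gcd


def normalize(coeffs, bound):
    """Rewrite sum(coeffs[i] * count_i) <= bound in terms of the normalized alpha_i."""
    n = len(coeffs) - 1
    # There are C(n, i) * 2^(n-i) points of [3]^n with exactly i coordinates equal to 2.
    scaled = [c * comb(n, i) * 2 ** (n - i) for i, c in enumerate(coeffs)]
    g = reduce(gcd, scaled + [bound])
    return [s // g for s in scaled], bound // g


# 2a + b + 2c + 4d <= 22 in three dimensions
print(normalize([2, 1, 2, 4], 22))
```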
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
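<br />
The claim about the largest score can be checked against the extremal statistics listed in the n=3 section. A small sketch using exact fractions (the names are illustrative):<br />
<br />
```python
from fractions import Fraction

# Extremal statistics (a, b, c, d) for n = 3, as listed in the n=3 section.
EXTREMAL_N3 = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]


def slice_score(a, b, c, d):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice."""
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d


best = max(slice_score(*s) for s in EXTREMAL_N3)
maximisers = [s for s in EXTREMAL_N3 if slice_score(*s) == best]
print(best, maximisers, 8 * best)
```

Eight slices of score 11/2 would give exactly 44 points, which is the case the proof rules out.<br />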
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), is at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions]).<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), is at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
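<br />
The line counts used in these double-counting arguments can be tallied by enumerating all geometric lines of <math>[3]^4</math>. In this sketch (names are illustrative) a line is identified by its pair of endpoints, and lines are grouped by the number of 2s in the endpoints and in the midpoint.<br />
<br />
```python
from collections import Counter
from itertools import combinations, product


def line_types(n):
    """Count geometric lines of [3]^n by (number of 2s in an endpoint, number of 2s in the midpoint)."""
    points = ["".join(p) for p in product("123", repeat=n)]
    counts = Counter()
    for p, r in combinations(points, 2):
        # p and r are the endpoints of a line iff they differ only by swapping 1s and 3s.
        if all(x == y or {x, y} == {"1", "3"} for x, y in zip(p, r)):
            midpoint = "".join(x if x == y else "2" for x, y in zip(p, r))
            counts[(p.count("2"), midpoint.count("2"))] += 1
    return counts


counts = line_types(4)
print(counts[(0, 1)], counts[(0, 2)], counts[(1, 2)])
```

Here <code>counts[(0, 1)]</code> counts lines through two "a" points and a "b" point, <code>counts[(0, 2)]</code> those through two "a" points and a "c" point, and <code>counts[(1, 2)]</code> those through two "b" points and a "c" point.<br />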
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
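<br />
The lifting rule is simple enough to automate. The sketch below (with an illustrative function name) applies the substitution <math>(\alpha,\beta,\gamma,\delta,M) \mapsto (4\alpha,3\beta,2\gamma,\delta,8M)</math> and then cancels the common factor.<br />
<br />
```python
from functools import reduce
from math import gcd


def lift_to_4d(alpha, beta, gamma, delta, bound):
    """Lift alpha*a + beta*b + gamma*c + delta*d <= bound from [3]^3 to [3]^4."""
    coeffs = [4 * alpha, 3 * beta, 2 * gamma, delta]
    rhs = 8 * bound
    g = reduce(gcd, coeffs + [rhs])
    return tuple(c // g for c in coeffs), rhs // g


for inequality in [(2, 1, 2, 4, 22), (3, 2, 3, 6, 36), (1, 0, 2, 4, 14), (0, 0, 1, 3, 6)]:
    print(inequality, "->", lift_to_4d(*inequality))
```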
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to only 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43 points, leading to at most 41+43+40=124 points in all, a contradiction.<br />
<br />
Now suppose c is less than 17, and that the middle slice has at least 42 points; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Hence the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
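<br />
The base case m=5 can also be confirmed by brute force over all <math>\binom{32}{5} = 201376</math> choices of five A-points. In this sketch each A-point is encoded as a 5-bit integer (a set bit marks a coordinate equal to 3); the encoding and names are illustrative.<br />
<br />
```python
from itertools import combinations

# close[i][j] is True when A-points i and j differ in exactly two coordinates.
close = [[bin(i ^ j).count("1") == 2 for j in range(32)] for i in range(32)]


def has_distance_two_pair(subset):
    return any(close[i][j] for i, j in combinations(subset, 2))


every_five_has_pair = all(has_distance_two_pair(s) for s in combinations(range(32), 5))

# Four points can still avoid distance two: 11111, 31111, 13333, 33333.
four_point_example = (0b00000, 0b00001, 0b11110, 0b11111)
print(every_five_has_pair, has_distance_two_pair(four_point_example))
```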
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
'''Proof''': Suppose for contradiction that C=80, then D=0 and <math>B \leq 40</math>. From Lemma 4 we also see that A cannot be 5 or more, leading to the contradiction.<br />
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
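<br />
The scores in the list, and the compatibility of the types, can be recomputed from the 42-point and 43-point statistics given in the n=4 section. This sketch covers only those statistics (the 41-point ones are in the linked spreadsheet and are not reproduced here); names are illustrative.<br />
<br />
```python
from fractions import Fraction
from itertools import combinations_with_replacement

STATS_43 = [(5, 20, 18, 0, 0), (4, 16, 23, 0, 0), (3, 16, 24, 0, 0), (4, 15, 24, 0, 0)]
STATS_42 = [
    (6, 24, 12, 0, 0), (5, 20, 17, 0, 0), (5, 19, 18, 0, 0), (6, 16, 18, 2, 0),
    (4, 20, 18, 0, 0), (5, 17, 20, 0, 0), (5, 16, 21, 0, 0), (4, 17, 21, 0, 0),
    (4, 16, 22, 0, 0), (5, 15, 22, 0, 0), (4, 15, 23, 0, 0), (3, 16, 23, 0, 0),
    (4, 14, 24, 0, 0), (3, 15, 24, 0, 0), (2, 16, 24, 0, 0),
]


def score(a, b, c, d, e):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a four-dimensional slice."""
    return a + Fraction(5, 4) * b + Fraction(5, 3) * c + Fraction(5, 2) * d + 5 * e


# Statistics with score at least 62, ordered by score (Types 1 to 4).
high = sorted((score(*s), s) for s in STATS_42 + STATS_43 if score(*s) >= 62)
types = [s for _, s in high]

# Pairs of types whose scores add up to at least 125.
compatible = [
    (s, t) for s, t in combinations_with_replacement(types, 2)
    if score(*s) + score(*t) >= 125
]
print(high)
```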
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the two side slices have total score exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score of the two side slices in each of those two cuts is at most 63 + 59 7/12 = 122 7/12. On the other hand, in each of the other three cuts the side slices have a net score of at most 63+63 = 126. This averages out to at most 124.633... < 125, a contradiction. <math>\Box</math><br />
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
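<br />
The three admissible middle-slice statistics quoted before Lemma 9 come from a short enumeration: with c=d=e=0 a middle slice has a+b points, subject to <math>4a+b \leq 64</math> and <math>b \leq 32</math>. A minimal sketch (variable names are illustrative):<br />
<br />
```python
# (a, b) for a middle slice with c = d = e = 0 and at least 39 points.
admissible = [
    (a, b)
    for a in range(17)      # a ranges over 0..16
    for b in range(33)      # b <= 32
    if 4 * a + b <= 64 and a + b >= 39
]
print(admissible)
```

In every admissible case a is at most 8, and summing over the five middle slices gives the bound on B.<br />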
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose, say, that 11312 lies outside the set. Then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form xyzw2 with x,y,z,w = 1,3. Similarly there are at most 8 points of each of the forms xyz2w, xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines connecting two B-points and one C-point (cf. Lemma 3). Summing the number of points of the set over all of these lines (counting multiplicity) gives 4B+2C = 316. On the other hand, each of the 160 lines can contain at most two points of the set, and two of these lines (namely 1112*, 111*2) contain none, so the sum is at most 2 × 158 = 316. Thus all the other lines must contain exactly two points of the set.<br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines connecting two B-points and one C-point (cf. Lemma 3). Summing the number of points of the set over all of these lines (counting multiplicity) gives 4B+2C = 312. Each of the 160 lines can contain at most two points of the set, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) contain none, so the sum is at most 2 × 156 = 312. Thus all other lines must contain exactly two points of the set.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
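<br />
The second bullet can be checked numerically for small m. The sketch below (illustrative only; the helper names are ours) builds the points of <math>[1,3]^m</math> with an odd number of 1s and confirms that there are <math>2^{m-1}</math> of them and that any two differ in at least two places.<br />
<br />
```python
from itertools import product, combinations

def odd_ones_code(m):
    """Points with entries in {1, 3} and an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]

def min_distance(points):
    """Smallest Hamming distance between two distinct points."""
    return min(sum(x != y for x, y in zip(p, q))
               for p, q in combinations(points, 2))

for m in range(2, 7):
    code = odd_ones_code(m)
    print(m, len(code), min_distance(code))
```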
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
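<br />
For small n the construction can be tested directly. The following sketch (our own illustration, not the method used on this page) builds the set for n=4, q=2, compares its size with the formula, and checks by brute force that it contains no geometric line of <math>[4]^4</math>. Here a geometric line is generated from a template in which 'x' runs through 1,2,3,4 and 'X' runs through 4,3,2,1.<br />
<br />
```python
from itertools import product
from math import comb

def lower_bound_set(n, q):
    """Points of [4]^n with q entries in {2,3}, plus those with q-1
    such entries and an odd number of 1s."""
    chosen = set()
    for p in product((1, 2, 3, 4), repeat=n):
        middle = sum(1 for x in p if x in (2, 3))
        if middle == q or (middle == q - 1 and p.count(1) % 2 == 1):
            chosen.add(p)
    return chosen

def geometric_lines(n, k=4):
    """Yield every geometric line of [k]^n once, as a tuple of k points."""
    symbols = list(range(1, k + 1)) + ["x", "X"]
    for template in product(symbols, repeat=n):
        wild = [s for s in template if s in ("x", "X")]
        # skip constant templates; keep one of each line/reversed-line pair
        if not wild or wild[0] == "X":
            continue
        yield tuple(
            tuple(t + 1 if s == "x" else k - t if s == "X" else s
                  for s in template)
            for t in range(k)
        )

def is_line_free(points, n, k=4):
    return not any(all(p in points for p in line)
                   for line in geometric_lines(n, k))

n, q = 4, 2
moser4 = lower_bound_set(n, q)
formula = comb(n, q) * 2 ** n + comb(n, q - 1) * 2 ** (n - 1)
print(len(moser4), formula, is_line_free(moser4, n))
```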
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T14:06:11Z<p>121.220.134.232: /* Elimination of (7,40,78,0,0,0) */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
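<br />
As a sanity check, the following Python sketch (not part of the original argument; the function names are ours) evaluates the formula, builds the corresponding set for n=5, and verifies by brute force that it is a Moser set. A geometric line is detected through its two endpoints: these agree in every coordinate or take the values 1 and 3 there, and the middle point is their coordinatewise average.<br />
<br />
```python
from itertools import product, combinations
from math import comb

def lower_bound(n):
    q = round(n / 3)
    return comb(n + 1, q) * 2 ** (n - q)

def construction(n):
    """All strings with q 2s, plus those with q-1 2s and an odd number of 1s."""
    q = round(n / 3)
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(2) == q
            or (p.count(2) == q - 1 and p.count(1) % 2 == 1)}

def is_moser(points):
    """True if the set of points contains no geometric line of [3]^n."""
    for p, q in combinations(sorted(points), 2):
        if all((x - y) % 2 == 0 for x, y in zip(p, q)):
            mid = tuple((x + y) // 2 for x, y in zip(p, q))
            if mid in points:
                return False
    return True

example = construction(5)
print(lower_bound(5), len(example), is_moser(example))
```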
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
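<br />
The definition can be made concrete with a short sketch (illustrative only; points are written as digit strings and the helper names are ours) that reproduces the example just given.<br />
<br />
```python
def statistics(points, n):
    """Tuple (a, b, c, ...) counting points by their number of 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_statistics(points, pattern):
    """Statistics of a slice such as '1*' ('*' marks a free coordinate);
    the fixed coordinates are dropped before counting."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    kept = ["".join(p[i] for i in free) for p in points
            if all(ch == "*" or p[i] == ch for i, ch in enumerate(pattern))]
    return statistics(kept, len(free))

A = {"11", "12", "22", "32"}
print(statistics(A, 2), slice_statistics(A, "1*"))
```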
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. by {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
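<br />
Since <math>[3]^2</math> has only <math>2^9 = 512</math> subsets, the case analysis above can be confirmed by exhaustive search. The sketch below (our own illustration; it is not the program used for the larger cases on this page) collects the statistics of all Moser sets and keeps the Pareto-optimal ones.<br />
<br />
```python
from itertools import product, combinations

POINTS = list(product((1, 2, 3), repeat=2))
# A geometric line is determined by its endpoints, which agree or take
# the values 1 and 3 in each coordinate; the middle point is their average.
LINES = [(p, tuple((x + y) // 2 for x, y in zip(p, q)), q)
         for p, q in combinations(POINTS, 2)
         if all((x - y) % 2 == 0 for x, y in zip(p, q))]

def attainable_statistics():
    stats = set()
    for mask in range(1 << len(POINTS)):
        chosen = {POINTS[i] for i in range(len(POINTS)) if mask >> i & 1}
        if any(all(pt in chosen for pt in line) for line in LINES):
            continue  # contains a geometric line, so not a Moser set
        counts = [0, 0, 0]
        for pt in chosen:
            counts[pt.count(2)] += 1
        stats.add(tuple(counts))
    return stats

def pareto_optimal(stats):
    def dominated(s):
        return any(t != s and all(u >= v for u, v in zip(t, s)) for t in stats)
    return {s for s in stats if not dominated(s)}

attainable = attainable_statistics()
pareto = pareto_optimal(attainable)
print(sorted(pareto, reverse=True))
```
<br />
The search returns the four Pareto-optimal statistics discussed above, including the non-extremal (3,2,0).<br />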
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
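<br />
The exhaustion over the <math>2^{13}</math> possibilities is small enough to repeat in a few lines. The following is a sketch under our own naming (it is not the original search code): it picks one point from each antipodal pair, adds 222, and counts how many of the resulting 14-point sets are Moser sets.<br />
<br />
```python
from itertools import product, combinations

POINTS = list(product((1, 2, 3), repeat=3))
CENTRE = (2, 2, 2)

def antipode(p):
    return tuple(4 - x for x in p)

# The 26 points other than 222, grouped into 13 antipodal pairs.
PAIRS = sorted({tuple(sorted((p, antipode(p)))) for p in POINTS if p != CENTRE})

def is_moser(points):
    for p, q in combinations(sorted(points), 2):
        if all((x - y) % 2 == 0 for x, y in zip(p, q)):
            mid = tuple((x + y) // 2 for x, y in zip(p, q))
            if mid in points:
                return False
    return True

def count_14_point_moser_sets():
    """Number of Moser sets made of 222 and one point from every pair."""
    count = 0
    for choice in product((0, 1), repeat=len(PAIRS)):
        chosen = {CENTRE} | {pair[c] for pair, c in zip(PAIRS, choice)}
        if is_moser(chosen):
            count += 1
    return count

print(len(PAIRS), count_14_point_moser_sets())
```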
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
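<br />
The passage from the raw inequalities to the normalized ones is mechanical: substitute <math>a = 2^n \alpha_0</math>, <math>b = n 2^{n-1} \alpha_1</math>, and so on, then divide by the common factor. A minimal sketch (the function name is ours):<br />
<br />
```python
from math import comb, gcd
from functools import reduce

def normalize(coeffs, bound):
    """Rewrite sum(coeffs[i] * count_i) <= bound in terms of the alpha_i,
    where count_i = comb(n, i) * 2**(n - i) * alpha_i, and reduce by the gcd."""
    n = len(coeffs) - 1
    scaled = [c * comb(n, i) * 2 ** (n - i) for i, c in enumerate(coeffs)]
    g = reduce(gcd, scaled + [bound])
    return [s // g for s in scaled], bound // g

print(normalize([2, 1, 2, 4], 22))   # from 2a+b+2c+4d <= 22
print(normalize([6, 2, 3, 6], 48))   # from 6a+2b+3c+6d <= 48
```
<br />
For example, <math>6a+2b+3c+6d \leq 48</math> becomes <math>48\alpha_0+24\alpha_1+18\alpha_2+6\alpha_3 \leq 48</math>, which reduces to coefficients (8,4,3,1) with bound 8.<br />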
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
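<br />
The maximisation of the score over the extremal n=3 statistics is a short computation. The sketch below (illustrative; it uses exact fractions and the list of extremal statistics from the n=3 section) confirms that the maximum is 44/8 = 11/2 and is attained only at (2,6,6,0).<br />
<br />
```python
from fractions import Fraction

EXTREMAL_3D = [(3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
               (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0)]

def score(stat):
    """Score a/4 + b/3 + c/2 + d of a 3D slice."""
    a, b, c, d = stat
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

best = max(EXTREMAL_3D, key=score)
maximisers = [s for s in EXTREMAL_3D if score(s) == score(best)]
print(best, score(best), maximisers)
```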
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
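<br />
This description is easy to test. The sketch below (our own illustration) builds the set of points of <math>[3]^4</math> with exactly two 1s and/or exactly two 3s, computes its statistics, and checks that it contains no geometric line.<br />
<br />
```python
from itertools import product, combinations

def two_ones_or_two_threes():
    return {p for p in product((1, 2, 3), repeat=4)
            if p.count(1) == 2 or p.count(3) == 2}

def statistics(points, n=4):
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count(2)] += 1
    return tuple(stats)

def is_moser(points):
    for p, q in combinations(sorted(points), 2):
        if all((x - y) % 2 == 0 for x, y in zip(p, q)):
            mid = tuple((x + y) // 2 for x, y in zip(p, q))
            if mid in points:
                return False
    return True

gamma_set = two_ones_or_two_threes()
print(len(gamma_set), statistics(gamma_set), is_moser(gamma_set))
```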
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
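<br />
The line counts used in these double-counting arguments can be obtained by enumerating all geometric lines of <math>[3]^4</math> and classifying each by the number of 2s in its endpoints and in its midpoint. A sketch (the function name and the keying convention are ours):<br />
<br />
```python
from itertools import product, combinations
from collections import Counter

def line_type_counts(n):
    """Count geometric lines of [3]^n, keyed by
    (number of 2s in an endpoint, number of 2s in the midpoint)."""
    counts = Counter()
    for p, q in combinations(product((1, 2, 3), repeat=n), 2):
        # endpoints agree, or take the values 1 and 3, in every coordinate
        if all((x - y) % 2 == 0 for x, y in zip(p, q)):
            mid = tuple((x + y) // 2 for x, y in zip(p, q))
            counts[(p.count(2), mid.count(2))] += 1
    return counts

counts4 = line_type_counts(4)
print(counts4[(0, 1)], counts4[(0, 2)], counts4[(1, 2)])
```
<br />
Here the key (0,1) corresponds to lines through two "a" points and a "b" point, (0,2) to two "a" points and a "c" point, and (1,2) to two "b" points and a "c" point. Since each line contains at most two points of a Moser set, twice the line count bounds the weighted sum in each case.<br />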
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
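<br />
The lifting rule can be applied mechanically to the n=3 bounds. The sketch below (illustrative; names are ours) multiplies the coefficients by the weights (4,3,2,1), multiplies the bound by 8, and divides out the common factor.<br />
<br />
```python
from math import gcd
from functools import reduce

# Coefficients (of a, b, c, d) and right-hand sides of the n=3 inequalities.
INEQUALITIES_3D = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56),
    ((6, 2, 3, 6), 48), ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14),
    ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8), ((0, 1, 0, 6), 12),
    ((0, 0, 1, 3), 6),
]

def lift(coeffs, bound):
    """Lift a 3D inequality to 4D by summing it over the eight side slices."""
    weights = (4, 3, 2, 1)  # number of side slices containing an a, b, c, d point
    lifted = [w * c for w, c in zip(weights, coeffs)]
    g = reduce(gcd, lifted + [8 * bound])
    return tuple(x // g for x in lifted), 8 * bound // g

lifted_4d = [lift(coeffs, bound) for coeffs, bound in INEQUALITIES_3D]
for item in lifted_4d:
    print(item)
```
<br />
For instance, <math>7a+2b+4c+8d \leq 56</math> lifts to <math>28a+6b+8c+8d \leq 448</math>, that is, <math>14a+3b+4c+4d \leq 224</math>.<br />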
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, and so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+43+40=124 points in all.<br />
<br />
Now suppose c is less than 17 (and that the middle slice has at least 42 points, since otherwise there is nothing to prove); then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. In particular, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
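<br />
The base case m=5 can also be confirmed by brute force over all <math>\binom{32}{5} = 201376</math> choices of five A-points. In the sketch below (our own illustration) an A-point is encoded as a 5-bit integer, with a 0 bit standing for the digit 1 and a 1 bit for the digit 3, so that the Hamming distance is the number of 1 bits in the XOR.<br />
<br />
```python
from itertools import combinations

def has_pair_at_distance_two(points):
    """True if some two of the encoded A-points differ in exactly two places."""
    return any(bin(p ^ q).count("1") == 2 for p, q in combinations(points, 2))

def every_subset_has_pair(size, n=5):
    """Check every set of `size` A-points of the n-dimensional cube."""
    return all(has_pair_at_distance_two(s)
               for s in combinations(range(2 ** n), size))

print(every_subset_has_pair(5), every_subset_has_pair(4))
```
<br />
The check succeeds for five points and fails for four, so the threshold m-4 in the lemma cannot be improved by this argument alone.<br />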
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
'''Proof''': Suppose for contradiction that C=80, then D=0 and <math>B \leq 40</math>. From Lemma 4 we also see that A cannot be 5 or more, leading to the contradiction.<br />
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
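<br />
For the 42-point and 43-point statistics listed in the n=4 section, the scores can be recomputed exactly. The sketch below is illustrative only: it does not cover the 41-point statistics, which are only available in the linked spreadsheet.<br />
<br />
```python
from fractions import Fraction

STATS_43 = [(5, 20, 18, 0, 0), (4, 16, 23, 0, 0), (3, 16, 24, 0, 0),
            (4, 15, 24, 0, 0)]
STATS_42 = [(6, 24, 12, 0, 0), (5, 20, 17, 0, 0), (5, 19, 18, 0, 0),
            (6, 16, 18, 2, 0), (4, 20, 18, 0, 0), (5, 17, 20, 0, 0),
            (5, 16, 21, 0, 0), (4, 17, 21, 0, 0), (4, 16, 22, 0, 0),
            (5, 15, 22, 0, 0), (4, 15, 23, 0, 0), (3, 16, 23, 0, 0),
            (4, 14, 24, 0, 0), (3, 15, 24, 0, 0), (2, 16, 24, 0, 0)]

def score(stat):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a 4D slice."""
    a, b, c, d, e = stat
    return a + Fraction(5 * b, 4) + Fraction(5 * c, 3) + Fraction(5 * d, 2) + 5 * e

exceptions = sorted((s for s in STATS_42 + STATS_43 if score(s) >= 62), key=score)
for s in exceptions:
    print(s, sum(s), score(s))
```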
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the scores of the two side slices add up to exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that for each of those two coordinates the two side slices have total score at most 63 + 59 7/12 = 122 7/12. On the other hand, for each of the other three coordinates the two side slices have total score at most 63+63 = 126. The average over the five coordinates is thus at most (2 × 122 7/12 + 3 × 126)/5 = 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
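<br />
The small case analysis behind this proof can be reproduced by enumeration. This is a minimal sketch under the constraints stated in the text (a at most 16, b at most 32, 4a+b at most 64, a+b at least 39, c=d=e=0); the names are illustrative.<br />
<br />
```python
from math import ceil

# Candidate (a, b) statistics of a center slice with c = d = e = 0.
candidates = [
    (a, b)
    for a in range(17)
    for b in range(33)
    if 4 * a + b <= 64 and a + b >= 39
]

min_b = min(b for _, b in candidates)

# 2C is the sum of b over the five center slices, so C >= 5 * min_b / 2.
lower_bound_C = ceil(5 * min_b / 2)
print(candidates, min_b, lower_bound_C)
```
<br />
The surviving candidates are exactly the three statistics listed before Lemma 9.<br />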
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
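<br />
As a sanity check on the list of candidates, one can enumerate the statistics allowed by the constraints collected so far. The sketch below assumes only the facts stated above (a 125-point set, D=E=F=0, A between 6 and 8, B at most 40, C equal to 78 or 79, 2B+C at most 160, and C=78 whenever A is at least 7).<br />
<br />
```python
# Enumerate (A, B, C, D, E, F) compatible with the lemmas and corollaries above.
solutions = []
for A in range(6, 9):
    for C in (78, 79):
        B = 125 - A - C              # the set has 125 points and D = E = F = 0
        if B < 0 or B > 40:          # Lemma 9
            continue
        if 2 * B + C > 160:          # Lemma 3
            continue
        if A >= 7 and C != 78:       # consequence of Lemma 4
            continue
        solutions.append((A, B, C, 0, 0, 0))
print(solutions)
```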
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form xyzw2 with x,y,z,w = 1,3. Similarly there are at most 8 points of each of the forms xyz2w, xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
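<br />
The second construction can be checked directly for small m. The sketch below (function names are ours) builds the points of <math>[1,3]^m</math> with an odd number of 1s and computes their minimum pairwise Hamming distance; it illustrates the lower bound on |A(m,2)| only, not its optimality.<br />
<br />
```python
from itertools import combinations, product

def odd_ones_code(m):
    """Points of {1,3}^m with an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]

def min_distance(points):
    """Smallest Hamming distance between two distinct points."""
    return min(
        sum(x != y for x, y in zip(p, q))
        for p, q in combinations(points, 2)
    )

for m in range(2, 7):
    code = odd_ones_code(m)
    print(m, len(code), min_distance(code))
```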
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
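<br />
For even n and q = n/2, the size formula can be compared with a direct enumeration of the construction. This is a sketch with illustrative function names; it checks only the cardinality, not the absence of geometric lines.<br />
<br />
```python
from itertools import product
from math import comb

def moser4_construction(n, q):
    """Points of [4]^n with q entries in {2,3}, or q-1 such entries and an odd number of 1s."""
    chosen = []
    for p in product((1, 2, 3, 4), repeat=n):
        middle = sum(1 for x in p if x in (2, 3))
        if middle == q or (middle == q - 1 and p.count(1) % 2 == 1):
            chosen.append(p)
    return chosen

def predicted_size(n, q):
    """The closed form binom(n,q) 2^n + binom(n,q-1) 2^(n-1)."""
    return comb(n, q) * 2**n + comb(n, q - 1) * 2**(n - 1)

for n in (2, 4, 6):
    q = n // 2
    print(n, len(moser4_construction(n, q)), predicted_size(n, q), 4**n)
```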
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T14:04:27Z<p>121.220.134.232: /* Elimination of (6,40,79,0,0,0) */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
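<br />
The bound is easy to evaluate, and the construction behind it can be enumerated for small n. The following sketch uses our own function names and computes the nearest integer to n/3 as (2n+3)//6; it compares the closed form with the size of the constructed set.<br />
<br />
```python
from itertools import product
from math import comb

def nearest_third(n):
    """Nearest integer to n/3 (n/3 is never a half-integer)."""
    return (2 * n + 3) // 6

def lower_bound(n):
    q = nearest_third(n)
    return comb(n + 1, q) * 2 ** (n - q)

def construction(n):
    """All strings with q 2s, plus all strings with q-1 2s and an odd number of 1s."""
    q = nearest_third(n)
    pts = []
    for p in product((1, 2, 3), repeat=n):
        twos = p.count(2)
        if twos == q or (twos == q - 1 and p.count(1) % 2 == 1):
            pts.append(p)
    return pts

for n in range(1, 7):
    print(n, lower_bound(n), len(construction(n)))
```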
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
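<br />
The grouping can be sketched in a few lines for the four-dimensional cube. Points are again written as strings over 1, 2, 3, and the helper names are illustrative rather than taken from the project's code.<br />
<br />
```python
from collections import Counter
from itertools import product

def group_of(point):
    """Replace every 1 or 3 by 'x' and keep the 2s, e.g. '1232' -> 'x2x2'."""
    return "".join("2" if ch == "2" else "x" for ch in point)

points = ["".join(p) for p in product("123", repeat=4)]
group_sizes = Counter(group_of(p) for p in points)

def c_statistics(moser_set):
    """c(w) for every group w: the number of points of the set inside that group."""
    counts = Counter(group_of(p) for p in moser_set)
    return {w: counts.get(w, 0) for w in group_sizes}

# Number of groups, and how many groups there are of each size.
print(len(group_sizes), sorted(Counter(group_sizes.values()).items()))
```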
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
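<br />
For n=2 the whole analysis can be confirmed by brute force over the 512 subsets of <math>[3]^2</math>. The sketch below uses the fact that three points form a geometric line exactly when one is the coordinatewise midpoint of the other two; the function names are ours, and the code finds the Pareto-optimal statistics only (it does not test extremality).<br />
<br />
```python
from itertools import combinations, product

def geometric_lines(n):
    """All points of [3]^n and all geometric lines, each as (endpoint, midpoint, endpoint)."""
    pts = list(product((1, 2, 3), repeat=n))
    lines = []
    for p, q in combinations(pts, 2):
        if all((x - y) % 2 == 0 for x, y in zip(p, q)):
            mid = tuple((x + y) // 2 for x, y in zip(p, q))
            lines.append((p, mid, q))
    return pts, lines

def attainable_statistics(n):
    """Statistics of every Moser set in [3]^n (feasible only for very small n)."""
    pts, lines = geometric_lines(n)
    index = {p: i for i, p in enumerate(pts)}
    masks = [sum(1 << index[p] for p in line) for line in lines]
    found = set()
    for subset in range(1 << len(pts)):
        if any(subset & m == m for m in masks):
            continue                      # the subset contains a geometric line
        stats = [0] * (n + 1)
        for p in pts:
            if (subset >> index[p]) & 1:
                stats[p.count(2)] += 1
        found.add(tuple(stats))
    return found

def pareto_optimal(stats):
    """Statistics not pointwise dominated by a different attainable statistic."""
    def dominated(s):
        return any(t != s and all(u >= v for u, v in zip(t, s)) for t in stats)
    return {s for s in stats if not dominated(s)}

attainable = attainable_statistics(2)
pareto = pareto_optimal(attainable)
print(sorted(pareto), max(sum(s) for s in attainable))
```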
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
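<br />
The exhaustion over the <math>2^{13}</math> possibilities is small enough to sketch directly. The code below is our own illustration, not the original search: it picks one point from each antipodal pair, adds 222, and counts how many of the resulting 14-point sets avoid every geometric line.<br />
<br />
```python
from itertools import combinations, product

points = list(product((1, 2, 3), repeat=3))
center = (2, 2, 2)

# A geometric line is a triple (p, midpoint, q) with p != q and p_i, q_i of equal parity.
lines = []
for p, q in combinations(points, 2):
    if all((x - y) % 2 == 0 for x, y in zip(p, q)):
        mid = tuple((x + y) // 2 for x, y in zip(p, q))
        lines.append(frozenset((p, mid, q)))

# The 13 antipodal pairs {p, 4 - p} of points other than the center.
pairs = []
for p in points:
    opposite = tuple(4 - x for x in p)
    if p < opposite:
        pairs.append((p, opposite))

def has_line(s):
    return any(line <= s for line in lines)

# Try every way of choosing one point from each antipodal pair.
moser_14 = 0
for choice in product((0, 1), repeat=len(pairs)):
    candidate = {center} | {pair[c] for pair, c in zip(pairs, choice)}
    if not has_line(candidate):
        moser_14 += 1
print(len(lines), len(pairs), moser_14)
```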
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are :<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
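<br />
The claim about the largest score can be checked against the extremal statistics listed in the n=3 section. This is a sketch using exact fractions; since the score is linear, checking the extremal statistics suffices.<br />
<br />
```python
from fractions import Fraction

# Extremal statistics (a, b, c, d) for n=3, as listed in the n=3 section.
extremal = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]

def score(stat):
    """Score a/4 + b/3 + c/2 + d of a three-dimensional slice."""
    a, b, c, d = stat
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

scores = {stat: score(stat) for stat in extremal}
best = max(scores.values())
maximisers = [stat for stat, s in scores.items() if s == best]

# Eight side slices, each of score at most `best`, bound the size of the 4D set.
print(best, maximisers, 8 * best)
```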
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
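<br />
The passage from three to four dimensions is mechanical, so it can be sketched as a small helper. The function name and the final division by the greatest common divisor are our own choices; the weights come from the slice-counting argument above.<br />
<br />
```python
from functools import reduce
from math import gcd

def lift(coeffs, bound):
    """Lift sum(coeffs[i] * x_i) <= bound from [3]^n to [3]^(n+1) by summing over side slices.

    coeffs[i] multiplies the number of points with i 2s (i = 0..n).
    A point of [3]^(n+1) with i 2s lies in n+1-i of the 2(n+1) side slices.
    """
    n = len(coeffs) - 1
    lifted = [(n + 1 - i) * c for i, c in enumerate(coeffs)]
    total = 2 * (n + 1) * bound
    g = reduce(gcd, lifted + [total])     # cancel common factors
    return [c // g for c in lifted], total // g

print(lift([2, 1, 2, 4], 22))   # ([8, 3, 4, 4], 176)
print(lift([3, 2, 3, 6], 36))   # ([2, 1, 1, 1], 48)
```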
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arise. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged becomes <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics<br />
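<br />
The distributions of 2s quoted above can be recomputed by listing the points of each kind of cube. The sketch below uses our own helper names and treats the diagonal xxyz with xx running over (11,22,33) or, with the flip option, over (13,22,31).<br />
<br />
```python
from collections import Counter
from itertools import product

def diagonal_cube(flip=False):
    """Points of the cube xxyz in [3]^4; with flip=True the second coordinate is 4 - x."""
    cube = []
    for x, y, z in product((1, 2, 3), repeat=3):
        second = 4 - x if flip else x
        cube.append((x, second, y, z))
    return cube

def twos_distribution(points, n=4):
    """Entry i is the number of points with exactly i coordinates equal to 2."""
    counts = Counter(p.count(2) for p in points)
    return tuple(counts.get(i, 0) for i in range(n + 1))

side_slice = [(1, x, y, z) for x, y, z in product((1, 2, 3), repeat=3)]
middle_slice = [(2, x, y, z) for x, y, z in product((1, 2, 3), repeat=3)]

print(twos_distribution(diagonal_cube()))       # (8, 8, 6, 4, 1)
print(twos_distribution(diagonal_cube(True)))   # (8, 8, 6, 4, 1)
print(twos_distribution(side_slice))            # (8, 12, 6, 1, 0)
print(twos_distribution(middle_slice))          # (0, 8, 12, 6, 1)
```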
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, and so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+43+40=124 points in all, a contradiction.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122 21212 21221 23322 23232 23223. In particular, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
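<br />
The incidence counts used in this proof can be verified by enumerating the geometric lines of <math>[3]^5</math>. The sketch below (with our own function name) classifies each line by the number of 2s in an endpoint and in the midpoint, then counts how many B-C-B lines pass through each point.<br />
<br />
```python
from collections import Counter
from itertools import combinations, product

def lines_by_type(n):
    """Map (2s in an endpoint, 2s in the midpoint) to the geometric lines of that type."""
    pts = list(product((1, 2, 3), repeat=n))
    result = {}
    for p, q in combinations(pts, 2):
        # p and q are the endpoints of a line iff their coordinates agree in parity.
        if all((x - y) % 2 == 0 for x, y in zip(p, q)):
            mid = tuple((x + y) // 2 for x, y in zip(p, q))
            result.setdefault((p.count(2), mid.count(2)), []).append((p, mid, q))
    return result

bcb = lines_by_type(5)[(1, 2)]          # two "B" endpoints, one "C" midpoint
per_B = Counter(pt for p, _, q in bcb for pt in (p, q))
per_C = Counter(mid for _, mid, _ in bcb)
print(len(bcb), set(per_B.values()), set(per_C.values()))
```
<br />
Summing "at most two points per line" over these lines gives <math>4B+2C \leq 320</math>, which is the claimed inequality.<br />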
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
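<br />
The base case m=5 is also small enough to confirm by exhaustive search over all 5-element subsets of the 32 corner points. In this sketch (our own encoding) a corner is an integer from 0 to 31 whose bit i records whether coordinate i equals 3.<br />
<br />
```python
from itertools import combinations

corners = range(32)   # bit i set means coordinate i equals 3, otherwise 1
at_distance_two = [
    [bin(p ^ q).count("1") == 2 for q in corners]
    for p in corners
]

def has_close_pair(subset):
    """True if some two points of the subset have Hamming distance exactly 2."""
    return any(at_distance_two[p][q] for p, q in combinations(subset, 2))

# Base case m = 5: every 5 corner points contain a pair at Hamming distance exactly 2.
base_case_holds = all(has_close_pair(s) for s in combinations(corners, 5))

# With only 4 points there need not be such a pair.
free_quadruples = [s for s in combinations(corners, 4) if not has_close_pair(s)]
print(base_case_holds, len(free_quadruples) > 0)
```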
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
'''Proof''': Suppose for contradiction that C=80; then D=0 and <math>B \leq 40</math> (by Lemma 3). From Lemma 4 we also see that A cannot be 5 or more, so the set has at most 4+40+80=124 points, a contradiction.<br />
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, and 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
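<br />
The scores in the list, and the pairings that can reach a total of 125, can be recomputed exactly. The sketch below uses the score definition given before Lemma 5; the type numbers follow the list above, and the remaining names are ours.<br />
<br />
```python
from fractions import Fraction

WEIGHTS = [Fraction(1), Fraction(5, 4), Fraction(5, 3), Fraction(5, 2), Fraction(5)]

def score(stat):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a four-dimensional slice."""
    return sum(w * x for w, x in zip(WEIGHTS, stat))

# A point with i 2s (i <= 4) lies in 5 - i side slices, so it contributes 5 in total.
assert all((5 - i) * w == 5 for i, w in enumerate(WEIGHTS))

types = {
    1: (2, 16, 24, 0, 0),
    2: (4, 16, 23, 0, 0),
    3: (4, 15, 24, 0, 0),
    4: (3, 16, 24, 0, 0),
}
scores = {t: score(s) for t, s in types.items()}

# Unordered pairs of types whose scores add up to at least 125.
admissible = sorted(
    (s, t) for s in types for t in types
    if s <= t and scores[s] + scores[t] >= 125
)
print(scores, admissible)
```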
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the side slices have total score exactly 125 (by Lemma 5, the five cuts average 125), which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score from each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, each of the other three coordinates has a net score of at most 63+63 = 126. This averages out to at most 124.633... < 125, a contradiction. <math>\Box</math><br />
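<br />
The averaging step can be reproduced exactly. This is only an arithmetic sketch; the variable names are illustrative.<br />

```python
from fractions import Fraction

# Upper bound for a cut in which one side slice has d=1: 63 + 59 7/12.
cut_with_d_point = Fraction(63) + Fraction(59) + Fraction(7, 12)
# Upper bound for any other cut: 63 + 63.
ordinary_cut = Fraction(126)

# Two cuts of the first kind and three of the second, averaged over five cuts.
average = (2 * cut_with_d_point + 3 * ordinary_cut) / 5
print(average, float(average))
```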
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> it must have statistics (8,32,0,0,0), (7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
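<br />
The bounds on b and C used here can be checked by enumerating the pairs (a,b) allowed by the constraints quoted in the proof. A minimal sketch, with illustrative variable names:<br />

```python
import math

# Constraints on a center slice with c=d=e=0: a+b >= 39, 4a+b <= 64, a <= 16, b <= 32.
feasible = [(a, b) for a in range(17) for b in range(33)
            if a + b >= 39 and 4 * a + b <= 64]

min_b = min(b for _, b in feasible)
# 2C is the sum of the b's of the five center slices, so C >= 5 * min_b / 2.
min_C = math.ceil(5 * min_b / 2)
print(feasible, min_b, min_C)
```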
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following sextuples:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points, so the total is at most <math>2 \times 158 = 316</math>. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
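<br />
The second claim can be checked directly for small m. The sketch below builds the points of <math>[1,3]^m</math> with an odd number of 1s and measures their minimum pairwise Hamming distance; the helper names are hypothetical.<br />

```python
from itertools import combinations, product

def odd_ones_code(m):
    """All strings over {1,3} of length m with an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]

def min_distance(code):
    """Smallest Hamming distance between two distinct points of the code."""
    return min(sum(x != y for x, y in zip(p, q)) for p, q in combinations(code, 2))

for m in range(2, 7):
    code = odd_ones_code(m)
    print(m, len(code), min_distance(code))
```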
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T13:53:57Z<p>121.220.134.232: /* n=5 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
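<br />
For small n this construction can be built explicitly and tested against every geometric line. The following is a brute-force sketch under that description; <code>moser_lower_bound</code>, <code>construction</code> and <code>is_moser</code> are names made up for this example.<br />

```python
from itertools import product
from math import comb

def moser_lower_bound(n):
    """The bound binom(n+1, q) * 2^(n-q), with q the nearest integer to n/3."""
    q = round(n / 3)
    return comb(n + 1, q) * 2 ** (n - q)

def construction(n):
    """Strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    q = round(n / 3)
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(2) == q or (p.count(2) == q - 1 and p.count(1) % 2 == 1)}

def is_moser(points, n):
    """True if the set contains no geometric line of [3]^n."""
    pts = set(points)
    # A line template uses wildcards "x" (running 1,2,3) and "y" (running 3,2,1).
    for t in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in t and "y" not in t:
            continue
        line = [tuple(v if c == "x" else 4 - v if c == "y" else c for c in t)
                for v in (1, 2, 3)]
        if all(p in pts for p in line):
            return False
    return True

for n in range(6):
    s = construction(n)
    print(n, len(s), moser_lower_bound(n), is_moser(s, n))
```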
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
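<br />
As an illustration of this notation, the statistics of a set and of a slice can be computed as follows. This is a small sketch that represents points as strings; the function names and the <code>*</code> pattern syntax for slices are assumptions of the example.<br />

```python
def statistics(points, n):
    """Return (a, b, c, ...): the number of points with 0, 1, 2, ... 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_statistics(points, pattern):
    """Statistics of the slice given by a pattern such as '1*', dropping fixed coordinates."""
    free = pattern.count("*")
    kept = ["".join(ch for ch, pat in zip(p, pattern) if pat == "*")
            for p in points
            if all(pat in ("*", ch) for ch, pat in zip(p, pattern))]
    return statistics(kept, free)

A = ["11", "12", "22", "32"]
print(statistics(A, 2))
print(slice_statistics(A, "1*"))
```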
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.). We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
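<br />
Since <math>[3]^2</math> has only <math>2^9 = 512</math> subsets, the attainable and Pareto-optimal statistics can be recomputed by brute force. A minimal sketch, with illustrative names:<br />

```python
from itertools import product

cells = list(product((1, 2, 3), repeat=2))
# The eight geometric lines of [3]^2: three rows, three columns, two diagonals.
lines = ([[(r, c) for c in (1, 2, 3)] for r in (1, 2, 3)] +
         [[(r, c) for r in (1, 2, 3)] for c in (1, 2, 3)] +
         [[(1, 1), (2, 2), (3, 3)], [(1, 3), (2, 2), (3, 1)]])

attainable = set()
for mask in range(1 << 9):
    chosen = {cells[i] for i in range(9) if mask >> i & 1}
    if any(all(p in chosen for p in line) for line in lines):
        continue  # contains a geometric line, so not a Moser set
    stats = [0, 0, 0]
    for p in chosen:
        stats[p.count(2)] += 1
    attainable.add(tuple(stats))

def dominated(s):
    """True if some other attainable statistic is at least s in every entry."""
    return any(t != s and all(x >= y for x, y in zip(t, s)) for t in attainable)

pareto = sorted(s for s in attainable if not dominated(s))
largest = max(sum(s) for s in attainable)
print(pareto, largest)
```

The Pareto-optimal list includes (3,2,0), which is the midpoint of (4,0,0) and (2,4,0) and hence not extremal.<br />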
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
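<br />
The exhaustion over the <math>2^{13}</math> possibilities can be sketched as follows: a 14-point set containing 222 must pick exactly one point from each antipodal pair, so it suffices to test each such choice against the lines that avoid 222. The helper names are hypothetical.<br />

```python
from itertools import product

def geometric_lines(n):
    """All geometric lines of [3]^n, each as a frozenset of three points."""
    lines = set()
    for t in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" in t or "y" in t:
            # "x" runs through 1,2,3 while "y" runs through 3,2,1.
            line = tuple(tuple(v if c == "x" else 4 - v if c == "y" else c for c in t)
                         for v in (1, 2, 3))
            lines.add(frozenset(line))
    return lines

center = (2, 2, 2)
all_lines = geometric_lines(3)
# Lines through the center are exactly the antipodal pairs, handled separately below.
other_lines = [l for l in all_lines if center not in l]

points = list(product((1, 2, 3), repeat=3))
pairs = [(p, tuple(4 - c for c in p)) for p in points if p < tuple(4 - c for c in p)]

fourteen_point_sets = 0
for mask in range(1 << len(pairs)):
    chosen = {pair[mask >> i & 1] for i, pair in enumerate(pairs)}
    if not any(l <= chosen for l in other_lines):
        fourteen_point_sets += 1
print(len(all_lines), len(pairs), fourteen_point_sets)
```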
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
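<br />
Because the score is linear, its maximum over all attainable statistics is reached at an extremal statistic, so the claim about the largest score can be rechecked from the list of extremal n=3 statistics given above. A minimal sketch (the function name is illustrative):<br />

```python
from fractions import Fraction

# Extremal statistics (a, b, c, d) for n=3, as listed in the n=3 section.
extremal_n3 = [(3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
               (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0)]

def side_slice_score(stats):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice."""
    a, b, c, d = stats
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

best_score = max(side_slice_score(s) for s in extremal_n3)
maximisers = [s for s in extremal_n3 if side_slice_score(s) == best_score]
print(best_score, maximisers, 8 * best_score)
```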
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
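<br />
The line counts behind these double-counting arguments can be tabulated by enumerating all geometric lines of <math>[3]^4</math> and grouping them by the number of 2s in the endpoints and in the midpoint. This is a sketch only; <code>geometric_lines</code> and <code>line_type</code> are names invented for the example.<br />

```python
from collections import Counter
from itertools import product

def geometric_lines(n):
    """All geometric lines of [3]^n, each as a frozenset of three points."""
    lines = set()
    for t in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" in t or "y" in t:
            line = tuple(tuple(v if c == "x" else 4 - v if c == "y" else c for c in t)
                         for v in (1, 2, 3))
            lines.add(frozenset(line))
    return lines

def line_type(line):
    """(number of 2s in each endpoint, number of 2s in the midpoint)."""
    twos = sorted(p.count(2) for p in line)
    return (twos[0], twos[2])

counts = Counter(line_type(l) for l in geometric_lines(4))
print(counts[(0, 1)])  # lines through two "a" points and a "b" point
print(counts[(0, 2)])  # lines through two "a" points and a "c" point
print(counts[(1, 2)])  # lines through two "b" points and a "c" point
```

Doubling each count gives the right-hand side of the corresponding inequality, since each line carries at most two points of the set.<br />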
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
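<br />
The lifting rule stated before this list is mechanical and can be applied programmatically. The sketch below multiplies the coefficients by (4,3,2,1), multiplies the bound by 8, and divides out the common factor; the function name is hypothetical.<br />

```python
from functools import reduce
from math import gcd

def lift_to_four_dimensions(coeffs, bound):
    """Lift alpha*a + beta*b + gamma*c + delta*d <= M from n=3 to n=4."""
    multiplicity = (4, 3, 2, 1)  # side slices containing an a, b, c, d point
    lifted = [m * c for m, c in zip(multiplicity, coeffs)]
    lifted_bound = 8 * bound     # there are eight side slices
    g = reduce(gcd, lifted + [lifted_bound])
    return [c // g for c in lifted], lifted_bound // g

print(lift_to_four_dimensions([2, 1, 2, 4], 22))
print(lift_to_four_dimensions([3, 2, 3, 6], 36))
print(lift_to_four_dimensions([1, 0, 0, 4], 8))
```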
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to only 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the middle slice has at most 43, leading to 41+40+43=124 points in all.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122 21212 21221 23322 23232 23223. In particular, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
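<br />
The incidence counts in this proof can be confirmed by listing the lines in question: one wildcard coordinate, one coordinate fixed to 2, and the remaining three coordinates equal to 1 or 3. A brute-force sketch with illustrative names:<br />

```python
from collections import Counter
from itertools import product

# Templates with exactly one wildcard "*" and exactly one fixed 2.
templates = [t for t in product("123*", repeat=5)
             if t.count("*") == 1 and t.count("2") == 1]

incidence = Counter()
for t in templates:
    for v in "123":
        point = "".join(v if ch == "*" else ch for ch in t)
        incidence[point] += 1

# Number of these lines through each B point (one 2) and each C point (two 2s).
b_degrees = {k for p, k in incidence.items() if p.count("2") == 1}
c_degrees = {k for p, k in incidence.items() if p.count("2") == 2}
print(len(templates), b_degrees, c_degrees)
```

Summing over lines then gives <math>4B + 2C \leq 2 \times 160</math>, which is the stated inequality.<br />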
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
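<br />
The base case m=5 is also small enough to verify exhaustively, since there are only <math>\binom{32}{5} = 201376</math> choices of five A-points. The sketch below encodes an A-point as a 5-bit integer (a set bit meaning that the coordinate equals 3); this encoding and the helper names are assumptions of the example.<br />

```python
from itertools import combinations

def popcount(x):
    """Number of set bits, i.e. the Hamming weight of x."""
    return bin(x).count("1")

# at_distance_two[p][q] is True when points p and q differ in exactly two places.
at_distance_two = [[popcount(p ^ q) == 2 for q in range(32)] for p in range(32)]

def has_close_pair(subset):
    """True if some two points of the subset have Hamming distance exactly 2."""
    return any(at_distance_two[p][q] for p, q in combinations(subset, 2))

lemma_holds_for_m5 = all(has_close_pair(s) for s in combinations(range(32), 5))
print(lemma_holds_for_m5)
# Four points with no pair at distance 2: 11111, 11113, 33331, 33333 in this encoding.
print(has_close_pair((0, 1, 30, 31)))
```

The four-point example shows that the threshold <math>m \geq 5</math> cannot be lowered.<br />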
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
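'''Proof''': Suppose for contradiction that C=80. Then D=0 (every D-point is the midpoint of a line joining two C-points), and <math>B \leq 40</math> by Lemma 3. Since E=F=0, this forces <math>A \geq 5</math>, so by Lemma 4 two A-points in the set have Hamming distance two; their midpoint is a C-point in the set, giving a geometric line, a contradiction.<br />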
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1, which gives F=0) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
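<br />
The scores in the list and the admissible pairings can be recomputed with exact fractions. This is a small sketch; the names <code>TYPES</code>, <code>score</code> and <code>admissible</code> are introduced here for illustration.<br />

```python
from fractions import Fraction
from itertools import combinations_with_replacement

# The four exceptional statistics (a, b, c, d, e) from the list above.
TYPES = {
    1: (2, 16, 24, 0, 0),
    2: (4, 16, 23, 0, 0),
    3: (4, 15, 24, 0, 0),
    4: (3, 16, 24, 0, 0),
}


def score(stat):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a slice, computed exactly."""
    a, b, c, d, e = stat
    return a + Fraction(5, 4) * b + Fraction(5, 3) * c + Fraction(5, 2) * d + 5 * e


scores = {name: score(stat) for name, stat in TYPES.items()}

# Unordered pairs of types whose scores add up to at least 125.
admissible = [
    (i, j)
    for i, j in combinations_with_replacement(TYPES, 2)
    if scores[i] + scores[j] >= 125
]

print(scores)
print(admissible)
```

Type 1 appears only together with Type 4, and Type 2 only together with Type 3 or Type 4, in agreement with the statement above.<br />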
<br />
:'''Lemma 6''' There exists a cut in which the side slices have total score strictly greater than 125. In such a cut the side slices are of Type 2, Type 3, or Type 4, with at least one side slice not of Type 2, and the cut has the form 43+39+43.<br />
<br />
'''Proof''' If not, then in every cut the scores of the two side slices add up to exactly 125, which by the above list implies that one side slice is of Type 1 and the other is of Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score of the two side slices in each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, in each of the other three coordinates the two side slices have a total score of at most 63+63 = 126. Averaged over the five coordinates, this gives at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
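<br />
The averaging step can be checked with exact arithmetic. The sketch below simply replays the bounds quoted in the proof; the variable names are chosen here for illustration.<br />

```python
from fractions import Fraction

best_score = Fraction(63)                         # largest score of any side slice
score_with_d1 = Fraction(59) + Fraction(7, 12)    # bound for a side slice with d=1

# Two coordinate directions contain a side slice with d=1; the other three
# directions are bounded by twice the best score.
cut_bounds = [best_score + score_with_d1] * 2 + [best_score + best_score] * 3

average_bound = sum(cut_bounds) / 5

print(average_bound, float(average_bound))
```

Lemma 5 requires the five cuts to average exactly 125, so an upper bound below 125 is enough for the contradiction.<br />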
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so, by the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>, must have statistics (8,32,0,0,0), (7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3 and Lemma 7 the center slice has c=d=e=0, thus the center slice has <math>a+b \geq 39</math>. On the other hand, from the n=4 theory we have <math>4a+b \leq 64</math>, which forces <math>a \leq 8</math> and hence <math>b \geq 31</math>.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following tuples:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
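<br />
The list above can be reproduced by enumerating the constraints collected so far (Corollary 6, Lemma 9, Lemma 10, D=E=F=0, and the remark that <math>A \geq 7</math> forces C=78). This is a sketch for illustration; the variable name <code>candidates</code> is an assumption made here.<br />

```python
candidates = []
for A in range(6, 9):            # Corollary 6: 6 <= A <= 8
    for B in range(0, 41):       # Lemma 9: B <= 40
        C = 125 - A - B          # the set has 125 points and D = E = F = 0
        if C not in (78, 79):    # Lemma 10
            continue
        if A >= 7 and C != 78:   # three distance-two pairs remove two C-points
            continue
        candidates.append((A, B, C, 0, 0, 0))

print(candidates)
```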
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The number of points of the set on these lines, summed over all the lines (counting multiplicity), is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Since 316 = 2 × 158, all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The number of points of the set on these lines, summed over all the lines (counting multiplicity), is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Since 312 = 2 × 156, all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
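<br />
The second claim can be checked directly for small m. The sketch below builds the set of points of <math>[1,3]^m</math> with an odd number of 1s and verifies its size and minimum distance; the helper names <code>odd_ones</code> and <code>hamming</code> are introduced here for illustration.<br />

```python
from itertools import combinations, product


def hamming(p, q):
    """Number of coordinates in which the two points differ."""
    return sum(x != y for x, y in zip(p, q))


def odd_ones(m):
    """Points with coordinates in {1, 3} having an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]


sizes = {}
for m in range(1, 7):
    code = odd_ones(m)
    sizes[m] = len(code)
    # Two distinct points with the same parity of 1s differ in an even,
    # hence at least two, number of places.
    assert all(hamming(p, q) >= 2 for p, q in combinations(code, 2))

print(sizes)
```

This only confirms that the parity construction is a valid choice of A(m,2) of size <math>2^{m-1}</math> for the values of m tested.<br />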
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T13:51:31Z<p>121.220.134.232: /* n=5 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
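<br />
For small n the construction can be built explicitly and tested against every geometric line. The sketch below is an illustration under stated assumptions: the helpers <code>geometric_lines</code> and <code>construction</code> are introduced here, and a geometric line is generated from a template in which "x" runs through 1,2,3 while "y" runs through 3,2,1.<br />

```python
from itertools import product
from math import comb


def geometric_lines(n):
    """All geometric lines of [3]^n as tuples (endpoint, midpoint, endpoint)."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue  # a line needs at least one moving coordinate
        pts = [
            tuple(v if s == "x" else 4 - v if s == "y" else s for s in template)
            for v in (1, 2, 3)
        ]
        if pts[0] > pts[2]:
            pts.reverse()  # a line and its reversal are the same line
        lines.add(tuple(pts))
    return lines


def construction(n):
    """Strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    q = round(n / 3)
    points = {
        p
        for p in product((1, 2, 3), repeat=n)
        if p.count(2) == q or (p.count(2) == q - 1 and p.count(1) % 2 == 1)
    }
    return q, points


sizes = []
for n in range(1, 6):
    q, points = construction(n)
    assert len(points) == comb(n + 1, q) * 2 ** (n - q)
    assert not any(all(p in points for p in line) for line in geometric_lines(n))
    sizes.append(len(points))

print(sizes)
```

The check only covers n up to 5; for general n the Moser property follows because the two endpoints of a line with a single moving coordinate have different parities of 1s.<br />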
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.). We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
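<br />
The n=2 case is small enough that all 512 subsets of <math>[3]^2</math> can be examined. The sketch below recomputes the attainable statistics and their Pareto-optimal elements; the helper names <code>geometric_lines</code>, <code>statistics</code> and <code>dominated</code> are assumptions introduced here for illustration.<br />

```python
from itertools import product


def geometric_lines(n):
    """All geometric lines of [3]^n as tuples (endpoint, midpoint, endpoint)."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        pts = [
            tuple(v if s == "x" else 4 - v if s == "y" else s for s in template)
            for v in (1, 2, 3)
        ]
        if pts[0] > pts[2]:
            pts.reverse()
        lines.add(tuple(pts))
    return lines


def statistics(subset):
    """(a, b, c): number of points of the subset with 0, 1, 2 coordinates equal to 2."""
    stats = [0, 0, 0]
    for p in subset:
        stats[p.count(2)] += 1
    return tuple(stats)


points = list(product((1, 2, 3), repeat=2))
lines = geometric_lines(2)

attainable = set()
largest = 0
for mask in range(1 << len(points)):
    subset = {p for i, p in enumerate(points) if mask >> i & 1}
    if any(all(p in subset for p in line) for line in lines):
        continue  # contains a geometric line, so not a Moser set
    attainable.add(statistics(subset))
    largest = max(largest, len(subset))


def dominated(s):
    """True if some other attainable statistic is at least s in every entry."""
    return any(t != s and all(x >= y for x, y in zip(t, s)) for t in attainable)


pareto = sorted(s for s in attainable if not dominated(s))
print(largest, pareto)
```

The search returns the four Pareto-optimal statistics discussed above; deciding which of them are extremal still requires the convex-combination argument in the text.<br />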
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have at most 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
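<br />
The exhaustion over the <math>2^{13}</math> possibilities can be sketched as follows. This is an illustrative reimplementation, not the original search; the helper names <code>geometric_lines</code> and <code>is_moser</code> are introduced here.<br />

```python
from itertools import product


def geometric_lines(n):
    """All geometric lines of [3]^n as tuples (endpoint, midpoint, endpoint)."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        pts = [
            tuple(v if s == "x" else 4 - v if s == "y" else s for s in template)
            for v in (1, 2, 3)
        ]
        if pts[0] > pts[2]:
            pts.reverse()
        lines.add(tuple(pts))
    return lines


CENTER = (2, 2, 2)
LINES = geometric_lines(3)

# The 26 points other than 222 split into 13 antipodal pairs {p, 4 - p}.
others = [p for p in product((1, 2, 3), repeat=3) if p != CENTER]
pairs = sorted({tuple(sorted((p, tuple(4 - c for c in p)))) for p in others})


def is_moser(subset):
    """True if the subset contains no geometric line."""
    return not any(all(p in subset for p in line) for line in LINES)


# A 14-point Moser set containing 222 would pick exactly one point from
# each antipodal pair; try every such choice.
fourteen_point_sets = 0
for choice in product((0, 1), repeat=len(pairs)):
    subset = {CENTER} | {pair[c] for pair, c in zip(pairs, choice)}
    if is_moser(subset):
        fourteen_point_sets += 1

print(len(pairs), fourteen_point_sets)
```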
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
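<br />
Passing from the raw inequalities to normalized notation is mechanical: each coefficient is multiplied by the number of points of the corresponding type in the cube, and common factors are removed. A minimal sketch, in which the function name <code>normalize</code> is a hypothetical helper introduced here:<br />

```python
from functools import reduce
from math import comb, gcd


def normalize(coeffs, bound, n):
    """Rewrite sum(coeffs[i] * stat[i]) <= bound in terms of the alpha_i.

    stat[i] counts points with exactly i 2s, and the cube [3]^n contains
    comb(n, i) * 2**(n - i) such points, so stat[i] = alpha_i * that size.
    """
    sizes = [comb(n, i) * 2 ** (n - i) for i in range(n + 1)]
    scaled = [c * s for c, s in zip(coeffs, sizes)]
    common = reduce(gcd, scaled + [bound])
    return [c // common for c in scaled], bound // common


first = normalize([2, 1, 2, 4], 22, 3)    # 2a + b + 2c + 4d <= 22
fourth = normalize([6, 2, 3, 6], 48, 3)   # 6a + 2b + 3c + 6d <= 48
eighth = normalize([1, 0, 0, 4], 8, 3)    # a + 4d <= 8

print(first, fourth, eighth)
```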
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
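<br />
The claim about the largest score can be checked against the list of extremal n=3 statistics given earlier. The sketch below uses exact fractions; the names <code>EXTREMAL</code> and <code>slice_score</code> are introduced here for illustration.<br />

```python
from fractions import Fraction

# Extremal statistics (a, b, c, d) of Moser sets in [3]^3, as listed above.
EXTREMAL = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]


def slice_score(stat):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice."""
    a, b, c, d = stat
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d


best = max(slice_score(s) for s in EXTREMAL)
maximisers = [s for s in EXTREMAL if slice_score(s) == best]

# Upper bound on the size of a 4D Moser set with e=0: eight side slices.
size_bound = 8 * best

print(best, maximisers, size_bound)
```

Since a linear score is maximised at an extremal statistic, checking the eight extremal points is enough to bound the score of every side slice.<br />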
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
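<br />
The line counts used in these double-counting arguments can be confirmed by enumerating the geometric lines of <math>[3]^4</math> and classifying each by the number of 2s in its three points. This is an illustrative sketch; the helper <code>geometric_lines</code> and the name <code>kinds</code> are introduced here.<br />

```python
from collections import Counter
from itertools import product


def geometric_lines(n):
    """All geometric lines of [3]^n as tuples (endpoint, midpoint, endpoint)."""
    lines = set()
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        pts = [
            tuple(v if s == "x" else 4 - v if s == "y" else s for s in template)
            for v in (1, 2, 3)
        ]
        if pts[0] > pts[2]:
            pts.reverse()
        lines.add(tuple(pts))
    return lines


# Key (i, j, i): the endpoints have i 2s and the midpoint has j 2s.
kinds = Counter(tuple(p.count(2) for p in line) for line in geometric_lines(4))

a_b_a = kinds[(0, 1, 0)]   # two "a" points and a "b" point
a_c_a = kinds[(0, 2, 0)]   # two "a" points and a "c" point
b_c_b = kinds[(1, 2, 1)]   # two "b" points and a "c" point

print(a_b_a, a_c_a, b_c_b, sum(kinds.values()))
```

Each line contributes at most two points, so the right-hand side of each inequality is twice the corresponding count.<br />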
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
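<br />
The passage from three to four dimensions is the substitution described above, followed by removal of common factors. A minimal sketch, where the function name <code>lift</code> is a hypothetical helper introduced here:<br />

```python
from functools import reduce
from math import gcd

# Number of side slices of [3]^4 containing a point with 0, 1, 2, 3 twos.
MULTIPLICITIES = (4, 3, 2, 1)


def lift(coeffs, bound):
    """Lift alpha*a + beta*b + gamma*c + delta*d <= bound from n=3 to n=4."""
    lifted = [m * c for m, c in zip(MULTIPLICITIES, coeffs)]
    total = 8 * bound  # one copy of the inequality per side slice
    common = reduce(gcd, lifted + [total])
    return [c // common for c in lifted], total // common


N3_INEQUALITIES = [
    ([2, 1, 2, 4], 22), ([3, 2, 3, 6], 36), ([7, 2, 4, 8], 56),
    ([6, 2, 3, 6], 48), ([0, 1, 1, 3], 12), ([1, 0, 2, 4], 14),
    ([5, 0, 4, 8], 40), ([1, 0, 0, 4], 8), ([0, 1, 0, 6], 12),
    ([0, 0, 1, 3], 6),
]

lifted_inequalities = [lift(coeffs, bound) for coeffs, bound in N3_INEQUALITIES]
for coeffs, bound in lifted_inequalities:
    print(coeffs, "<=", bound)
```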
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice, and suppose for contradiction that it has at least 42 points. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the middle slice has at most 43, leading to at most 41+40+43=124 points in all.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Consequently, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
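<br />
The counts used in this double-counting argument can be checked with a short closed-form calculation. The following Python sketch is illustrative only; the helper names <code>line_count</code> and <code>points_with_twos</code> are assumptions introduced here and are not part of the original argument.<br />
<br />
```python
from math import comb

def line_count(n, wildcards, const_twos):
    """Number of geometric lines in [3]^n with `wildcards` moving coordinates
    and exactly `const_twos` fixed coordinates equal to 2 (hypothetical helper)."""
    fixed = n - wildcards
    # Each moving coordinate runs 1->3 or 3->1; reversing all of them gives the same line.
    directions = 2 ** (wildcards - 1)
    return comb(n, wildcards) * directions * comb(fixed, const_twos) * 2 ** (fixed - const_twos)

def points_with_twos(n, twos):
    """Number of points of [3]^n with exactly `twos` coordinates equal to 2."""
    return comb(n, twos) * 2 ** (n - twos)

n = 5
lines = line_count(n, wildcards=1, const_twos=1)  # lines such as 11112, 11122, 11132
b_points = points_with_twos(n, 1)
c_points = points_with_twos(n, 2)

# Incidences: each B point lies on 4 such lines, each C point on 2,
# and every line carries two B points and one C point.
print(lines, b_points, c_points)
```
<br />
Since each line holds at most two points of the set, summing over the lines gives <math>4B+2C \leq 2 \times 160</math>, which is the stated inequality.<br />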
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
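<br />
The base case m=5 is small enough to confirm by exhaustion. The sketch below is a hedged illustration, not the original authors' method: it encodes each A-point as a 5-bit integer (an encoding chosen here for convenience) and scans all five-point subsets.<br />
<br />
```python
from itertools import combinations

# Encode a point of {1,3}^5 as a 5-bit integer: bit i is set when coordinate i equals 3.
N_POINTS = 32
AT_DISTANCE_TWO = [
    [bin(p ^ q).count("1") == 2 for q in range(N_POINTS)]
    for p in range(N_POINTS)
]

def distance_two_pairs(points):
    """Count unordered pairs of the given points at Hamming distance exactly 2."""
    return sum(AT_DISTANCE_TWO[p][q] for p, q in combinations(points, 2))

# Exhaust all C(32,5) = 201376 five-point subsets (the m=5 base case of the lemma).
fewest = min(distance_two_pairs(s) for s in combinations(range(N_POINTS), 5))

# Four points with no pair at distance 2, showing that the threshold m=5 cannot be lowered.
sharp_example = [0b00000, 0b00001, 0b11110, 0b11111]
print(fewest, distance_two_pairs(sharp_example))
```
<br />
The four-point example corresponds to 11111, 31111, 13333, 33333 under one choice of bit order.<br />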
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
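'''Proof''': Suppose for contradiction that C=80; then D=0, and <math>B \leq 40</math> by Lemma 3. From Lemma 4 we also see that A cannot be 5 or more, since two A-points at Hamming distance two would have their midpoint (a C-point) in the set. As E=F=0, the set then has at most 4+40+80=124 points, a contradiction.<br />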
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 2) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have score adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
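<br />
The pairing claim above is a matter of exact rational arithmetic. A minimal sketch, assuming the score weights <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math> defined earlier and using the type numbers from the list as dictionary keys (the variable names are illustrative):<br />
<br />
```python
from fractions import Fraction
from itertools import combinations_with_replacement

# Weights of a, b, c, d, e in the score of a slice of [3]^4.
WEIGHTS = (Fraction(1), Fraction(5, 4), Fraction(5, 3), Fraction(5, 2), Fraction(5))

def score(stats):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a statistic (a, b, c, d, e)."""
    return sum(w * x for w, x in zip(WEIGHTS, stats))

# The four exceptional statistics, keyed by their type number in the list.
types = {
    1: (2, 16, 24, 0, 0),
    2: (4, 16, 23, 0, 0),
    3: (4, 15, 24, 0, 0),
    4: (3, 16, 24, 0, 0),
}
scores = {t: score(s) for t, s in types.items()}

# Unordered pairs of types whose scores add up to at least 125.
good_pairs = [
    (s, t)
    for s, t in combinations_with_replacement(types, 2)
    if scores[s] + scores[t] >= 125
]
print(scores)
print(good_pairs)
```
<br />
Type 1 appears only together with type 4, and type 2 only together with types 3 and 4, in agreement with the text.<br />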
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then by Lemma 5 the two side slices of every cut have scores adding up to exactly 125, which by the above list implies that one is of Type 1 and the other of Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score of the two side slices in each of those two cuts is at most 63 + 59 7/12 = 122 7/12. On the other hand, in each of the other three cuts the side slices have a net score of at most 63+63 = 126. This averages out to at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has <math>a+b \geq 39</math>. On the other hand, from the n=4 theory we have <math>4a+b \leq 64</math>; subtracting gives <math>3a \leq 25</math>, so <math>a \leq 8</math>, which forces <math>b \geq 31</math>.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus <math>C \geq 5 \times 31/2 = 77.5</math> and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set (the case of 11332 is similar, using the pair (11132,11332)); then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set (its midpoint xy2z2 is a C-point other than 11122, and so lies in the set), thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points, so the total is at most 2 &times; 158 = 316. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>\{1,3\}^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>\{1,3\}^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>\{1,3\}^m</math> with an odd number of 1s<br />
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T13:48:13Z<p>121.220.134.232: /* n=5 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
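<br />
For small n, both the size formula and the line-free property of this construction can be checked directly. The sketch below is illustrative only: the helpers <code>geometric_lines</code>, <code>moser_construction</code> and <code>is_moser</code> are names introduced here, and a geometric line is modelled as a template whose moving coordinates each run 1,2,3 or 3,2,1.<br />
<br />
```python
from itertools import product
from math import comb

def geometric_lines(n):
    """Enumerate the geometric lines of [3]^n, each as a sorted tuple of 3 points."""
    lines = set()
    for template in product((1, 2, 3, "up", "down"), repeat=n):
        if "up" not in template and "down" not in template:
            continue  # no moving coordinate: a single point, not a line
        line = []
        for t in (1, 2, 3):
            line.append(tuple(
                t if s == "up" else (4 - t if s == "down" else s)
                for s in template
            ))
        lines.add(tuple(sorted(line)))  # a template and its reversal give the same line
    return lines

def moser_construction(n):
    """All strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    q = (n + 1) // 3  # nearest integer to n/3
    chosen = set()
    for p in product((1, 2, 3), repeat=n):
        twos, ones = p.count(2), p.count(1)
        if twos == q or (twos == q - 1 and ones % 2 == 1):
            chosen.add(p)
    return q, chosen

def is_moser(points, n):
    """True if no geometric line of [3]^n lies entirely inside `points`."""
    return not any(all(p in points for p in line) for line in geometric_lines(n))

for n in range(1, 6):
    q, s = moser_construction(n)
    print(n, len(s), comb(n + 1, q) * 2 ** (n - q), is_moser(s, n))
```
<br />
The reason the construction works is visible in the code: a line with all three points in the set would need endpoints with q-1 2s differing in a single coordinate, and such endpoints cannot both have an odd number of 1s.<br />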
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 23}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points of <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
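<br />
The n=2 case is small enough that the attainable statistics can be enumerated over all <math>2^9 = 512</math> subsets. The following Python sketch is an illustration written for this page (names such as <code>statistics</code> and <code>dominated</code> are assumptions), listing the eight geometric lines of <math>[3]^2</math> explicitly.<br />
<br />
```python
from itertools import product

POINTS = list(product((1, 2, 3), repeat=2))

# The eight geometric lines of [3]^2: three rows, three columns, two diagonals.
LINES = (
    [[(r, c) for c in (1, 2, 3)] for r in (1, 2, 3)]
    + [[(r, c) for r in (1, 2, 3)] for c in (1, 2, 3)]
    + [[(1, 1), (2, 2), (3, 3)], [(1, 3), (2, 2), (3, 1)]]
)

def statistics(points):
    """Return (a, b, c): the number of points with zero, one and two 2s."""
    counts = [0, 0, 0]
    for p in points:
        counts[p.count(2)] += 1
    return tuple(counts)

# Collect the statistics of every line-free (Moser) subset.
attainable = set()
for mask in range(1 << 9):
    subset = {POINTS[i] for i in range(9) if mask >> i & 1}
    if not any(all(p in subset for p in line) for line in LINES):
        attainable.add(statistics(subset))

def dominated(s, others):
    """True if some other statistic is at least as large as s in every entry."""
    return any(t != s and all(x >= y for x, y in zip(t, s)) for t in others)

pareto = sorted(s for s in attainable if not dominated(s, attainable))
largest = max(sum(s) for s in attainable)
print(largest, pareto)
```
<br />
The Pareto list produced this way includes (3,2,0), which is the midpoint of (4,0,0) and (2,4,0) and therefore not extremal.<br />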
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have at most 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
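<br />
The exhaustion over <math>2^{13}</math> choices can be sketched as follows. This is a hedged reconstruction rather than the original program: it picks one point from each antipodal pair, adds 222, and counts how many of the resulting 14-point sets avoid every geometric line. The helper <code>geometric_lines</code> is a name introduced here.<br />
<br />
```python
from itertools import product

def geometric_lines(n):
    """Enumerate the geometric lines of [3]^n, each as a sorted tuple of 3 points."""
    lines = set()
    for template in product((1, 2, 3, "up", "down"), repeat=n):
        if "up" not in template and "down" not in template:
            continue  # no moving coordinate: a single point, not a line
        line = []
        for t in (1, 2, 3):
            line.append(tuple(
                t if s == "up" else (4 - t if s == "down" else s)
                for s in template
            ))
        lines.add(tuple(sorted(line)))
    return lines

CENTER = (2, 2, 2)
others = [p for p in product((1, 2, 3), repeat=3) if p != CENTER]

# The 13 antipodal pairs {p, 4-p}; each one forms a line together with 222.
pairs = sorted({tuple(sorted((p, tuple(4 - x for x in p)))) for p in others})

# Lines through 222 are avoided automatically, so only the remaining lines matter.
lines = [line for line in geometric_lines(3) if CENTER not in line]

survivors = 0
for choice in product((0, 1), repeat=len(pairs)):
    chosen = {pair[c] for pair, c in zip(pairs, choice)}
    if not any(all(p in chosen for p in line) for line in lines):
        survivors += 1
print(len(pairs), len(lines), survivors)
```
<br />
A survivor count of zero means that no 14-point Moser set contains 222, which is the claim made in the text.<br />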
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
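<br />
Because the score is linear, its maximum over attainable statistics is reached at an extremal statistic, so it is enough to evaluate it on the eight n=3 extremals listed earlier. A minimal sketch in exact arithmetic (the function name <code>slice_score</code> is illustrative):<br />
<br />
```python
from fractions import Fraction

# The extremal statistics (a, b, c, d) for n=3.
EXTREMALS = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]

def slice_score(stats):
    """Score a/4 + b/3 + c/2 + d of a 3D side slice."""
    a, b, c, d = stats
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

best = max(map(slice_score, EXTREMALS))
maximisers = [s for s in EXTREMALS if slice_score(s) == best]

# Eight side slices, each with score at most `best`, bound the size of a 4D set with e=0.
print(best, maximisers, 8 * best)
```
<br />
The bound 8 &times; 44/8 = 44 is what the rest of the proof then rules out.<br />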
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
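<br />
The line counts behind these double-counting arguments can be confirmed by enumerating the geometric lines of <math>[3]^4</math> and grouping them by how many 2s their three points contain. The sketch below is illustrative (the helper <code>geometric_lines</code> is a name introduced here). Note that a right-hand side of 96 corresponds to 48 lines with at most two points each.<br />
<br />
```python
from collections import Counter
from itertools import product

def geometric_lines(n):
    """Enumerate the geometric lines of [3]^n, each as a sorted tuple of 3 points."""
    lines = set()
    for template in product((1, 2, 3, "up", "down"), repeat=n):
        if "up" not in template and "down" not in template:
            continue  # no moving coordinate: a single point, not a line
        line = []
        for t in (1, 2, 3):
            line.append(tuple(
                t if s == "up" else (4 - t if s == "down" else s)
                for s in template
            ))
        lines.add(tuple(sorted(line)))
    return lines

# Key each line by the sorted numbers of 2s in its points,
# e.g. (0, 0, 1) is a line through two "a" points and one "b" point.
line_types = Counter(
    tuple(sorted(p.count(2) for p in line)) for line in geometric_lines(4)
)
print(line_types[(0, 0, 1)], line_types[(0, 0, 2)], line_types[(1, 1, 2)])
```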
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
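<br />
The lifting rule above is mechanical, so the list can be regenerated from the n=3 inequalities. A minimal sketch, assuming the coefficients are given as integer tuples and dividing out the common factor at the end (the function name <code>lift</code> is an illustrative choice):<br />
<br />
```python
from functools import reduce
from math import gcd

def lift(coeffs, bound):
    """Lift alpha*a + beta*b + gamma*c + delta*d <= M from [3]^3 to [3]^4.

    Each "a", "b", "c", "d" point of a 4D set lies in 4, 3, 2, 1 side slices
    respectively, and there are 8 side slices in total.
    """
    lifted = [m * k for m, k in zip((4, 3, 2, 1), coeffs)]
    rhs = 8 * bound
    g = reduce(gcd, lifted + [rhs])  # common factor of all coefficients and the bound
    return tuple(x // g for x in lifted), rhs // g

# The n=3 sharp bounds (the trivial d <= 1 is omitted).
n3_bounds = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56),
    ((6, 2, 3, 6), 48), ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14),
    ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8), ((0, 1, 0, 6), 12),
    ((0, 0, 1, 3), 6),
]
n4_bounds = [lift(c, m) for c, m in n3_bounds]
for coeffs, rhs in n4_bounds:
    print(coeffs, "<=", rhs)
```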
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. Suppose for contradiction that this slice has at least 42 points. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1 and x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the middle slice *2*** has at most 43, leading to at most 41+43+40=124 points in all.<br />
<br />
Now suppose c is less than 17; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Consequently, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. This leaves at most two points with an odd number of 1s, a contradiction. <math>\Box</math><br />
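<br />
The base case m=5 of Lemma 4 can also be checked by exhaustion. The following Python sketch is only an illustration and is not part of the original argument; the bit encoding of A-points and the names <code>DIST2</code> and <code>pairs_at_distance_two</code> are assumptions introduced here.<br />
<br />
```python
from itertools import combinations

# Encode an A-point of [3]^5 (a point with no 2s) as a 5-bit integer:
# bit i is set when coordinate i equals 3 and clear when it equals 1.
A_POINTS = range(32)

# DIST2[p][q] is True when p and q differ in exactly two coordinates.
DIST2 = [[bin(p ^ q).count("1") == 2 for q in A_POINTS] for p in A_POINTS]

def pairs_at_distance_two(points):
    """Count unordered pairs of the given A-points at Hamming distance 2."""
    return sum(DIST2[p][q] for p, q in combinations(points, 2))

# Smallest number of distance-2 pairs over all m-subsets, for m = 4 and m = 5.
min_pairs_m4 = min(pairs_at_distance_two(s) for s in combinations(A_POINTS, 4))
min_pairs_m5 = min(pairs_at_distance_two(s) for s in combinations(A_POINTS, 5))
print(min_pairs_m4, min_pairs_m5)
```
<br />
A value of 0 for m=4 and of at least 1 for m=5 is what the lemma predicts: four A-points can avoid Hamming distance 2, but five cannot.<br />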
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
'''Proof''': Suppose for contradiction that C=80; then D=0 and <math>B \leq 40</math> (by Lemma 3). From Lemma 4 we also see that A is at most 4, so the set has at most 4+40+80=124 points, a contradiction.<br />
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, and 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
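<br />
The scores of the four exceptional statistics, and the pairs of types whose scores reach 125, can be recomputed with exact rational arithmetic. This is a minimal Python sketch; the names <code>score</code>, <code>TYPES</code> and <code>admissible</code> are illustrative and do not come from the spreadsheet.<br />
<br />
```python
from fractions import Fraction

def score(stats):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a slice with statistics (a,b,c,d,e)."""
    a, b, c, d, e = stats
    return a + Fraction(5, 4) * b + Fraction(5, 3) * c + Fraction(5, 2) * d + 5 * e

# The four exceptional statistics, keyed by their type number in the list above.
TYPES = {1: (2, 16, 24, 0, 0), 2: (4, 16, 23, 0, 0),
         3: (4, 15, 24, 0, 0), 4: (3, 16, 24, 0, 0)}
SCORES = {t: score(s) for t, s in TYPES.items()}

# Unordered pairs of types whose scores add up to at least 125.
admissible = [(s, t) for s in TYPES for t in TYPES
              if s <= t and SCORES[s] + SCORES[t] >= 125]
print(SCORES)
print(admissible)
```
<br />
The only admissible pair involving type 1 is (1,4), and type 2 can only be paired with type 3 or type 4.<br />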
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the scores of the two side slices add up to exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score from each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, each of the other three coordinates has a pair of side slices with a total score of at most 63+63 = 126. This averages out to at most 124.633... < 125, a contradiction. <math>\Box</math><br />
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Since 2 &times; 158 = 316, all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
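<br />
The construction behind the second bullet can be checked directly for small m. The Python sketch below is illustrative only (the helper names <code>odd_ones</code> and <code>min_distance</code> are assumptions); it confirms that the points of <math>[1,3]^m</math> with an odd number of 1s number <math>2^{m-1}</math> and pairwise differ in at least two places. It does not by itself show that no larger such set exists.<br />
<br />
```python
from itertools import combinations, product

def odd_ones(m):
    """Points of {1,3}^m with an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]

def min_distance(points):
    """Smallest Hamming distance between two distinct points."""
    return min(sum(x != y for x, y in zip(p, q))
               for p, q in combinations(points, 2))

for m in range(2, 7):
    code = odd_ones(m)
    print(m, len(code), min_distance(code))
```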
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
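<br />
For small n the construction can be built explicitly and tested against every geometric line of <math>[4]^n</math>. The following Python sketch is a hedged illustration: the function names and the template encoding of lines ("+" for a coordinate running 1,2,3,4 and "-" for one running 4,3,2,1) are assumptions made here, and the example only covers n=4, q=2.<br />
<br />
```python
from itertools import product
from math import comb

def moser4_set(n, q):
    """Points of [4]^n with q entries in {2,3}, or q-1 such entries and an odd number of 1s."""
    chosen = set()
    for p in product((1, 2, 3, 4), repeat=n):
        middle = sum(x in (2, 3) for x in p)
        if middle == q or (middle == q - 1 and p.count(1) % 2 == 1):
            chosen.add(p)
    return chosen

def geometric_lines(n, k=4):
    """Yield every geometric line of [k]^n (each one twice, once per direction).

    A template entry is a fixed value, "+" (runs 1..k) or "-" (runs k..1).
    """
    symbols = list(range(1, k + 1)) + ["+", "-"]
    for template in product(symbols, repeat=n):
        if "+" in template or "-" in template:
            yield [tuple(j if t == "+" else k + 1 - j if t == "-" else t
                         for t in template)
                   for j in range(1, k + 1)]

def is_line_free(points, n):
    """True when no geometric line of [4]^n lies entirely inside points."""
    return not any(all(p in points for p in line) for line in geometric_lines(n))

n, q = 4, 2
example = moser4_set(n, q)
print(len(example), comb(n, q) * 2**n + comb(n, q - 1) * 2**(n - 1))
print(is_line_free(example, n))
```
<br />
The two printed sizes should agree, since the first term of the formula counts the points with q entries in {2,3} and the second counts those with q-1 such entries and an odd number of 1s.<br />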
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T13:41:48Z<p>121.220.134.232: /* n=5 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
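<br />
The formula and the construction behind it can be compared for small n. The Python sketch below is illustrative; the names <code>lower_bound</code>, <code>construction</code> and <code>is_moser</code> are assumptions introduced here, and the line test relies on the observation that two points are the ends of a geometric line exactly when every coordinate in which they differ takes the values 1 and 3.<br />
<br />
```python
from itertools import combinations, product
from math import comb

def lower_bound(n):
    """binom(n+1, q) * 2^(n-q) with q the nearest integer to n/3."""
    q = round(n / 3)
    return comb(n + 1, q) * 2 ** (n - q)

def construction(n):
    """Strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    q = round(n / 3)
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(2) == q or (p.count(2) == q - 1 and p.count(1) % 2 == 1)}

def is_moser(points):
    """True when the set contains no geometric line of [3]^n."""
    for p, r in combinations(points, 2):
        # p and r are the ends of a geometric line exactly when every
        # coordinate where they differ takes the values 1 and 3.
        if all(x == y or {x, y} == {1, 3} for x, y in zip(p, r)):
            midpoint = tuple(x if x == y else 2 for x, y in zip(p, r))
            if midpoint in points:
                return False
    return True

for n in range(6):
    s = construction(n)
    print(n, lower_bound(n), len(s), is_moser(s))
```
<br />
For n=5 this reproduces the 120-point lower bound quoted above.<br />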
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
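<br />
As a concrete illustration of this notation, the following Python sketch computes the statistics of a set of strings and of one of its slices. The function names <code>statistics</code> and <code>slice_points</code> are hypothetical helpers introduced here; points are represented as strings over the digits 1, 2, 3.<br />
<br />
```python
def statistics(points, n):
    """Return (a, b, c, ...): entry i counts the points with exactly i 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_points(points, position, value):
    """Points whose coordinate at `position` equals `value`, with that coordinate dropped."""
    return [p[:position] + p[position + 1:] for p in points if p[position] == value]

A = ["11", "12", "22", "32"]
print(statistics(A, 2))                         # (1, 2, 1)
print(statistics(slice_points(A, 0, "1"), 1))   # (1, 1)
```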
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in <math>[3]^n</math>. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
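<br />
Since <math>[3]^2</math> has only <math>2^9 = 512</math> subsets, the n=2 statements can be confirmed by exhaustion. The Python sketch below is an illustration under the conventions of this page; the names <code>attainable</code>, <code>is_pareto_optimal</code> and <code>pareto</code> are assumptions introduced here.<br />
<br />
```python
from itertools import combinations, product

POINTS = list(product((1, 2, 3), repeat=2))

def is_moser(points):
    """True when the set contains no geometric line of [3]^2."""
    for p, r in combinations(points, 2):
        # Ends of a geometric line differ only in coordinates valued 1 and 3.
        if all(x == y or {x, y} == {1, 3} for x, y in zip(p, r)):
            if tuple(x if x == y else 2 for x, y in zip(p, r)) in points:
                return False
    return True

def statistics(points):
    """(a, b, c): numbers of points with zero, one and two 2s."""
    stats = [0, 0, 0]
    for p in points:
        stats[p.count(2)] += 1
    return tuple(stats)

# Statistics of every Moser set in [3]^2.
attainable = set()
for mask in range(2 ** len(POINTS)):
    subset = {p for i, p in enumerate(POINTS) if mask >> i & 1}
    if is_moser(subset):
        attainable.add(statistics(subset))

def is_pareto_optimal(s):
    """True when no other attainable statistic dominates s pointwise."""
    return not any(t != s and all(x >= y for x, y in zip(t, s)) for t in attainable)

pareto = sorted(s for s in attainable if is_pareto_optimal(s))
print(pareto)
print(max(sum(s) for s in attainable))
```
<br />
The Pareto-optimal list contains (3,2,0) in addition to the three extremal statistics, in line with the discussion above.<br />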
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
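<br />
The exhaustion over the <math>2^{13}</math> possibilities is small enough to redo directly. The following Python sketch is one possible way to organise it and is not the original search; the names <code>PAIRS</code>, <code>LINES</code> and <code>moser_14</code> are assumptions. A 14-point candidate consists of 222 together with exactly one point from each antipodal pair.<br />
<br />
```python
from itertools import combinations, product

CUBE = list(product((1, 2, 3), repeat=3))
CENTRE = (2, 2, 2)

def antipode(p):
    """Reflect a point through the centre 222."""
    return tuple(4 - x for x in p)

# The antipodal pairs {p, antipode(p)} of points other than 222, each listed once.
PAIRS = [(p, antipode(p)) for p in CUBE if p < antipode(p)]

# All geometric lines of [3]^3 as (end, midpoint, end).
LINES = []
for p, r in combinations(CUBE, 2):
    if all(x == y or {x, y} == {1, 3} for x, y in zip(p, r)):
        LINES.append((p, tuple(x if x == y else 2 for x, y in zip(p, r)), r))

def contains_line(points):
    return any(all(q in points for q in line) for line in LINES)

# Count the 14-point candidates (222 plus one point per pair) that are Moser sets.
moser_14 = 0
for choice in product((0, 1), repeat=len(PAIRS)):
    candidate = {CENTRE} | {pair[i] for pair, i in zip(PAIRS, choice)}
    if not contains_line(candidate):
        moser_14 += 1
print(len(PAIRS), len(LINES), moser_14)
```
<br />
A final count of zero corresponds to the statement that no 14-point Moser set contains 222.<br />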
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
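<br />
A quick consistency check of the list above: every extremal statistic should satisfy every bound, and every bound should hold with equality at some extremal statistic. The Python sketch below is illustrative; the names <code>EXTREMAL</code>, <code>BOUNDS</code>, <code>violations</code> and <code>tight</code> are introduced here and the data are copied from this section.<br />
<br />
```python
# Extremal statistics (a,b,c,d) for n=3, as listed above.
EXTREMAL = [(3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
            (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0)]

# Each bound is (coefficients of a,b,c,d, right-hand side).
BOUNDS = [((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56),
          ((6, 2, 3, 6), 48), ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14),
          ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8), ((0, 1, 0, 6), 12),
          ((0, 0, 1, 3), 6), ((0, 0, 0, 1), 1)]

def value(coeffs, stats):
    """Left-hand side of a bound evaluated at a statistic."""
    return sum(k * x for k, x in zip(coeffs, stats))

# Pairs (bound, statistic) where the bound fails; expected to be empty.
violations = [(coeffs, s) for coeffs, rhs in BOUNDS for s in EXTREMAL
              if value(coeffs, s) > rhs]

# For each bound, the extremal statistics at which it holds with equality.
tight = {coeffs: [s for s in EXTREMAL if value(coeffs, s) == rhs]
         for coeffs, rhs in BOUNDS}

print(violations)
for coeffs, where in tight.items():
    print(coeffs, where)
```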
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
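<br />
Passing from an inequality in (a,b,c,...) to its normalized form only requires multiplying the i-th coefficient by the number <math>\binom{n}{i} 2^{n-i}</math> of points of that type and cancelling common factors. A minimal Python sketch of this conversion follows; the function name <code>normalize</code> is an assumption made for illustration.<br />
<br />
```python
from functools import reduce
from math import comb, gcd

def normalize(coeffs, bound):
    """Rewrite sum(coeffs[i] * stat[i]) <= bound in terms of the alpha_i.

    stat[i] = alpha_i * binom(n, i) * 2^(n-i), where n = len(coeffs) - 1.
    Returns the alpha coefficients and the bound, divided by their gcd.
    """
    n = len(coeffs) - 1
    scaled = [x * comb(n, i) * 2 ** (n - i) for i, x in enumerate(coeffs)]
    g = reduce(gcd, scaled + [bound])
    return tuple(x // g for x in scaled), bound // g

print(normalize((2, 1, 2, 4), 22))   # ((8, 6, 6, 2), 11)
```
<br />
The example reproduces the first normalized inequality of this section from <math>2a+b+2c+4d \leq 22</math>.<br />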
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
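<br />
The claim about the largest score can be recomputed from the Pareto-optimal statistics of the n=3 section. Because the score has non-negative coefficients, its maximum over all attainable statistics is reached at a Pareto-optimal one. The Python sketch below is illustrative; the names <code>PARETO</code> and <code>slice_score</code> are introduced here.<br />
<br />
```python
from fractions import Fraction

# Pareto-optimal statistics (a,b,c,d) of Moser sets in [3]^3, from the n=3 section.
PARETO = [(3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0),
          (4, 4, 5, 0), (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0),
          (5, 4, 3, 0), (4, 9, 2, 0), (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0),
          (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0), (5, 9, 0, 0), (6, 6, 0, 0),
          (7, 3, 0, 0), (8, 0, 0, 0)]

def slice_score(stats):
    """Score a/4 + b/3 + c/2 + d of a three-dimensional slice."""
    a, b, c, d = stats
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

best_score = max(slice_score(s) for s in PARETO)
best = [s for s in PARETO if slice_score(s) == best_score]
print(best, best_score, 8 * best_score)
```
<br />
Eight side slices at the maximal score would give exactly 44 points, which is why a 44-point set would need every side slice to have the maximising statistics.<br />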
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
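<br />
The line counts used in these double-counting arguments can be tabulated by enumerating the geometric lines of <math>[3]^n</math> and recording how many 2s the endpoints and the midpoint contain. This Python sketch is an illustration only; the function name <code>line_types</code> and the pair-of-counts key are assumptions made here.<br />
<br />
```python
from collections import Counter
from itertools import combinations, product

def line_types(n):
    """Count geometric lines of [3]^n by (2s in each endpoint, 2s in the midpoint)."""
    counts = Counter()
    for p, r in combinations(product((1, 2, 3), repeat=n), 2):
        # p and r are the ends of a line when they differ only in 1-versus-3 coordinates.
        if all(x == y or {x, y} == {1, 3} for x, y in zip(p, r)):
            wildcards = sum(x != y for x, y in zip(p, r))
            counts[(p.count(2), p.count(2) + wildcards)] += 1
    return counts

counts4 = line_types(4)
# Key (0, 1): two "a" points and a "b" point; (0, 2): two "a" points and a "c" point;
# (1, 2): two "b" points and a "c" point.
print(counts4[(0, 1)], counts4[(0, 2)], counts4[(1, 2)])
```
<br />
Each inequality then follows by summing "at most two points per line" over the lines of one type and dividing by the number of such lines through a point of each kind.<br />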
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
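<br />
The passage from a three-dimensional inequality to its four-dimensional consequence is mechanical, so it can be scripted. The following Python sketch applies the rule stated above (weights 4, 3, 2, 1 on the coefficients and a factor 8 on the right-hand side) and then cancels common factors; the function name <code>lift</code> is an assumption made for illustration.<br />
<br />
```python
from functools import reduce
from math import gcd

def lift(coeffs, bound):
    """Lift alpha*a + beta*b + gamma*c + delta*d <= M from [3]^3 to [3]^4.

    Each "a" point lies in 4 side slices, each "b" point in 3, each "c" point
    in 2 and each "d" point in 1, and there are 8 side slices in all.
    """
    weights = (4, 3, 2, 1)
    lifted = [w * x for w, x in zip(weights, coeffs)]
    rhs = 8 * bound
    g = reduce(gcd, lifted + [rhs])
    return tuple(x // g for x in lifted), rhs // g

print(lift((2, 1, 2, 4), 22))   # ((8, 3, 4, 4), 176)
```
<br />
The example lifts <math>2a+b+2c+4d \leq 22</math> to <math>8a+3b+4c+4d \leq 176</math>, the first inequality in the list.<br />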
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arise. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged becomes <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x2y2, 2xy22, or 2x22y, where the x, y denote 1 or 3. This gives 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the middle slice has at most 43, leading to at most 41+43+40=124 points in all.<br />
<br />
Now suppose c is at most 16; then by the n=4 theory (every slice with at least 42 points has c=12 or c at least 17) a middle slice with at least 42 points must be one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Consequently, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points as needed, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
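<br />
The incidence counts used in this double-counting argument can be checked by direct enumeration. The following is a minimal sketch, not part of the original argument; the helper names <code>geometric_lines</code> and <code>twos</code> are our own, and a geometric line is modelled as a midpoint together with a non-zero direction in {-1,0,1}^n that only moves coordinates where the midpoint equals 2.<br />

```python
from collections import Counter
from itertools import product

def geometric_lines(n):
    """All geometric lines of [3]^n, each as a frozenset of three points."""
    lines = set()
    for d in product((-1, 0, 1), repeat=n):
        if not any(d):
            continue  # the zero direction does not give a line
        for q in product((1, 2, 3), repeat=n):
            # the midpoint must equal 2 in every moving coordinate
            if all(qi == 2 for qi, di in zip(q, d) if di):
                p = tuple(qi - di for qi, di in zip(q, d))
                r = tuple(qi + di for qi, di in zip(q, d))
                lines.add(frozenset((p, q, r)))  # d and -d give the same set
    return lines

def twos(p):
    """Number of coordinates of p equal to 2."""
    return p.count(2)

# Lines whose three points are B, C, B (one, two, one 2s respectively).
bcb_lines = [L for L in geometric_lines(5)
             if sorted(twos(p) for p in L) == [1, 1, 2]]
incidence = Counter(p for L in bcb_lines for p in L)
b_degrees = {incidence[p] for p in incidence if twos(p) == 1}
c_degrees = {incidence[p] for p in incidence if twos(p) == 2}
print(len(bcb_lines), b_degrees, c_degrees)
```

With each line holding at most two points of the set, summing over lines gives 4B+2C at most 320, which is the stated inequality after dividing by two.<br />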
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
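<br />
The base case m=5 is small enough to confirm by brute force. This is only an illustrative sketch under our own encoding (each A-point is a 5-bit integer, with a set bit standing for a 3, so Hamming distance is the popcount of the XOR); the function and variable names are hypothetical.<br />

```python
from itertools import combinations

def has_distance_two_pair(points):
    """True if some two points differ in exactly two coordinates."""
    return any(bin(p ^ q).count("1") == 2 for p, q in combinations(points, 2))

# Every choice of five A-points contains a pair at Hamming distance two ...
every_five_has_pair = all(has_distance_two_pair(s)
                          for s in combinations(range(32), 5))
# ... whereas four A-points can avoid such a pair, so m=5 is the threshold.
some_four_has_none = any(not has_distance_two_pair(s)
                         for s in combinations(range(32), 4))
print(every_five_has_pair, some_four_has_none)
```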
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
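'''Proof''': Suppose for contradiction that C=80; then D=0 and <math>B \leq 40</math> (by Lemma 3). Since every C-point lies in the set, no two A-points can be at Hamming distance two, so from Lemma 4 we also see that A cannot exceed 4. This gives at most 4+40+80=124 points, a contradiction.<br />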
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have score adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
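<br />
The scores in the list and the admissible pairings can be recomputed with exact rational arithmetic. A minimal sketch follows; the type numbering 1 to 4 follows the list above, while the function and variable names are our own.<br />

```python
from fractions import Fraction

def score(a, b, c, d, e):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a four-dimensional slice."""
    return (a + Fraction(5, 4) * b + Fraction(5, 3) * c
            + Fraction(5, 2) * d + 5 * e)

slice_types = {1: (2, 16, 24, 0, 0), 2: (4, 16, 23, 0, 0),
               3: (4, 15, 24, 0, 0), 4: (3, 16, 24, 0, 0)}
scores = {t: score(*s) for t, s in slice_types.items()}

# Unordered pairs of types whose scores add up to at least 125.
admissible_pairs = [(i, j) for i in slice_types for j in slice_types
                    if i <= j and scores[i] + scores[j] >= 125]
print(scores)
print(admissible_pairs)
```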
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the scores of the two side slices add up to exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total side-slice score for each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, for each of the other three coordinates the two side slices have a total score of at most 63+63 = 126. The average over the five coordinates is thus at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
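<br />
The averaging step can be reproduced exactly. The sketch below only redoes the arithmetic with the bounds quoted in the proof (63 and 59 7/12); the variable names are our own.<br />

```python
from fractions import Fraction

best_score = 63                      # largest score of any side slice
d1_score = 59 + Fraction(7, 12)      # bound quoted for slices with d=1

bad_cut = best_score + d1_score      # a cut containing a d=1 side slice
good_cut = best_score + best_score   # any other cut

# Two cuts of the first kind, three of the second, averaged over five cuts.
average = (2 * bad_cut + 3 * good_cut) / 5
print(bad_cut, average, float(average))
```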
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3 and Lemma 7 the center slice has c=d=e=0, thus the center slice has <math>a+b \geq 39</math>. On the other hand, from the n=4 theory we have <math>4a+b \leq 64</math>, which forces <math>b \geq 31</math>.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus <math>C \geq 5 \times 31/2 = 77.5</math> and the claim follows. <math>\Box</math><br />
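<br />
The small case analysis behind this proof can be listed mechanically. A minimal sketch, assuming only the constraints quoted above (a at most 16, b at most 32, <math>a+b \geq 39</math>, <math>4a+b \leq 64</math>); the variable names are ours.<br />

```python
# All (a, b) with c = d = e = 0 compatible with a center slice of 39+ points.
feasible = [(a, b) for a in range(17) for b in range(33)
            if a + b >= 39 and 4 * a + b <= 64]
min_b = min(b for _, b in feasible)

# 2C is the sum of b over the five center slices, so C >= ceil(5 * min_b / 2).
c_lower_bound = -(-5 * min_b // 2)
print(feasible, min_b, c_lower_bound)
```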
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following tuples:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). Summing the number of points of the set on each such line (counting multiplicity) gives 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Since 158 × 2 = 316, all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). Summing the number of points of the set on each such line (counting multiplicity) gives 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Since 156 × 2 = 312, all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
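<br />
The second bullet can be checked for small m by enumeration. This is a sketch only, with hypothetical helper names; it verifies that the points with an odd number of 1s number <math>2^{m-1}</math> and pairwise differ in at least two places, not that no larger such set exists.<br />

```python
from itertools import combinations, product

def odd_ones_code(m):
    """Points of {1,3}^m with an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]

def min_distance(points):
    """Smallest Hamming distance between two distinct points (None if < 2 points)."""
    return min((sum(x != y for x, y in zip(p, q))
                for p, q in combinations(points, 2)), default=None)

sizes = {m: len(odd_ones_code(m)) for m in range(1, 7)}
distances = {m: min_distance(odd_ones_code(m)) for m in range(1, 7)}
print(sizes)
print(distances)
```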
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T11:53:39Z<p>121.220.134.232: /* n=2 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
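<br />
For small n the construction can be built explicitly and tested against every geometric line. A minimal sketch, assuming the same midpoint-and-direction model of geometric lines as in the definition; the names <code>geometric_lines</code>, <code>construction</code> and <code>is_moser</code> are ours.<br />

```python
from itertools import product
from math import comb

def geometric_lines(n):
    """All geometric lines of [3]^n, each as a frozenset of three points."""
    lines = set()
    for d in product((-1, 0, 1), repeat=n):
        if not any(d):
            continue
        for q in product((1, 2, 3), repeat=n):
            if all(qi == 2 for qi, di in zip(q, d) if di):
                p = tuple(qi - di for qi, di in zip(q, d))
                r = tuple(qi + di for qi, di in zip(q, d))
                lines.add(frozenset((p, q, r)))
    return lines

def construction(n):
    """Strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    q = round(n / 3)
    return {p for p in product((1, 2, 3), repeat=n)
            if p.count(2) == q
            or (p.count(2) == q - 1 and p.count(1) % 2 == 1)}

def is_moser(points, n):
    """True if the set contains no geometric line."""
    return not any(line <= points for line in geometric_lines(n))

results = {}
for n in range(1, 6):
    q = round(n / 3)
    s = construction(n)
    results[n] = (len(s), comb(n + 1, q) * 2 ** (n - q), is_moser(s, n))
print(results)
```

For each n the first two entries agree, which reflects the identity <math>\binom{n}{q} + \binom{n}{q-1} = \binom{n+1}{q}</math>.<br />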
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
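<br />
The example in this paragraph can be reproduced with a few lines of code. This is a sketch with hypothetical helper names; points are written as strings of digits, as in the text.<br />

```python
def statistics(points, n):
    """Tuple (a, b, c, ...) counting points by their number of 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_points(points, position, value):
    """Points with the given coordinate fixed, with that coordinate dropped."""
    return {p[:position] + p[position + 1:]
            for p in points if p[position] == value}

A = {"11", "12", "22", "32"}
full_stats = statistics(A, 2)
side_stats = statistics(slice_points(A, 0, "1"), 1)
print(full_stats, side_stats)
```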
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
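<br />
Since <math>[3]^2</math> has only 512 subsets, the n=2 claims can be confirmed by exhaustive search. The following is a sketch under our own naming (<code>geometric_lines</code>, <code>stats</code>, <code>pareto</code>), using the midpoint-and-direction description of geometric lines.<br />

```python
from itertools import combinations, product

def geometric_lines(n):
    """All geometric lines of [3]^n, each as a frozenset of three points."""
    lines = set()
    for d in product((-1, 0, 1), repeat=n):
        if not any(d):
            continue
        for q in product((1, 2, 3), repeat=n):
            if all(qi == 2 for qi, di in zip(q, d) if di):
                p = tuple(qi - di for qi, di in zip(q, d))
                r = tuple(qi + di for qi, di in zip(q, d))
                lines.add(frozenset((p, q, r)))
    return lines

def stats(s):
    """Statistics (a, b, c) of a subset of [3]^2."""
    counts = [0, 0, 0]
    for p in s:
        counts[p.count(2)] += 1
    return tuple(counts)

points = list(product((1, 2, 3), repeat=2))
lines = geometric_lines(2)
moser_sets = [frozenset(s) for r in range(10)
              for s in combinations(points, r)
              if not any(line <= frozenset(s) for line in lines)]

attainable = {stats(s) for s in moser_sets}
max_size = max(len(s) for s in moser_sets)
extremisers = [s for s in moser_sets if len(s) == max_size]
# A statistic is Pareto-optimal if no other attainable one dominates it.
pareto = {t for t in attainable
          if not any(u != t and all(x >= y for x, y in zip(u, t))
                     for u in attainable)}
inequalities_hold = all(2 * a + b + 2 * c <= 8 and b + 2 * c <= 4
                        and a + 2 * c <= 4 and c <= 1
                        for a, b, c in attainable)
print(max_size, len(extremisers), sorted(pareto), inequalities_hold)
```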
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities <br />
*<math>c22+c2x+cxx \le 4</math>, <br />
*<math>c22+cx2+cxx \le 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
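<br />
The exhaustion over the <math>2^{13}</math> possibilities is easy to sketch. The code below is an illustrative reconstruction, not the search originally used; the helper names are ours. A 14-point set containing 222 must take exactly one point from each antipodal pair, so it is enough to try every such choice.<br />

```python
from itertools import product

def geometric_lines(n):
    """All geometric lines of [3]^n, each as a frozenset of three points."""
    lines = set()
    for d in product((-1, 0, 1), repeat=n):
        if not any(d):
            continue
        for q in product((1, 2, 3), repeat=n):
            if all(qi == 2 for qi, di in zip(q, d) if di):
                p = tuple(qi - di for qi, di in zip(q, d))
                r = tuple(qi + di for qi, di in zip(q, d))
                lines.add(frozenset((p, q, r)))
    return lines

lines = geometric_lines(3)
center = (2, 2, 2)

def antipode(p):
    return tuple(4 - x for x in p)

# One entry per antipodal pair {p, antipode(p)}, excluding the center.
pairs = [(p, antipode(p)) for p in product((1, 2, 3), repeat=3)
         if p < antipode(p)]

count_14 = 0  # number of line-free 14-point sets containing 222
for choice in product((0, 1), repeat=len(pairs)):
    s = {center} | {pair[c] for pair, c in zip(pairs, choice)}
    if not any(line <= s for line in lines):
        count_14 += 1
print(len(pairs), count_14)
```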
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), d is at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), c is at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
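<br />
The numbers of lines of each type in <math>[3]^4</math> can be tallied directly. A minimal sketch with our own helper name <code>geometric_lines</code>; each line is classified by the sorted numbers of 2s of its three points, so (0,0,1) is an a-b-a line, (0,0,2) an a-c-a line and (1,1,2) a b-c-b line.<br />

```python
from collections import Counter
from itertools import product

def geometric_lines(n):
    """All geometric lines of [3]^n, each as a frozenset of three points."""
    lines = set()
    for d in product((-1, 0, 1), repeat=n):
        if not any(d):
            continue
        for q in product((1, 2, 3), repeat=n):
            if all(qi == 2 for qi, di in zip(q, d) if di):
                p = tuple(qi - di for qi, di in zip(q, d))
                r = tuple(qi + di for qi, di in zip(q, d))
                lines.add(frozenset((p, q, r)))
    return lines

all_lines = geometric_lines(4)
line_types = Counter(tuple(sorted(p.count(2) for p in L)) for L in all_lines)
print(len(all_lines))
print(line_types[(0, 0, 1)], line_types[(0, 0, 2)], line_types[(1, 1, 2)])
```

Each line contributes at most two points, so a type with N lines bounds the corresponding weighted count by 2N.<br />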
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
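<br />
The lifting rule can be applied mechanically to the n=3 sharp bounds. The sketch below encodes each three-dimensional inequality as a tuple (coefficient of a, of b, of c, of d, right-hand side), applies the map described above, and divides out the common factor; the names are our own.<br />

```python
from functools import reduce
from math import gcd

# n=3 inequalities: alpha*a + beta*b + gamma*c + delta*d <= M
three_dim = [
    (2, 1, 2, 4, 22), (3, 2, 3, 6, 36), (7, 2, 4, 8, 56), (6, 2, 3, 6, 48),
    (0, 1, 1, 3, 12), (1, 0, 2, 4, 14), (5, 0, 4, 8, 40), (1, 0, 0, 4, 8),
    (0, 1, 0, 6, 12), (0, 0, 1, 3, 6),
]

def lift(inequality):
    """Sum an n=3 inequality over the eight side slices of [3]^4."""
    alpha, beta, gamma, delta, bound = inequality
    lifted = (4 * alpha, 3 * beta, 2 * gamma, delta, 8 * bound)
    g = reduce(gcd, lifted)  # remove the common factor
    return tuple(x // g for x in lifted)

four_dim = [lift(i) for i in three_dim]
for row in four_dim:
    print(row)
```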
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arise. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged becomes <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x22y, 2x2y2, or 2xy22, where the x, y denote 1 or 3. This gives at least 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+43+40=124 points in all, contradicting the assumption that the set has 125 points.<br />
<br />
Now suppose c is at most 16, and suppose for contradiction that the 2**** slice has at least 42 points; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122 21212 21221 23322 23232 23223. Hence the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points each. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points, a contradiction; but if it has 43 points, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
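<br />
The three counts used in this proof (160 lines, two per "C" point, four per "B" point) can be checked by direct enumeration. The following Python sketch is only an illustration; the helper name bcb_lines is ours and not part of the original argument.<br />
<br />
```python
from itertools import product

def bcb_lines():
    """Lines of [3]^5 with a single wildcard whose four fixed coordinates contain exactly one 2."""
    lines = []
    for pos in range(5):                          # position of the wildcard
        for fixed in product((1, 2, 3), repeat=4):
            if fixed.count(2) != 1:
                continue
            # Insert the running coordinate t = 1, 2, 3 at the wildcard position.
            lines.append([fixed[:pos] + (t,) + fixed[pos:] for t in (1, 2, 3)])
    return lines

lines = bcb_lines()
b_point = (1, 1, 1, 1, 2)                         # a sample "B" point (one 2)
c_point = (1, 1, 1, 2, 2)                         # a sample "C" point (two 2s)
lines_through_b = sum(b_point in line for line in lines)
lines_through_c = sum(c_point in line for line in lines)
print(len(lines), lines_through_b, lines_through_c)
```
<br />
Summing "at most two points per line" over all lines gives <math>4B+2C \leq 320</math>, which is the claimed inequality.<br />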
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. Thus at most two of the points have an odd number of 1s, a contradiction. <math>\Box</math><br />
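<br />
The base case m=5 is small enough to confirm by brute force over all 5-element sets of A-points. The sketch below is an independent check, not the argument above; encoding each A-point as a 5-bit integer is our own choice.<br />
<br />
```python
from itertools import combinations

# Encode an A-point as a 5-bit integer: a bit is set when that coordinate equals 3.
corners = range(32)

def hamming(p, q):
    return bin(p ^ q).count("1")

def has_distance_two_pair(points):
    return any(hamming(p, q) == 2 for p, q in combinations(points, 2))

# Every 5-element set of A-points contains a pair at Hamming distance two ...
lemma_holds_for_m5 = all(
    has_distance_two_pair(subset) for subset in combinations(corners, 5)
)
# ... while four points can avoid such pairs (all 1s, all 3s, and one neighbour of each).
four_point_example = (0b00000, 0b11111, 0b00001, 0b11110)
print(lemma_holds_for_m5, has_distance_two_pair(four_point_example))
```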
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
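'''Proof''': Suppose for contradiction that C=80; then D=0 and, by Lemma 3, <math>B \leq 40</math>. Since every C-point lies in the set, no two A-points can be at Hamming distance two, so Lemma 4 shows that A is at most 4. As E=F=0, the set then has at most 4+40+80=124 points, a contradiction.<br />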
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must both come from the above list (each score is at most 63, so a total of at least 125 requires each score to be at least 62). Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
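<br />
The four listed scores, and the pairs of types whose scores reach 125, can be recomputed with exact rational arithmetic. This is a small sketch; the function name score and the dictionary layout are ours.<br />
<br />
```python
from fractions import Fraction
from itertools import combinations_with_replacement

def score(stats):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a four-dimensional slice."""
    a, b, c, d, e = stats
    return a + Fraction(5, 4) * b + Fraction(5, 3) * c + Fraction(5, 2) * d + 5 * e

# The four exceptional statistics, keyed by their type number in the list above.
types = {1: (2, 16, 24, 0, 0), 2: (4, 16, 23, 0, 0),
         3: (4, 15, 24, 0, 0), 4: (3, 16, 24, 0, 0)}
scores = {t: score(s) for t, s in types.items()}

# Unordered pairs of types whose scores add up to at least 125.
admissible = [(s, t) for s, t in combinations_with_replacement(types, 2)
              if scores[s] + scores[t] >= 125]
print(scores)
print(admissible)
```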
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the scores of the two side slices add up to exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score of the two side slices in each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, in each of the other three coordinates the two side slices have a total score of at most 63+63 = 126. This averages out to at most 124.633... < 125, a contradiction. <math>\Box</math><br />
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3 and Lemma 7 the center slice has c=d=e=0, thus the center slice has <math>a+b \geq 39</math>. On the other hand, from the n=4 theory we have <math>4a+b \leq 64</math>, which forces <math>b \geq 31</math>.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus <math>C \geq 5 \times 31/2 = 77.5</math> and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose without loss of generality that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set (since its midpoint xy2z2 is a C-point other than 11122 and so lies in the set), thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of each of the forms xyz2w, xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
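<br />
The second bullet can be checked for small m by listing the points with an odd number of 1s and computing their minimum pairwise Hamming distance. This sketch only verifies that this particular set has the stated size and separation; it does not show that no larger set exists. The function names are ours.<br />
<br />
```python
from itertools import combinations, product

def odd_ones_code(m):
    """Points of {1,3}^m with an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]

def min_distance(code):
    """Smallest Hamming distance between two distinct points of the code."""
    return min(sum(x != y for x, y in zip(p, q)) for p, q in combinations(code, 2))

for m in range(2, 7):
    code = odd_ones_code(m)
    print(m, len(code), min_distance(code))
```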
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
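<br />
For small even n the construction can be generated explicitly and tested against every geometric line of <math>[4]^n</math>. The sketch below assumes that a geometric line is given by a template in which each wildcard position runs either as 1,2,3,4 or as 4,3,2,1; all function names are ours.<br />
<br />
```python
from itertools import product
from math import comb

def moser4_construction(n, q):
    """Points with q entries in {2,3}, plus points with q-1 such entries and an odd number of 1s."""
    points = set()
    for p in product((1, 2, 3, 4), repeat=n):
        middle = sum(v in (2, 3) for v in p)
        if middle == q or (middle == q - 1 and p.count(1) % 2 == 1):
            points.add(p)
    return points

def geometric_lines4(n):
    """Yield the geometric lines of [4]^n (each line appears twice, once per orientation)."""
    for template in product((1, 2, 3, 4, "x", "y"), repeat=n):
        if "x" in template or "y" in template:
            yield [tuple(t if s == "x" else 5 - t if s == "y" else s for s in template)
                   for t in (1, 2, 3, 4)]

def is_line_free(points, n):
    return not any(all(p in points for p in line) for line in geometric_lines4(n))

for n in (2, 4):
    q = n // 2
    pts = moser4_construction(n, q)
    formula = comb(n, q) * 2 ** n + comb(n, q - 1) * 2 ** (n - 1)
    print(n, len(pts), formula, is_line_free(pts, n))
```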
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T11:53:00Z<p>121.220.134.232: /* n=2 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
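<br />
The closed form can be compared with a direct count of the strings in the construction. In this sketch the nearest integer to n/3 is computed as (n+1)//3, and both function names are ours.<br />
<br />
```python
from itertools import product
from math import comb

def moser_lower_bound(n):
    q = (n + 1) // 3                  # nearest integer to n/3
    return comb(n + 1, q) * 2 ** (n - q)

def construction_size(n):
    """Count strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    q = (n + 1) // 3
    size = 0
    for p in product((1, 2, 3), repeat=n):
        twos = p.count(2)
        if twos == q or (twos == q - 1 and p.count(1) % 2 == 1):
            size += 1
    return size

for n in range(1, 7):
    print(n, moser_lower_bound(n), construction_size(n))
```
<br />
For n=5 both functions return 120, the value quoted above.<br />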
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
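<br />
As an illustration of this notation, the statistics of a set and of one of its slices can be computed as follows. Points are written as strings over the digits 1, 2, 3 and a slice as a pattern such as "1*"; the function names are ours.<br />
<br />
```python
def statistics(points, n):
    """Return (a, b, c, ...): entry i counts the points with exactly i 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_statistics(points, pattern):
    """Statistics of the slice given by pattern, after dropping the fixed coordinates."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    in_slice = [p for p in points
                if all(ch == "*" or ch == pc for ch, pc in zip(pattern, p))]
    reduced = ["".join(p[i] for i in free) for p in in_slice]
    return statistics(reduced, len(free))

A = ["11", "12", "22", "32"]
print(statistics(A, 2))            # statistics (a, b, c) of the example above
print(slice_statistics(A, "1*"))   # statistics (a(1*), b(1*)) of the slice 1*
```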
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3; for n=4, the 81 points then get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
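<br />
The grouping for n=4 can be reproduced in a few lines. The helper names c_group and c_statistics are ours.<br />
<br />
```python
from collections import Counter
from itertools import product

def c_group(point):
    """Replace every 1 or 3 by the letter x, keeping the 2s."""
    return "".join("2" if ch == "2" else "x" for ch in point)

def c_statistics(points):
    """c(w) for every group w that meets the given set of points."""
    return Counter(c_group(p) for p in points)

cube = ["".join(p) for p in product("123", repeat=4)]
group_sizes = c_statistics(cube)
print(len(group_sizes), group_sizes["2222"], group_sizes["xx22"], group_sizes["xxxx"])
```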
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. by {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
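<br />
For n=2 the whole analysis fits in an exhaustive search over the <math>2^9</math> subsets of the square. The sketch below is an independent check; it assumes geometric lines are described by templates in which a wildcard runs either as 1,2,3 or as 3,2,1, and the function names are ours.<br />
<br />
```python
from itertools import product

def geometric_lines(n):
    """All geometric lines of {1,2,3}^n, each as a frozenset of three points."""
    lines = set()
    # Template symbols: 1, 2, 3 are fixed; 'x' runs over 1,2,3; 'y' runs over 3,2,1.
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        lines.add(frozenset(
            tuple(t if s == "x" else 4 - t if s == "y" else s for s in template)
            for t in (1, 2, 3)
        ))
    return lines

def statistics(points, n):
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count(2)] += 1
    return tuple(stats)

def attainable_statistics(n):
    """Statistics of all Moser sets in {1,2,3}^n (feasible only for very small n)."""
    cube = list(product((1, 2, 3), repeat=n))
    lines = geometric_lines(n)
    found = set()
    for mask in range(1 << len(cube)):
        subset = {p for i, p in enumerate(cube) if mask >> i & 1}
        if not any(line <= subset for line in lines):
            found.add(statistics(subset, n))
    return found

def pareto_optimal(stats):
    def dominated(s):
        return any(t != s and all(u >= v for u, v in zip(t, s)) for t in stats)
    return sorted(s for s in stats if not dominated(s))

attainable = attainable_statistics(2)
print(pareto_optimal(attainable))
print(max(sum(s) for s in attainable))
```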
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1,1,1,2), (0,2,2,2), (0,0,0,4), which are covered by the linear inequalities<br />
*<math>c22+c2x+cxx \leq 4</math>,<br />
*<math>c22+cx2+cxx \leq 4</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
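<br />
The exhaustive check over the <math>2^{13}</math> possibilities can be sketched as follows: pick one point from each antipodal pair, add 222, and test for geometric lines. This is our own reconstruction of such a search, not the code that was originally used; the function names are ours.<br />
<br />
```python
from itertools import product

def geometric_lines(n):
    """All geometric lines of {1,2,3}^n, each as a frozenset of three points."""
    lines = set()
    # 'x' runs over 1,2,3 and 'y' over 3,2,1; other symbols are fixed coordinates.
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        lines.add(frozenset(
            tuple(t if s == "x" else 4 - t if s == "y" else s for s in template)
            for t in (1, 2, 3)
        ))
    return lines

def antipode(p):
    return tuple(4 - v for v in p)

def count_14_point_moser_sets_with_centre():
    lines = geometric_lines(3)
    # One representative of each of the 13 antipodal pairs.
    reps = [p for p in product((1, 2, 3), repeat=3) if p < antipode(p)]
    found = 0
    for choice in product((0, 1), repeat=len(reps)):
        subset = {(2, 2, 2)}
        for p, flip in zip(reps, choice):
            subset.add(antipode(p) if flip else p)
        if not any(line <= subset for line in lines):
            found += 1
    return found

print(count_14_point_moser_sets_with_centre())
```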
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
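<br />
As a consistency check, each of the Pareto-optimal statistics listed earlier can be tested against these eleven bounds. The sketch below hard-codes both lists as given on this page; it checks that every bound holds and that every bound is attained with equality by at least one statistic.<br />
<br />
```python
pareto = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0), (4, 4, 5, 0),
    (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0), (5, 4, 3, 0), (4, 9, 2, 0),
    (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0), (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0),
    (5, 9, 0, 0), (6, 6, 0, 0), (7, 3, 0, 0), (8, 0, 0, 0),
]

# Each entry is (coefficients of a, b, c, d; right-hand side).
inequalities = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56), ((6, 2, 3, 6), 48),
    ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14), ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8),
    ((0, 1, 0, 6), 12), ((0, 0, 1, 3), 6), ((0, 0, 0, 1), 1),
]

def lhs(coeffs, stats):
    return sum(k * v for k, v in zip(coeffs, stats))

all_hold = all(lhs(c, s) <= m for c, m in inequalities for s in pareto)
all_tight = all(any(lhs(c, s) == m for s in pareto) for c, m in inequalities)
print(all_hold, all_tight)
```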
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
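<br />
The claim about the largest score can be recomputed from the extremal statistics of the n=3 section, since a linear score is maximised at an extremal statistic. A sketch with exact fractions follows; the function name slice_score is ours.<br />
<br />
```python
from fractions import Fraction

def slice_score(stats):
    """Score a/4 + b/3 + c/2 + d of a three-dimensional slice."""
    a, b, c, d = stats
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

# Extremal statistics for n=3, as listed in the n=3 section.
extremal = [(3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
            (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0)]

best = max(extremal, key=slice_score)
maximisers = [s for s in extremal if slice_score(s) == slice_score(best)]
print(best, slice_score(best), 8 * slice_score(best), maximisers)
```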
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
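<br />
The line counts behind these inequalities can be tabulated by enumerating all geometric lines of <math>[3]^4</math> and recording how many 2s each of the three points has. In this sketch a line of type (0,1,0) joins two "a" points through a "b" point, and so on; the function name is ours.<br />
<br />
```python
from collections import Counter
from itertools import product

def line_type_counts(n):
    """Count geometric lines of {1,2,3}^n by the number of 2s in (end, middle, end)."""
    seen = set()
    counts = Counter()
    # 'x' runs over 1,2,3 and 'y' over 3,2,1; other symbols are fixed coordinates.
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        if "x" not in template and "y" not in template:
            continue
        pts = tuple(
            tuple(t if s == "x" else 4 - t if s == "y" else s for s in template)
            for t in (1, 2, 3)
        )
        key = frozenset(pts)
        if key in seen:          # the same line with x and y swapped
            continue
        seen.add(key)
        counts[tuple(p.count(2) for p in pts)] += 1
    return counts

counts = line_type_counts(4)
print(counts[(0, 1, 0)], counts[(0, 2, 0)], counts[(1, 2, 1)], sum(counts.values()))
```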
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
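<br />
The passage from a three-dimensional inequality to its four-dimensional consequence is mechanical: multiply the coefficients by 4, 3, 2, 1, multiply the bound by 8, and cancel common factors. A sketch follows; the function name lift_to_4d is ours.<br />
<br />
```python
from functools import reduce
from math import gcd

def lift_to_4d(coeffs, bound):
    """Map alpha*a + beta*b + gamma*c + delta*d <= M in 3D to its 4D consequence, in lowest terms."""
    alpha, beta, gamma, delta = coeffs
    lifted = (4 * alpha, 3 * beta, 2 * gamma, delta)
    rhs = 8 * bound
    g = reduce(gcd, lifted + (rhs,))
    return tuple(v // g for v in lifted), rhs // g

print(lift_to_4d((2, 1, 2, 4), 22))
print(lift_to_4d((7, 2, 4, 8), 56))
print(lift_to_4d((6, 2, 3, 6), 48))
```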
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice; suppose for contradiction that it has at least 42 points. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x22y, 2x2y2, or 2xy22, where the x, y denote 1 or 3. This gives at least 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1 and x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs five; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+43+40=124 points in all, a contradiction.<br />
<br />
Now suppose c is at most 16; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Consequently, the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points, a contradiction; but if it has 43 points, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the opposite side slice has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
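The incidence counts used in this double-counting argument can be checked by enumerating the geometric lines of <math>[3]^5</math> directly. The sketch below is a brute-force check, not part of the original proof; the helper names <code>geometric_lines</code> and <code>twos</code> are illustrative.<br />

```python
from collections import Counter
from itertools import product

def geometric_lines(n):
    """Yield each geometric line of [3]^n once, as (endpoint, midpoint, endpoint)."""
    points = list(product((1, 2, 3), repeat=n))
    for p in points:
        for r in points:
            # Endpoints agree or are {1, 3} in every coordinate; p < r avoids repeats.
            if p < r and all(x == y or {x, y} == {1, 3} for x, y in zip(p, r)):
                yield p, tuple(x if x == y else 2 for x, y in zip(p, r)), r

def twos(point):
    """Number of coordinates equal to 2."""
    return point.count(2)

# Lines whose endpoints are B-points (one 2) and whose midpoint is a C-point (two 2s).
bcb_lines = [line for line in geometric_lines(5)
             if tuple(map(twos, line)) == (1, 2, 1)]
incidences = Counter(pt for line in bcb_lines for pt in line)
b_counts = {k for pt, k in incidences.items() if twos(pt) == 1}
c_counts = {k for pt, k in incidences.items() if twos(pt) == 2}
print(len(bcb_lines), b_counts, c_counts)
```

Summing the bound of two points per line over all these lines gives <math>4B + 2C \leq 2 \times 160</math>, which is the claimed inequality after dividing by two.<br />
<br />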
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
<br />
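The base case m=5 is small enough to confirm by exhaustive search. In this sketch (an independent check, with illustrative names) each A-point is encoded as a 5-bit integer, a set bit meaning that the coordinate equals 3, so that Hamming distance becomes the bit count of an XOR.<br />

```python
from itertools import combinations

POINTS = range(32)  # bit i set <=> coordinate i equals 3
# DIST2[p][q] is True when p and q differ in exactly two coordinates.
DIST2 = [[bin(p ^ q).count("1") == 2 for q in POINTS] for p in POINTS]

def has_distance_two_pair(subset):
    """True if some two points of the subset are at Hamming distance exactly 2."""
    return any(DIST2[p][q] for p, q in combinations(subset, 2))

def lemma4_base_case():
    """Check that every 5-element set of A-points contains such a pair."""
    return all(has_distance_two_pair(s) for s in combinations(POINTS, 5))

print(lemma4_base_case())
```

The bound is tight in the sense that four points can avoid distance two, for instance 11111, 31111, 13333, 33333 (encoded as 0, 1, 30, 31).<br />
<br />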
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
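'''Proof''': Suppose for contradiction that C=80; then D=0 and, by Lemma 3, <math>B \leq 40</math>. From Lemma 4 we also see that A cannot exceed 4 (two A-points at Hamming distance two would form a line with their midpoint, which is a C-point), so the set has at most 4+40+80=124 points, a contradiction.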
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1, which excludes 22222, the only point lying in no side slice) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
<br />
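The scores in the list above, and the restriction on which types can be paired, follow from exact rational arithmetic. The sketch below recomputes them; the names <code>WEIGHTS</code>, <code>TYPES</code> and <code>score</code> are illustrative labels for the quantities defined in the text.<br />

```python
from fractions import Fraction
from itertools import combinations_with_replacement

WEIGHTS = (1, Fraction(5, 4), Fraction(5, 3), Fraction(5, 2), 5)

def score(stats):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a slice with statistics (a,b,c,d,e)."""
    return sum(w * s for w, s in zip(WEIGHTS, stats))

TYPES = {1: (2, 16, 24, 0, 0), 2: (4, 16, 23, 0, 0),
         3: (4, 15, 24, 0, 0), 4: (3, 16, 24, 0, 0)}

# A point with j 2s (j < 5) lies in 5 - j side slices, so it contributes 5 in total.
assert all((5 - j) * WEIGHTS[j] == 5 for j in range(5))

# Unordered pairs of types whose scores add up to at least 125.
admissible = [(s, t) for s, t in combinations_with_replacement(TYPES, 2)
              if score(TYPES[s]) + score(TYPES[t]) >= 125]
print(admissible)  # [(1, 4), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]
```

Only the pair of types 1 and 4 reaches exactly 125; every other admissible pair exceeds it.<br />
<br />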
:'''Lemma 6''' There exists a cut in which the side slices have total score strictly greater than 125 (thus they involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not of Type 2, and the cut is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then in every cut the scores of the two side slices add up to exactly 125, which by the above list implies that one is Type 1 and the other is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 were in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score of the two side slices in each of those two cuts is at most 63 + 59 7/12 = 122 7/12. On the other hand, in each of the other three cuts the side slices have a total score of at most 63+63 = 126. This averages out to at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
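The short list of middle-slice statistics used above can be recovered by enumerating all pairs (a,b) that satisfy the stated constraints. This is a minimal sketch; the function name is illustrative.<br />

```python
def middle_slice_candidates():
    """Pairs (a, b) with c=d=e=0, 39 or 40 points, 4a+b <= 64 and b <= 32."""
    return [(a, b)
            for a in range(17)   # at most 16 points of [3]^4 have no 2s
            for b in range(33)   # at most 32 points of [3]^4 have one 2
            if a + b in (39, 40) and 4 * a + b <= 64]

print(middle_slice_candidates())  # [(7, 32), (8, 31), (8, 32)]
```

Since the "a" points of the five middle slices are exactly the B-points of the five-dimensional set, each counted once, the bound of 8 per slice gives the bound of 40 on B.<br />
<br />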
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 1, Corollary 3, and Lemma 7 the center slice has c=d=e=0, thus the center slice has <math>a+b \geq 39</math>. On the other hand, from the n=4 theory we have <math>4a+b \leq 64</math>; subtracting this from four times the previous inequality gives <math>3b \geq 92</math>, which forces <math>b \geq 31</math>.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus <math>C \geq 5 \times 31/2 = 77.5</math> and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
<br />
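The list above can be reproduced by filtering all (A,B,C) with A+B+C=125 through the constraints collected so far. The sketch below does this; it only encodes the bounds quoted in the comments and takes D=E=F=0 as given.<br />

```python
def candidate_statistics():
    """Triples (A, B, C) with D=E=F=0 that survive the constraints derived above."""
    found = []
    for A in range(6, 9):              # Corollary 6: 6 <= A <= 8
        for B in range(0, 41):         # Lemma 9: B <= 40
            C = 125 - A - B
            if C not in (78, 79):      # Lemma 10
                continue
            if 2 * B + C > 160:        # Lemma 3
                continue
            if A >= 7 and C != 78:     # three close pairs of A-points force C = 78
                continue
            found.append((A, B, C))
    return found

print(candidate_statistics())  # [(6, 40, 79), (7, 40, 78), (8, 39, 78)]
```
<br />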
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). By symmetry we may suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set (its midpoint xy2z2 is a C-point other than 11122), thus there are at most seven points in the set of the form xyzw2 with x,y,z,w = 1,3. Similarly there are at most 8 points of each of the forms xyz2w, xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus all the other lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The sum of all the points in each such line (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>\{1,3\}^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>\{1,3\}^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>\{1,3\}^m</math> with an odd number of 1s<br />
<br />
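The construction behind the second bullet is easy to test for small m. The sketch below (illustrative names, brute force only) builds the set of points with an odd number of 1s and measures its minimum Hamming distance; it confirms that the construction has the stated size, not that no larger set exists.<br />

```python
from itertools import combinations, product

def odd_ones_code(m):
    """All points of {1,3}^m with an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]

def min_distance(code):
    """Smallest Hamming distance between two distinct points of the code."""
    return min(sum(x != y for x, y in zip(p, q))
               for p, q in combinations(code, 2))

for m in range(2, 7):
    code = odd_ones_code(m)
    print(m, len(code), min_distance(code))
```

Each row should show <math>2^{m-1}</math> points at minimum distance 2, because changing a single coordinate flips the parity of the number of 1s.<br />
<br />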
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T04:32:45Z<p>121.220.134.232: /* n=4 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
<br />
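For concreteness, the lower bound can be tabulated for small n. This is a direct evaluation of the formula above; the function name is illustrative, and <code>round(n / 3)</code> is used for the nearest integer since n/3 is never a half-integer.<br />

```python
from math import comb

def moser_lower_bound(n):
    """binom(n+1, q) * 2^(n-q), where q is the nearest integer to n/3."""
    q = round(n / 3)
    return comb(n + 1, q) * 2 ** (n - q)

print([moser_lower_bound(n) for n in range(1, 8)])
# [2, 6, 16, 40, 120, 336, 896]
```

The values for n up to 3 match <math>c'_n</math> exactly, while for n=4 and n=5 the construction falls short of the true values 43 and 124.<br />
<br />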
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 23}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
<br />
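As a small illustration of this notation, the following sketch computes the statistics of a set and of one of its slices. Points are written as strings of digits; the helper names and the four-point example set are chosen here for illustration.<br />

```python
def statistics(points, n):
    """Return (a, b, c, ...): entry j counts the points with exactly j 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_points(points, pattern):
    """Points matching a pattern such as '1*', with the fixed coordinates dropped."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    return ["".join(p[i] for i in free) for p in points
            if all(ch in ("*", x) for ch, x in zip(pattern, p))]

A = ["11", "12", "22", "23"]
print(statistics(A, 2))                       # (1, 2, 1)
print(statistics(slice_points(A, "1*"), 1))   # (1, 1)
```
<br />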
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.). We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3, then the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
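Since <math>[3]^2</math> has only 512 subsets, the case analysis above can be confirmed by brute force. The sketch below (illustrative names) lists the Pareto-optimal statistics, which are the three extremal ones together with (3,2,0).<br />

```python
from itertools import product

POINTS = list(product((1, 2, 3), repeat=2))
# The 8 geometric lines: endpoints agree or are {1, 3} in each coordinate.
LINES = [(p, tuple(x if x == y else 2 for x, y in zip(p, r)), r)
         for p in POINTS for r in POINTS
         if p < r and all(x == y or {x, y} == {1, 3} for x, y in zip(p, r))]

def attainable_statistics():
    """Statistics (a, b, c) of every Moser set in [3]^2, by brute force."""
    found = set()
    for mask in range(1 << len(POINTS)):
        chosen = {pt for i, pt in enumerate(POINTS) if mask >> i & 1}
        if any(set(line) <= chosen for line in LINES):
            continue  # contains a geometric line
        stats = [0, 0, 0]
        for pt in chosen:
            stats[pt.count(2)] += 1
        found.add(tuple(stats))
    return found

def pareto_optimal(stats):
    """Statistics not pointwise dominated by a different attainable statistic."""
    return {s for s in stats
            if not any(t != s and all(u >= v for u, v in zip(t, s)) for t in stats)}

print(sorted(pareto_optimal(attainable_statistics())))
# [(2, 2, 1), (2, 4, 0), (3, 2, 0), (4, 0, 0)]
```
<br />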
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities c22+c2x+cxx <= 4, c22+cx2+cxx <= 4.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
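The exhaustion over <math>2^{13}</math> possibilities mentioned above is straightforward to set up. The following sketch is one possible way to run it, not the original search: it fixes 222, picks one point from each antipodal pair, and counts how many of the resulting 14-point sets are free of geometric lines. If the claim above holds, the count should be zero.<br />

```python
from itertools import product

POINTS = list(product((1, 2, 3), repeat=3))
CENTER = (2, 2, 2)
# All 49 geometric lines: endpoints agree or are {1, 3} in each coordinate.
LINES = [frozenset((p, tuple(x if x == y else 2 for x, y in zip(p, r)), r))
         for p in POINTS for r in POINTS
         if p < r and all(x == y or {x, y} == {1, 3} for x, y in zip(p, r))]
# The 13 antipodal pairs: a point and its reflection through 222.
PAIRS = [(p, tuple(4 - x for x in p)) for p in POINTS if p < CENTER]

def count_line_free_choices():
    """Count line-free sets made of 222 plus one point from each antipodal pair."""
    count = 0
    for choice in product((0, 1), repeat=len(PAIRS)):
        candidate = {CENTER} | {pair[i] for pair, i in zip(PAIRS, choice)}
        if not any(line <= candidate for line in LINES):
            count += 1
    return count

print(count_line_free_choices())
```
<br />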
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
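The passage from an inequality in (a,b,c,d) to its normalized form is a substitution of <math>a = 8\alpha_0</math>, <math>b = 12\alpha_1</math>, <math>c = 6\alpha_2</math>, <math>d = \alpha_3</math> followed by cancelling common factors. The sketch below automates this for any dimension; the function name is illustrative.<br />

```python
from functools import reduce
from math import comb, gcd

def normalize(coeffs, bound):
    """Rewrite sum(coeffs[j] * stat_j) <= bound in terms of the densities alpha_j."""
    n = len(coeffs) - 1
    # There are binom(n, j) * 2^(n-j) points of [3]^n with exactly j 2s.
    scaled = [k * comb(n, j) * 2 ** (n - j) for j, k in enumerate(coeffs)]
    g = reduce(gcd, scaled + [bound])  # cancel common factors
    return tuple(k // g for k in scaled), bound // g

print(normalize((2, 1, 2, 4), 22))  # ((8, 6, 6, 2), 11)
print(normalize((2, 1, 2), 8))      # ((4, 2, 1), 4)
```
<br />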
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of the form xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
<br />
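The claim that (2,6,6,0) is the unique maximiser of the slice score can be checked against the list of Pareto-optimal n=3 statistics given earlier; since the score is increasing in each of a, b, c, d, it is enough to look at Pareto-optimal statistics. The sketch below does this with exact fractions (names are illustrative).<br />

```python
from fractions import Fraction

# Pareto-optimal statistics (a, b, c, d) for Moser sets in [3]^3, as listed above.
PARETO_3D = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0),
    (4, 4, 5, 0), (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0),
    (5, 4, 3, 0), (4, 9, 2, 0), (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0),
    (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0), (5, 9, 0, 0), (6, 6, 0, 0),
    (7, 3, 0, 0), (8, 0, 0, 0),
]

def slice_score(stats):
    """Score a/4 + b/3 + c/2 + d of a 3D slice."""
    a, b, c, d = stats
    return Fraction(a, 4) + Fraction(b, 3) + Fraction(c, 2) + d

best = max(map(slice_score, PARETO_3D))
maximizers = [s for s in PARETO_3D if slice_score(s) == best]
print(best, maximizers)  # 11/2 [(2, 6, 6, 0)]
```

Eight side slices of score 11/2 give exactly 44 points, which is why a 44-point set would need every side slice to have statistics (2,6,6,0).<br />
<br />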
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such sets can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), d is at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), c is at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
<br />
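The line counts quoted in these arguments can be obtained by classifying every geometric line of <math>[3]^4</math> according to how many 2s each of its three points has. The following brute-force sketch does so; the helper names are illustrative.<br />

```python
from collections import Counter
from itertools import product

def geometric_lines(n):
    """Yield each geometric line of [3]^n once, as (endpoint, midpoint, endpoint)."""
    points = list(product((1, 2, 3), repeat=n))
    for p in points:
        for r in points:
            # Endpoints agree or are {1, 3} in every coordinate; p < r avoids repeats.
            if p < r and all(x == y or {x, y} == {1, 3} for x, y in zip(p, r)):
                yield p, tuple(x if x == y else 2 for x, y in zip(p, r)), r

def line_types(n):
    """Count lines of [3]^n by the number of 2s in (endpoint, midpoint, endpoint)."""
    return Counter(tuple(pt.count(2) for pt in line) for line in geometric_lines(n))

types = line_types(4)
# a-b-a lines, a-c-a lines, b-c-b lines
print(types[(0, 1, 0)], types[(0, 2, 0)], types[(1, 2, 1)])  # 32 48 48
```

Each inequality then follows by summing the bound of two points per line over all lines of the relevant type.<br />
<br />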
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, and so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x22y, 2x2y2, or 2xy22, where the x, y denote 1 or 3. This gives at least 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the middle slice *2*** has at most 43, leading to at most 41+43+40=124 points in all, a contradiction.<br />
<br />
Now suppose that c is at most 16, and that the middle slice nevertheless has at least 42 points; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122 21212 21221 23322 23232 23223. Thus the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points. If the *2*** slice has at most 42 points, then we are at 41+42+41=124 points, a contradiction, but if we have 43 or more, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
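'''Proof''': Suppose for contradiction that C=80; then D=0 and (by Lemma 3) <math>B \leq 40</math>. Since every C-point lies in the set, no two A-points can have Hamming distance two, so from Lemma 4 we also see that A cannot exceed 4. This gives at most 4+40+80=124 points, a contradiction.<br />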
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must both come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
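<br />
The scores in the list, and the claim about which types can be paired, can be reproduced with exact rational arithmetic. The following is a small sketch under our own naming (<code>score</code>, <code>TYPES</code>, <code>admissible_pairs</code> are not from the original computations).<br />

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def score(stats):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a 4D slice with statistics (a,b,c,d,e)."""
    a, b, c, d, e = stats
    return a + Fraction(5, 4) * b + Fraction(5, 3) * c + Fraction(5, 2) * d + 5 * e


# The four exceptional slice types, numbered as in the list above.
TYPES = {
    1: (2, 16, 24, 0, 0),
    2: (4, 16, 23, 0, 0),
    3: (4, 15, 24, 0, 0),
    4: (3, 16, 24, 0, 0),
}

# Unordered pairs of types whose scores add up to at least 125.
admissible_pairs = [
    (i, j)
    for i, j in combinations_with_replacement(TYPES, 2)
    if score(TYPES[i]) + score(TYPES[j]) >= 125
]

for label, stats in TYPES.items():
    print(label, stats, score(stats))
print(admissible_pairs)
```

Type 1 appears in the resulting list only together with type 4, and type 2 only together with types 3 and 4, in agreement with the statement above.<br />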
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then by Lemma 5 every cut has side slices whose scores add up to exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score of the two side slices in each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, in each of the other three coordinates the two side slices have a total score of at most 63+63 = 126. Averaged over the five coordinates, this gives at most 124.633... < 125, contradicting Lemma 5. <math>\Box</math><br />
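<br />
The averaging step can be written out exactly. In this sketch the variable names are ours, and the two input scores (63 and 59 7/12) are taken from the text rather than recomputed from the n=4 data.<br />

```python
from fractions import Fraction

best_score = Fraction(63)                    # largest score of any side slice
d1_score = Fraction(59) + Fraction(7, 12)    # largest score of a side slice with d=1

# Two coordinates have one side slice with d=1; the other three are unconstrained.
cut_totals = [best_score + d1_score] * 2 + [best_score + best_score] * 3
average = sum(cut_totals) / 5

print(average, float(average))
```

Since Lemma 5 requires the five cut totals to add up to 625, an average below 125 is impossible.<br />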
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has a+b >= 39. On the other hand, from the n=4 theory we have 4a+b <= 64, which forces b >= 31.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus C >= 5*31/2 = 77.5 and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following tuples:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
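<br />
The case list can be recovered mechanically from the constraints gathered so far. The sketch below (function name ours) simply enumerates the values allowed by Corollary 6, Lemma 9, Lemma 10, the vanishing of D, E, F, and the observation that <math>A \geq 7</math> forces C=78.<br />

```python
def candidate_statistics(total=125):
    """Enumerate (A,B,C,D,E,F) compatible with the lemmas for a 125-point set."""
    candidates = []
    for A in range(6, 9):            # Corollary 6: 6 <= A <= 8
        for C in (78, 79):           # Lemma 10: C is 78 or 79
            B = total - A - C        # D = E = F = 0 (Lemma 7, Corollary 3, Lemma 1)
            if not 0 <= B <= 40:     # Lemma 9: B <= 40
                continue
            if A >= 7 and C != 78:   # three pairs at distance 2 remove two C-points
                continue
            candidates.append((A, B, C, 0, 0, 0))
    return candidates


print(candidate_statistics())
```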
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Thus each of the other 158 lines must have exactly two points. <br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The total number of points of the set on these lines (counting multiplicity) is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Thus each of the other 156 lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
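<br />
The second claim is easy to confirm for small m. The following sketch (helper names ours) builds the set of points of <math>[1,3]^m</math> with an odd number of 1s and computes its size and minimum Hamming distance; it only checks that this particular set is a valid A(m,2) of the stated size.<br />

```python
from itertools import combinations, product


def hamming(p, q):
    """Number of coordinates in which p and q differ."""
    return sum(x != y for x, y in zip(p, q))


def odd_ones_code(m):
    """All points of {1,3}^m with an odd number of 1s."""
    return [p for p in product((1, 3), repeat=m) if p.count(1) % 2 == 1]


def min_distance(code):
    """Smallest Hamming distance between two distinct points (needs >= 2 points)."""
    return min(hamming(p, q) for p, q in combinations(code, 2))


for m in range(2, 7):
    code = odd_ones_code(m)
    print(m, len(code), min_distance(code))
```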
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5), if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T04:08:36Z<p>121.220.134.232: /* n=2 */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
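<br />
The construction behind this bound can be tested directly in small dimensions. The sketch below uses our own helper names (<code>geometric_lines</code>, <code>construction</code>, <code>is_moser</code>); it builds the set, compares its size with the binomial formula, and checks by brute force that it contains no geometric line.<br />

```python
from itertools import product
from math import comb


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a triple of points."""
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        wild = [s for s in template if s in ("x", "y")]
        # A line needs a wildcard; insisting that the first wildcard is "x"
        # lists each line once rather than once per direction.
        if not wild or wild[0] != "x":
            continue
        yield tuple(
            tuple(t if s == "x" else (4 - t if s == "y" else s) for s in template)
            for t in (1, 2, 3)
        )


def construction(n, q):
    """Strings with q 2s, plus strings with q-1 2s and an odd number of 1s."""
    points = set()
    for p in product((1, 2, 3), repeat=n):
        twos = p.count(2)
        if twos == q or (twos == q - 1 and p.count(1) % 2 == 1):
            points.add(p)
    return points


def is_moser(points, n):
    """True if the set contains no geometric line of [3]^n."""
    return not any(all(p in points for p in line) for line in geometric_lines(n))


example = construction(5, 2)
print(len(example), comb(6, 2) * 2 ** 3, is_moser(example, 5))
```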
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
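<br />
As an illustration of this notation, here is a minimal sketch of how the statistics of a set and of a slice might be computed. The function names and the string encoding of points (e.g. "12") and of slices (e.g. "1*") are our own conventions for this example.<br />

```python
def statistics(points, n):
    """Return (a, b, c, ...): the number of points with 0, 1, 2, ... 2s."""
    counts = [0] * (n + 1)
    for p in points:
        counts[p.count("2")] += 1
    return tuple(counts)


def slice_statistics(points, pattern):
    """Statistics of the slice given by pattern, e.g. '1*', dropping fixed coordinates."""
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    in_slice = [
        "".join(p[i] for i in free)
        for p in points
        if all(ch in ("*", p[i]) for i, ch in enumerate(pattern))
    ]
    return statistics(in_slice, len(free))


A = {"11", "12", "22", "32"}
print(statistics(A, 2))          # the (a, b, c) of the example above
print(slice_statistics(A, "1*"))  # the (a(1*), b(1*)) of the example above
```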
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.) We say that it is ''extremal'' if it is not a convex combination of any other attainable statistic (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3; then, for n=4, the 81 points get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
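<br />
For n=2 the whole analysis is small enough to redo by exhaustive search over the 512 subsets of <math>[3]^2</math>. The following sketch (helper names ours) collects all attainable statistics and extracts the Pareto-optimal ones; it is a brute-force illustration and would not scale beyond very small n.<br />

```python
from itertools import product


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a triple of points."""
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        wild = [s for s in template if s in ("x", "y")]
        # A line needs a wildcard; insisting that the first wildcard is "x"
        # lists each line once rather than once per direction.
        if not wild or wild[0] != "x":
            continue
        yield tuple(
            tuple(t if s == "x" else (4 - t if s == "y" else s) for s in template)
            for t in (1, 2, 3)
        )


def attainable_statistics(n):
    """Statistics of all Moser sets in [3]^n, by brute force over all subsets."""
    points = list(product((1, 2, 3), repeat=n))
    lines = [frozenset(line) for line in geometric_lines(n)]
    found = set()
    for mask in range(1 << len(points)):
        subset = {p for i, p in enumerate(points) if mask >> i & 1}
        if any(line <= subset for line in lines):
            continue  # contains a geometric line, so not a Moser set
        stats = [0] * (n + 1)
        for p in subset:
            stats[p.count(2)] += 1
        found.add(tuple(stats))
    return found


def pareto_optimal(stats):
    """Statistics not pointwise dominated by a different attainable statistic."""
    return {
        s for s in stats
        if not any(t != s and all(x >= y for x, y in zip(t, s)) for t in stats)
    }


attainable = attainable_statistics(2)
print(sorted(pareto_optimal(attainable), reverse=True))
print(max(sum(s) for s in attainable))
```

The search returns the three extremal statistics together with (3,2,0), and every attainable statistic obeys the four inequalities listed above.<br />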
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
The Pareto optimizers for c-statistics are (c22,c2x,cx2,cxx) = (1112),(0222),(0004), which are covered by the linear inequalities c22+c2x+cxx <= 4, c22+cx2+cxx <= 4.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
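<br />
The exhaustion over the <math>2^{13}</math> possibilities can be sketched as follows. This is our own reconstruction of the search described above, not the original program: a 14-point set containing 222 must pick exactly one point from each of the 13 antipodal pairs, and we count how many such choices avoid every geometric line.<br />

```python
from itertools import product


def geometric_lines(n):
    """Yield every geometric line of [3]^n once, as a triple of points."""
    for template in product((1, 2, 3, "x", "y"), repeat=n):
        wild = [s for s in template if s in ("x", "y")]
        # A line needs a wildcard; insisting that the first wildcard is "x"
        # lists each line once rather than once per direction.
        if not wild or wild[0] != "x":
            continue
        yield tuple(
            tuple(t if s == "x" else (4 - t if s == "y" else s) for s in template)
            for t in (1, 2, 3)
        )


def count_14_point_sets():
    """Number of line-free choices of one point from each antipodal pair around 222."""
    centre = (2, 2, 2)
    pairs = []
    for p in product((1, 2, 3), repeat=3):
        antipode = tuple(4 - x for x in p)
        if p < antipode:              # keeps one representative per pair, skips 222
            pairs.append((p, antipode))
    # Lines through 222 are exactly "pair + centre"; picking one point per pair
    # never completes them, so only the remaining lines need to be tested.
    lines = [line for line in geometric_lines(3) if centre not in line]
    count = 0
    for choice in product((0, 1), repeat=len(pairs)):
        chosen = {pair[c] for pair, c in zip(pairs, choice)}
        if not any(all(p in chosen for p in line) for line in lines):
            count += 1
    return count


print(count_14_point_sets())
```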
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are :<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
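<br />
As a consistency check between the two lists, one can verify that each Pareto-optimal statistic satisfies every linear bound and that each bound is attained with equality somewhere. The sketch below merely transcribes the data from this section (the names <code>PARETO</code> and <code>BOUNDS</code> are ours); it does not prove that the list of bounds is complete.<br />

```python
# Pareto-optimal statistics (a, b, c, d) for n=3, as listed above.
PARETO = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0),
    (4, 4, 5, 0), (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0),
    (5, 4, 3, 0), (4, 9, 2, 0), (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0),
    (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0), (5, 9, 0, 0), (6, 6, 0, 0),
    (7, 3, 0, 0), (8, 0, 0, 0),
]

# Linear bounds, stored as (coefficients of a, b, c, d; right-hand side).
BOUNDS = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56),
    ((6, 2, 3, 6), 48), ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14),
    ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8), ((0, 1, 0, 6), 12),
    ((0, 0, 1, 3), 6), ((0, 0, 0, 1), 1),
]


def value(coeffs, stats):
    """Left-hand side of a linear bound evaluated at a statistic."""
    return sum(k * s for k, s in zip(coeffs, stats))


violations = [(s, co, m) for s in PARETO for co, m in BOUNDS if value(co, s) > m]
tight = [any(value(co, s) == m for s in PARETO) for co, m in BOUNDS]
print(violations, all(tight))
```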
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that the four "a" points cannot be separated by Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of each of the forms xy2z, x2yz, 2xyz we see that b is at most 15, a contradiction. <math>\Box</math><br />
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), have d at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions])<br />
* c is at least 6 (and, except for [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), have c at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, any middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, thus two of the 40 antipodal pairs must be completely missing. As a consequence, any Moser set containing 2222 can have at most 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
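<br />
The line counts used in these double-counting arguments can be recomputed by enumeration. The following is a minimal sketch, not part of the original argument; the helper names <code>geometric_lines</code> and <code>line_type</code> are illustrative, and a line is classified by the number of 2s in its endpoint, midpoint, and endpoint.<br />
<br />
```python
from itertools import product
from collections import Counter

def geometric_lines(n):
    """Yield each geometric line of [3]^n once, as a tuple of three strings.

    A template uses 'x' for a rising wildcard (1,2,3) and 'y' for a falling
    one (3,2,1). Swapping x and y reverses the line, so only templates whose
    first wildcard is 'x' are kept.
    """
    for t in product("123xy", repeat=n):
        wild = [ch for ch in t if ch in "xy"]
        if not wild or wild[0] != "x":
            continue
        yield tuple(
            "".join(ch if ch in "123" else ("123"[i] if ch == "x" else "321"[i])
                    for ch in t)
            for i in range(3)
        )

def line_type(line):
    """Number of 2s in (endpoint, midpoint, endpoint); (0, 1, 0) means a-b-a."""
    return tuple(p.count("2") for p in line)

counts = Counter(line_type(line) for line in geometric_lines(4))
# a-b-a lines, a-c-a lines, b-c-b lines in [3]^4
print(counts[(0, 1, 0)], counts[(0, 2, 0)], counts[(1, 2, 1)])
```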
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math><br />
in three dimensions implies the inequality<br />
:<math> 4 \alpha a + 3 \beta b + 2 \gamma c + \delta d \leq 8M</math><br />
in four dimensions. Thus we have<br />
<br />
* <math>8a+3b+4c+4d \leq 176</math><br />
* <math>2a+b+c+d \leq 48</math><br />
* <math>14a+3b+4c+4d \leq 224</math><br />
* <math>4a+b+c+d \leq 64</math> <br />
* <math>3b+2c+3d \leq 96</math><br />
* <math>a+c+d \leq 28</math><br />
* <math>5a+2c+2d \leq 80</math><br />
* <math>a+d \leq 16</math><br />
* <math>b+2d \leq 32</math><br />
* <math>2c+3d \leq 48</math><br />
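<br />
The lifting rule can be applied mechanically. Here is a small sketch (the function name <code>lift_to_4d</code> is illustrative, and dividing through by the greatest common divisor is a presentational assumption) that takes a three-dimensional inequality and returns its four-dimensional consequence.<br />
<br />
```python
from functools import reduce
from math import gcd

def lift_to_4d(coeffs, bound):
    """Lift alpha*a + beta*b + gamma*c + delta*d <= M from n=3 to n=4.

    A 4D point with j 2s lies in 4-j of the eight side slices, so summing the
    3D inequality over the side slices multiplies the coefficients by
    (4, 3, 2, 1) and the bound by 8. The result is divided by its gcd.
    """
    lifted = [w * k for w, k in zip((4, 3, 2, 1), coeffs)] + [8 * bound]
    g = reduce(gcd, lifted)
    lifted = [v // g for v in lifted]
    return tuple(lifted[:4]), lifted[4]

# 2a+b+2c+4d <= 22 in three dimensions
print(lift_to_4d((2, 1, 2, 4), 22))   # ((8, 3, 4, 4), 176)
```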
<br />
Cubes also sit diagonally in the 4-dimensional cube. They may have coordinates xxyz, where xx runs over (11,22,33) or (13,22,31), while y and z run over (1,2,3). These cubes have a different distribution of 2s than the ordinary slices: (a,b,c,d,e) = (8,8,6,4,1) instead of (8,12,6,1,0) for the side slices and (0,8,12,6,1) for the middle slices. So a different set of inequalities arises. It is possible to run through the same procedure as described above for n=3, and the inequalities that arise are <math>2a+b+2c+2d+4e \le 24</math> and <math>4a+b+2c+2d+4e \le 32</math> within the 3D cube, which when averaged become <math>4a+b+2c+4d+16e \le 96</math> and <math>8a+b+2c+4d+16e \le 128</math> for the 4D cube. A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here] on sheet 2. Other sheets of this spreadsheet contain results for a 3D cube sitting diagonally in a 5D, 6D or 7D cube. Notice the similarity of the equations that arise for the xxxyz, xxyyz and xxxxyz diagonals. Also notice that the xxxxyyz diagonals have the same Pareto sets and linear inequalities as the cube's c-statistics.<br />
<br />
== n=5 ==<br />
<br />
Let (A,B,C,D,E,F) be the statistics of a five-dimensional Moser set, thus (A,B,C,D,E,F) varies between (0,0,0,0,0,0) and (32,80,80,40,10,1).<br />
<br />
There are several Moser sets with the statistics (4,40,80,0,0,0), which thus have 124 points. Indeed, one can take<br />
<br />
* all points with two 2s;<br />
* all points with one 2 and an even number of 1s; and<br />
* (13111),(13113),(31311),(13333). Any two of these four points differ in three places, except for one pair of points that differ in one place.<br />
<br />
For the rest of this section, we assume that the Moser set <math>{\mathcal A}</math> is a 125-point set.<br />
<br />
:'''Lemma 1''': F=0.<br />
<br />
'''Proof''' If F is non-zero, then the Moser set contains 22222, so each of the 121 antipodal pairs can have at most one point in the set, leading to at most 122 points. <math>\Box</math><br />
<br />
:'''Lemma 2''': Every middle slice of <math>{\mathcal A}</math> has at most 41 points.<br />
<br />
'''Proof''' Without loss of generality we may consider the 2**** slice. There are two cases, depending on the value of c(2****).<br />
<br />
Suppose first that c is at least 17; thus there are at least 17 points of the form 222xy, 22x2y, 22xy2, 2x22y, 2x2y2, or 2xy22, where the x, y denote 1 or 3. This gives at least 34 "xy" wildcards in all in four coordinate slots; by the pigeonhole principle one of the slots sees at least 9 of the wildcards. By symmetry, we may assume that the second coordinate slot sees at least 9 of these wildcards, thus there are at least 9 points of the form 2x22y, 2x2y2, 2xy22. The x=1, x=3 cases can each absorb at most six of these, thus each of these cases must absorb at least three points, with at least one absorbing at least five. Let's say that it's the x=3 case that absorbs 5; thus <math>d(*1***) \geq 3</math> and <math>d(*3***) \geq 5</math>. From the n=4 theory this means that the *1*** slice has at most 41 points, and the *3*** slice has at most 40. Meanwhile, the *2*** slice has at most 43, leading to at most 41+43+40=124 points in all, a contradiction.<br />
<br />
Now suppose c is at most 16. We may assume that the slice has at least 42 points, since otherwise there is nothing to prove; then by the n=4 theory the middle slice is one of the eight (6,24,12,0,0) sets. Without loss of generality we may take it to be <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math>; in particular, the middle slice contains the points 21122, 21212, 21221, 23322, 23232, 23223. Hence the *1*** and *3*** slices have a "d" value of at least three, and so have at most 41 points each. If the *2*** slice has at most 42 points, then we have at most 41+42+41=124 points, a contradiction; but if it has 43 points, then we are back in the first case (as <math>c(*2***) \geq 17</math>) after permuting the indices.<br />
<math>\Box</math><br />
<br />
Since 125=41+41+43, we thus have<br />
<br />
:'''Corollary 1''': Every side slice of <math>{\mathcal A}</math> has at least 41 points. If one side slice does have 41 points, then the other has 43.<br />
<br />
Combining this with the n=4 statistics, we conclude<br />
<br />
:'''Corollary 2''': Every side slice of <math>{\mathcal A}</math> has e=0, and <math>d \leq 3</math>. Given two opposite side slices, e.g. 1**** and 3****, we have <math>d(1****)+d(3****) \leq 4</math>.<br />
<br />
:'''Corollary 3''': E=0.<br />
<br />
:'''Corollary 4''': Any middle slice has a "c" value of at most 8.<br />
<br />
'''Proof''' Let's work with the 2**** slice. By Corollary 2, there are at most four contributions to c(2****) of the form 21*** or 23***, and similarly for the other three positions. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 3''': <math>2B+C \leq 160</math>. <br />
<br />
'''Proof''': There are 160 lines connecting one "C" point to two "B" points (e.g. 11112, 11122, 11132); each "C" point lies in two of these, and each "B" point lies on four. A Moser set can have at most two points out of each of these lines. Double counting then gives the claim. <math>\Box</math><br />
<br />
:'''Lemma 4''' Given m A-points with <math>m \geq 5</math>, there exist at least m-4 pairs (a,b) of such A-points with Hamming distance exactly two (i.e. b differs from a in exactly two places).<br />
<br />
'''Proof''' It suffices to check this for m=5, since the case of larger m then follows by locating a pair, removing a point that contributes to that pair, and using the induction hypothesis. Given 5 A-points, we may assume by the pigeonhole principle and symmetry that at least three of them have an odd number of 1s. Suppose 11111 is one of the points, and that no pair has Hamming distance 2. All points with two 3s are excluded, so the only points allowed with an odd number of 1s are those with four 3s. But all those points differ from each other in two positions, so at most one of them is allowed. <math>\Box</math><br />
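<br />
The base case m=5 is small enough to confirm by brute force. The sketch below is an independent check rather than the proof above; it assumes an encoding (not used in the text) of each A-point as a 5-bit integer, with bit i set when coordinate i equals 3.<br />
<br />
```python
from itertools import combinations

POINTS = range(32)   # all strings over {1,3} of length 5, as 5-bit integers

def hamming(p, q):
    """Number of coordinates in which the two encoded points differ."""
    return bin(p ^ q).count("1")

# Precompute which pairs are at Hamming distance exactly two.
DIST_TWO = {(p, q) for p in POINTS for q in POINTS if hamming(p, q) == 2}

def has_distance_two_pair(points):
    return any((p, q) in DIST_TWO for p, q in combinations(points, 2))

# Check all 201376 five-element sets of A-points.
every_5_set_has_pair = all(
    has_distance_two_pair(s) for s in combinations(POINTS, 5)
)
print(every_5_set_has_pair)
```
<br />
Four points can avoid distance two (for instance 11111, 13333, 31111, 33333), so the threshold m=5 cannot be lowered.<br />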
<br />
:'''Corollary 5''' <math>C \leq 79</math>.<br />
<br />
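'''Proof''': Suppose for contradiction that C=80; then D=0, and <math>B \leq 40</math> by Lemma 3. Since every C-point lies in the set, no two A-points can be at Hamming distance two, so from Lemma 4 we also see that A cannot exceed 4. As E=F=0, this gives at most 4+40+80=124 points, a contradiction.<br />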
<math>\Box</math><br />
<br />
Define the '''score''' of a Moser set in <math>[3]^4</math> to be the quantity <math>a + 5b/4 + 5c/3 + 5d/2 + 5e</math>. Double-counting (and Lemma 1) gives<br />
<br />
:'''Lemma 5''' The total score of all the ten side-slices of <math>{\mathcal A}</math> is equal to <math>5|{\mathcal A}| = 5 \times 125</math>. In particular, there exists a pair of opposite side-slices whose scores add up to at least 125.<br />
<br />
By Lemma 5 and symmetry, we may assume that the 1**** and 3**** slices have scores adding up to at least 125. By Lemma 2, the 2**** slice has at most 41 points, which implies that the 1**** and 3**** slices each have 41, 42, or 43 points. From the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] we know that all 41-point, 42-point, and 43-point slices have score less than 62, with the following exceptions:<br />
<br />
# (2,16,24,0,0) [42 points, score: 62]<br />
# (4,16,23,0,0) [43 points, score: 62 1/3]<br />
# (4,15,24,0,0) [43 points, score: 62 3/4]<br />
# (3,16,24,0,0) [43 points, score: 63]<br />
<br />
Thus the 1**** and 3**** slices must come from the above list. Furthermore, if one of the slices is of type 1, then the other must be of type 4, and if one slice is of type 2, then the other must be of type 3 or 4.<br />
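<br />
The scores in the list, and the pairs of types whose scores reach 125, can be recomputed with exact fractions. This is a minimal sketch; the dictionary <code>TYPES</code> simply copies the four exceptional statistics listed above.<br />
<br />
```python
from fractions import Fraction as F

def score(stats):
    """Score a + 5b/4 + 5c/3 + 5d/2 + 5e of a four-dimensional slice."""
    a, b, c, d, e = stats
    return a + F(5, 4) * b + F(5, 3) * c + F(5, 2) * d + 5 * e

TYPES = {
    1: (2, 16, 24, 0, 0),
    2: (4, 16, 23, 0, 0),
    3: (4, 15, 24, 0, 0),
    4: (3, 16, 24, 0, 0),
}

# Unordered pairs of types whose scores add up to at least 125.
admissible = [
    (i, j)
    for i in TYPES for j in TYPES
    if i <= j and score(TYPES[i]) + score(TYPES[j]) >= 125
]
for i, stats in TYPES.items():
    print(i, stats, score(stats))
print(admissible)
```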
<br />
:'''Lemma 6''' There exists one cut in which the side slices have total score strictly greater than 125 (i.e. they thus involve only Type 2, Type 3, and Type 4 slices, with at least one side slice not equal to Type 2, and the cut here is of the form 43+39+43).<br />
<br />
'''Proof''' If not, then by Lemma 5 every cut has side slices whose scores total exactly 125, which by the above table implies that one is Type 1 and one is Type 4; in particular all side slices have c=24. But this forces C=80, contradicting Corollary 5. <math>\Box</math><br />
<br />
Adding up the corners we conclude<br />
<br />
:'''Corollary 6''' <math>6 \leq A \leq 8</math>.<br />
<br />
:'''Lemma 7''' <math>D=0</math>.<br />
<br />
'''Proof''' Firstly, from Lemma 6 we have a cut in which the two side slices are omitting at most one C-point between them (and have no D-point), which forces the middle slice to have at most one D-point; thus D is at most 1.<br />
<br />
Now suppose instead that D=1 (e.g. if 11222 was in the set); then there would be two choices of coordinates in which one of the side slices would have d=1 (e.g. 1**** and *1***). But the [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en n=4 statistics] show that such slices have a score of at most 59 7/12, so that the total score from each of those two coordinates is at most 63 + 59 7/12 = 122 7/12. On the other hand, each of the other three coordinates has a net score of at most 63+63 = 126. This averages out to at most 124.633... < 125, a contradiction. <math>\Box</math><br />
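<br />
Written out, the average over the five coordinate directions in this proof (two directions bounded by 63 + 59 7/12, with 59 7/12 = 715/12, and three bounded by 126) is<br />
:<math>\frac{2\left(63 + \frac{715}{12}\right) + 3 \cdot 126}{5} = \frac{3739}{30} \approx 124.633 < 125,</math><br />
which is incompatible with the average of exactly 125 required by Lemma 5.<br />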
<br />
:'''Corollary 7''' Every middle slice has at most 40 points.<br />
<br />
'''Proof''' From Lemma 7 we see that the middle slice must have c=0, but from the n=4 statistics this is not possible for any slice of size 41 or higher (alternatively, one can use the inequalities <math>4a+b \leq 64</math> and <math>b \leq 32</math>). <math>\Box</math><br />
<br />
Each middle slice now has 39 or 40 points, with c=d=e=0, and so from the inequalities <math>4a+b \leq 64</math>, <math>b \leq 32</math> must have statistics (8,32,0,0,0),(7,32,0,0,0), or (8,31,0,0,0). In particular the "a" index of the middle slices is at most 8. Summing over all middle slices we conclude<br />
<br />
:'''Lemma 9''' B is at most 40.<br />
<br />
:'''Lemma 10''' C is equal to 78 or 79.<br />
<br />
'''Proof''' By Corollary 5, it suffices to show that <math>C \geq 78</math>.<br />
<br />
As 125=43+39+43, we see that every center slice must have at least 39 points. By Lemma 2 and Lemma 7 the center slice has c=d=e=0, thus the center slice has <math>a+b \geq 39</math>. On the other hand, from the n=4 theory we have <math>4a+b \leq 64</math>, which forces <math>3b \geq 4 \cdot 39 - 64 = 92</math> and hence <math>b \geq 31</math>.<br />
<br />
By double counting, we see that 2C is equal to the sum of the b's of all the five center slices. Thus <math>C \geq 5 \cdot 31/2 = 77.5</math> and the claim follows. <math>\Box</math><br />
<br />
From Lemmas 9 and 10 we see that <math>B+C \leq 119</math>, and thus <math>A \geq 6</math>. Also, if <math>A \geq 7</math>, then by Lemma 4 we have at least three pairs of A points with Hamming distance 2. At most two of these pairs eliminate the same C point, so we would have C=78 in that case. <br />
<br />
Putting all the above facts together, we see that (A,B,C,D,E,F) must be one of the following:<br />
<br />
* (6,40,79,0,0,0)<br />
* (7,40,78,0,0,0)<br />
* (8,39,78,0,0,0)<br />
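<br />
This list can be reproduced by enumerating the constraints collected so far. The sketch below encodes only the facts already stated (Corollary 6, Lemma 9, Lemma 10, D=E=F=0, the total of 125 points, and the observation that <math>A \geq 7</math> forces C=78); it is a bookkeeping aid, not a new argument.<br />
<br />
```python
candidates = [
    (A, B, C, 0, 0, 0)
    for A in range(6, 9)            # Corollary 6: 6 <= A <= 8
    for B in range(0, 41)           # Lemma 9: B <= 40
    for C in (78, 79)               # Lemma 10: C is 78 or 79
    if A + B + C == 125             # the set has 125 points and D = E = F = 0
    and (A < 7 or C == 78)          # A >= 7 forces C = 78
]
print(candidates)
# [(6, 40, 79, 0, 0, 0), (7, 40, 78, 0, 0, 0), (8, 39, 78, 0, 0, 0)]
```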
<br />
All three cases can be eliminated, giving <math>c'_5=124</math>.<br />
<br />
=== Elimination of (6,40,79,0,0,0) ===<br />
<br />
Look at the 6 A points. From Lemma 4 we have at least two pairs (a,b), (c,d) of A-points that have Hamming separation 2.<br />
<br />
Now look at the midpoints of these two pairs; these midpoints are C-points that cannot lie in the set. But we have exactly one C-point missing from the set, thus the midpoints must be the same. By symmetry, we may thus assume that the two pairs are (11111,11133) and (11113,11131). Thus 11111, 11133, 11113, 11131 are in the set, and so every C-point other than 11122 is in the set. On the other hand, the B-points 11121, 11123, 11112, 11132 lie outside the set.<br />
<br />
At most one of 11312, 11332 lies in the set (since 11322 lies in the set). Suppose that 11312 lies outside the set; then we have a pair (xy1z2, xy3z2) with x,y,z = 1,3 that is totally omitted from the set, namely (11112,11312). On the other hand, every other pair of this form can have at most one point in the set, thus there are at most seven points in the set of the form (xyzw2) with x,y,z,w = 1,3. Similarly there are at most 8 points of the form xyz2w, or of xy2zw, x2yzw, 2xyzw, leading to at most 39 B-points in all, a contradiction.<br />
<br />
=== Elimination of (7,40,78,0,0,0) ===<br />
<br />
By Lemma 4, we have at least three pairs of A-points of distance two apart that lie in the set.<br />
The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs must have the same midpoint, so we may assume as before that 111xy lies in the set for x,y=1,3, which implies that 1112* and 111*2 lie outside the set.<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The number of points of the set on these lines, summed over all the lines (so that points are counted with multiplicity), is 4B+2C = 316. On the other hand, every one of the 160 lines can have at most two points, and two of these lines (namely 1112*, 111*2) have no points. Since <math>316 = 2 \times 158</math>, all the other lines must have exactly two points.<br />
<br />
We know that the C-point 11122 is missing from the set; there is one other missing C-point. Since 1112x, 111x2 lie outside the set, we conclude from the previous paragraph that 1132x, 113x2 and 1312x, 131x2 lie in the set. Taking midpoints we conclude that 11322 and 13122 lie outside the set. But this is now three C-points missing (together with 11122), a contradiction.<br />
<br />
=== Elimination of (8,39,78,0,0,0) ===<br />
<br />
By Lemma 4 we have at least four pairs of A-points of distance two apart that lie in the set. The midpoints of these pairs are C-points that do not lie in the set; but there are only two such C-points, thus two pairs (a,b), (c,d) must have the same midpoint p, and the other two pairs (a',b'), (c',d') must also have the same midpoint p'. (Note that every C-point is the midpoint of at most two such pairs.)<br />
<br />
Now consider the 160 lines between 2 B points and one C point (cf. Lemma 3). The number of points of the set on these lines, summed over all the lines (so that points are counted with multiplicity), is 4B+2C = 312. Every one of the 160 lines can have at most two points, and four of these (those in the plane of (a,b,c,d) or of (a',b',c',d')) have no points. Since <math>312 = 2 \times 156</math>, all other lines must have exactly two points.<br />
<br />
Without loss of generality we have (a,b)=(11111,11133), (c,d) = (11113,11131), thus p = 11122. By permuting the first three indices, we may assume that p' is not of the form x2y2z, x2yz2, xy22z, xy2z2. Then 1112x lies outside the set and 1122x lies in the set, so by the above paragraph 1132x lies in the set; similarly for 113x2, 1312x, 131x2. This implies that 13122, 11322 lie outside the set, but this (together with 11122) shows that at least three C-points are missing, a contradiction.<br />
<br />
== General n ==<br />
<br />
General solution for <math>c'_N</math><br />
<br />
* q 2s, all points from A(N-q,1)<br />
* q-1 2s, points from A(N-q+1,2)<br />
* q-2 2s, points from A(N-q+2,3)<br />
* etc.<br />
<br />
where A(m,d) is a subset of <math>[1,3]^m</math> for which any two points differ from each other in at least d places.<br />
<br />
Mathworld’s entry on error-correcting codes suggests it might be NP-complete to find the size of A(m,d) in general.<br />
<br />
* <math>|A(m,1)| = 2^m</math> because it includes all points in <math>[1,3]^m</math><br />
* <math>|A(m,2)| = 2^{m-1}</math> because it can include all points in <math>[1,3]^m</math> with an odd number of 1s<br />
<br />
The integer programming routine from Maple 12 was used to obtain upper bounds for <math>c'_6</math> and <math>c'_7</math>. A large number of linear inequalities, described above in sections (n=3) and (n=4), were combined. The details are in [[Maple calculations]]. The results were that <math>c'_6 \le 361</math> and <math>c'_7 \le 1071</math>.<br />
<br />
A [[genetic algorithm]] has provided the following examples:<br />
<br />
* <math>c'_6 \geq 353</math> (26 examples; [http://twofoldgaze.wordpress.com/2009/03/10/353-element-solution/ here is one])<br />
* <math>c'_7 \geq 988</math> [http://twofoldgaze.wordpress.com/2009/03/10/978-element-solution/ Here is the example]<br />
<br />
== Larger sides (k>3) ==<br />
<br />
The following set gives a lower bound for Moser’s cube <math>[4]^n</math> (values 1,2,3,4): Pick all points where q entries are 2 or 3; and also pick those where q-1 entries are 2 or 3 and an odd number of entries are 1. This is maximized when q is near n/2, giving a lower bound of<br />
<br />
<math>\binom{n}{n/2} 2^n + \binom{n}{n/2-1} 2^{n-1}</math><br />
<br />
which is comparable to <math>4^n/\sqrt{n}</math> by [[Stirling's formula]].<br />
<br />
For k=5 (values 1,2,3,4,5): if A, B, C, D, and E denote the numbers of 1s, 2s, 3s, 4s and 5s in a point, then the first three points of a geometric line form a 3-term arithmetic progression in A+E+2(B+D)+3C. So, for k=5 we have a similar lower bound for Moser’s problem as for DHJ with k=3, i.e. <math>5^{n - O(\sqrt{\log n})}</math>.<br />
<br />
The k=6 version of Moser implies DHJ(3). Indeed, any k=3 combinatorial line-free set can be "doubled up" into a k=6 geometric line-free set of the same density by pulling back the set from the map <math>\phi: [6]^n \to [3]^n</math> that maps 1, 2, 3, 4, 5, 6 to 1, 2, 3, 3, 2, 1 respectively; note that this map sends k=6 geometric lines to k=3 combinatorial lines.</div>121.220.134.232http://michaelnielsen.org/polymath1/index.php?title=Moser%27s_cube_problemMoser's cube problem2009-03-21T04:03:45Z<p>121.220.134.232: /* Larger sides */</p>
<hr />
<div>Define a ''Moser set'' to be a subset of <math>[3]^n</math> which does not contain any [[geometric line]], and let <math>c'_n</math> denote the size of the largest Moser set in <math>[3]^n</math>. The first few values are (see [http://www.research.att.com/~njas/sequences/A003142 OEIS A003142]):<br />
<br />
: <math>c'_0 = 1; c'_1 = 2; c'_2 = 6; c'_3 = 16; c'_4 = 43; c'_5 = 124.</math><br />
<br />
Beyond this point, we only have some upper and lower bounds, in particular <math>353 \leq c'_6 \leq 361</math>; see [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DsU-uZ1tK7VEg this spreadsheet] for the latest bounds.<br />
<br />
The best known asymptotic lower bound for <math>c'_n</math> is<br />
<br />
: <math>c'_n \gg 3^n/\sqrt{n}</math>,<br />
<br />
formed by fixing the number of 2s to a single value near n/3. Compare this to the DHJ(2) or Sperner limit of <math>2^n/\sqrt{n}</math>. Is it possible to do any better? Note that we have a [[Upper_and_lower_bounds#Asymptotics|significantly better bound]] for the DHJ(3) <math>c_n</math>:<br />
<br />
: <math>c_n \geq 3^{n-O(\sqrt{\log n})}</math>.<br />
<br />
A more precise lower bound is<br />
<br />
: <math>c'_n \geq \binom{n+1}{q} 2^{n-q}</math><br />
<br />
where q is the nearest integer to <math>n/3</math>, formed by taking all strings with q 2s, together with all strings with q-1 2s and an odd number of 1s. This for instance gives the lower bound <math>c'_5 \geq 120</math>, which compares with the upper bound <math>c'_5 \leq 3 c'_4 = 129</math>.<br />
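<br />
For small n this construction can be built explicitly and checked by brute force. The following sketch (function names are illustrative) verifies both the size formula and the absence of geometric lines for n up to 5; it makes no claim about larger n beyond what the formula states.<br />
<br />
```python
from itertools import product
from math import comb

def geometric_lines(n):
    """Yield each geometric line of [3]^n once ('x' rises 1,2,3; 'y' falls 3,2,1)."""
    for t in product("123xy", repeat=n):
        wild = [ch for ch in t if ch in "xy"]
        if not wild or wild[0] != "x":
            continue
        yield tuple(
            "".join(ch if ch in "123" else ("123"[i] if ch == "x" else "321"[i])
                    for ch in t)
            for i in range(3)
        )

def construction(n, q):
    """All strings with q 2s, plus all strings with q-1 2s and an odd number of 1s."""
    pts = set()
    for p in product("123", repeat=n):
        twos, ones = p.count("2"), p.count("1")
        if twos == q or (twos == q - 1 and ones % 2 == 1):
            pts.add("".join(p))
    return pts

def is_moser(points, n):
    return not any(all(p in points for p in line) for line in geometric_lines(n))

for n in range(1, 6):
    q = round(n / 3)                      # nearest integer to n/3
    s = construction(n, q)
    print(n, q, len(s), comb(n + 1, q) * 2 ** (n - q), is_moser(s, n))
```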
<br />
Using [[DHJ(3)]], we have the upper bound<br />
<br />
: <math>c'_n = o(3^n)</math>,<br />
<br />
but no effective decay rate is known. It would be good to have a combinatorial proof of this fact (which is weaker than [[DHJ(3)]], but implies [[Roth's theorem]]).<br />
<br />
== Notation ==<br />
<br />
Given a Moser set A in <math>[3]^n</math>, we let a be the number of points in A with no 2s, b be the number of points in A with one 2, c the number of points with two 2s, etc. We call (a,b,c,...) the ''statistics'' of A. Given a slice S of <math>[3]^n</math>, we let a(S), b(S), etc. denote the statistics of that slice (dropping the fixed coordinates). Thus for instance if A = {11, 12, 22, 32}, then (a,b,c) = (1,2,1), and (a(1*),b(1*)) = (1,1).<br />
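<br />
As an illustration of this notation, here is a short sketch that computes the statistics of a set and of a slice. Points are represented as strings over 1, 2, 3 and a slice by a pattern such as <code>1*</code>; both function names are illustrative.<br />
<br />
```python
def statistics(points, n):
    """Return (a, b, c, ...): entry j counts the points with exactly j 2s."""
    stats = [0] * (n + 1)
    for p in points:
        stats[p.count("2")] += 1
    return tuple(stats)

def slice_statistics(points, pattern):
    """Statistics of the slice given by a pattern such as '1*'.

    Fixed coordinates are dropped, so the slice is viewed inside [3]^m,
    where m is the number of '*' symbols in the pattern.
    """
    free = [i for i, ch in enumerate(pattern) if ch == "*"]
    inside = [
        "".join(p[i] for i in free)
        for p in points
        if all(ch == "*" or ch == pc for ch, pc in zip(pattern, p))
    ]
    return statistics(inside, len(free))

A = {"11", "12", "22", "32"}
print(statistics(A, 2))            # (1, 2, 1)
print(slice_statistics(A, "1*"))   # (1, 1)
```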
<br />
We call a statistic (a,b,c,...) ''attainable'' if it is attained by a Moser set. We say that an attainable statistic is ''Pareto-optimal'' if it cannot be pointwise dominated by any other attainable statistic (a',b',c',...) (thus <math>a' \geq a</math>, <math>b' \geq b</math>, etc.). We say that it is ''extremal'' if it is not a convex combination of other attainable statistics (this is a stronger property than Pareto-optimal). For the purposes of maximising linear scores of attainable statistics, it suffices to check extremal statistics.<br />
<br />
We let <math>(\alpha_0,\alpha_1,\ldots)</math> be the normalized version of <math>(a,b,\ldots)</math>, in which one divides the number of points of a certain type in the set by the total number of points of that type in the cube. Thus for instance <math>\alpha_0 = a/2^n, \alpha_1 = b/(n 2^{n-1}), \alpha_2 = c/(\binom{n}{2} 2^{n-2})</math>, etc., and the <math>\alpha_i</math> range between 0 and 1. Averaging arguments show that any linear inequality obeyed by the <math>\alpha_i</math> at one dimension is automatically inherited by higher dimensions, as are shifted versions of this inequality (in which <math>\alpha_i</math> is replaced by <math>\alpha_{i+1}</math>).<br />
<br />
The idea in 'c-statistics' is to identify 1s and 3s but leave the 2s intact. Let’s use x to denote letters that are either 1 or 3; then the 81 points of <math>[3]^4</math> get split up into 16 groups: 2222 (1 point), 222x, 22x2, 2x22, x222 (two points each), 22xx, 2x2x, x22x, 2xx2, x2x2, xx22 (four points each), 2xxx, x2xx, xx2x, xxx2 (eight points each), xxxx (sixteen points). Let c(w) denote the number of points inside a group w, e.g. c(xx22) is the number of points of the form xx22 inside the set, and is thus an integer from 0 to 4.<br />
<br />
== n=0 ==<br />
<br />
We trivially have <math>c'_0=1.</math><br />
<br />
== n=1 ==<br />
<br />
We trivially have <math>c'_1=2.</math> The Pareto-optimal values of the statistics (a,b) are (1,1) and (2,0); these are also the extremals. We thus have the inequality <math>a+b \leq 2</math>, or in normalized notation<br />
<br />
:<math>2\alpha_0 + \alpha_1 \leq 2</math>.<br />
<br />
== n=2 ==<br />
<br />
We have <math>c'_2 = 6</math>; the upper bound follows since <math>c'_2 \leq 3 c'_1</math>, and the lower bound follows by deleting one of the two diagonals from <math>[3]^2</math> (these are the only extremisers).<br />
<br />
The extremiser has statistics (a,b,c) = (2,4,0), which are of course Pareto-optimal. If c=1 then we must have a, b at most 2 (look at the lines through 22). This is attainable (e.g. {11, 12, 22, 23, 31}), and so (2,2,1) is another Pareto-optimal statistic. If a=4, then b and c must be 0, so we get another Pareto-optimal statistic (4,0,0) (attainable by {11, 13, 31, 33} of course). If a=3, then c=0 and b is at most 2, giving another Pareto-optimal statistic (3,2,0); but this is a convex combination of (4,0,0) and (2,4,0) and is thus not extremal. Thus the complete set of extremal statistics is<br />
<br />
(4,0,0), (2,4,0), (2,2,1).<br />
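<br />
Since <math>[3]^2</math> has only <math>2^9 = 512</math> subsets, the Pareto-optimal statistics can also be found by exhaustive search. The sketch below is one way to do this (the variable names are illustrative); it lists the eight geometric lines of the square by hand.<br />
<br />
```python
from itertools import product

POINTS = ["".join(p) for p in product("123", repeat=2)]
LINES = [
    ["11", "12", "13"], ["21", "22", "23"], ["31", "32", "33"],   # first coordinate fixed
    ["11", "21", "31"], ["12", "22", "32"], ["13", "23", "33"],   # second coordinate fixed
    ["11", "22", "33"], ["13", "22", "31"],                       # the two diagonals
]

attainable = set()
for mask in range(1 << 9):
    s = {p for i, p in enumerate(POINTS) if mask >> i & 1}
    if any(all(p in s for p in line) for line in LINES):
        continue                      # contains a geometric line: not a Moser set
    stats = [0, 0, 0]
    for p in s:
        stats[p.count("2")] += 1
    attainable.add(tuple(stats))

def dominated(u):
    """True if some other attainable statistic is at least u in every entry."""
    return any(v != u and all(x >= y for x, y in zip(v, u)) for v in attainable)

pareto = sorted(u for u in attainable if not dominated(u))
print(pareto)   # [(2, 2, 1), (2, 4, 0), (3, 2, 0), (4, 0, 0)]
```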
<br />
The sharp linear inequalities obeyed by a,b,c (other than the trivial ones <math>a,b,c \geq 0</math>) are then<br />
<br />
*<math>2a+b+2c \leq 8</math><br />
*<math>b+2c \leq 4</math><br />
*<math>a+2c \leq 4</math><br />
*<math>c \leq 1</math>.<br />
<br />
In normalized notation, we have<br />
*<math>4\alpha_0 + 2\alpha_1 + \alpha_2 \leq 4</math><br />
*<math>2\alpha_1 + \alpha_2 \leq 2</math><br />
*<math>2\alpha_0 + \alpha_2 \leq 2</math><br />
*<math>\alpha_2 \leq 1</math>.<br />
<br />
== n=3 ==<br />
<br />
We have <math>c'_3 = 16</math>. The lower bound can be seen for instance by taking all the strings with one 2, and half the strings with no 2 (e.g. the strings with an odd number of 1s). The upper bound can be deduced from the corresponding [[upper and lower bounds]] for <math>c_3 = 18</math>; the 17-point and 18-point line-free sets each contain a geometric line.<br />
<br />
If a Moser set in <math>[3]^3</math> contains 222, then it can have at most 14 points, since the remaining 26 points in the cube split into 13 antipodal pairs, and at most one of each pair can lie in the set. By exhausting over the <math>2^{13} = 8192</math> possibilities, it can be shown that it is impossible for a 14-point set to exist; any Moser set containing 222 must in fact omit at least one antipodal pair completely and thus have only 13 points. (A human proof of this fact can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
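<br />
The exhaustion over the <math>2^{13}</math> possibilities is easy to reproduce. The following sketch (helper names are illustrative) picks one point from each antipodal pair, adds 222, and counts how many of the resulting 14-point sets are free of geometric lines.<br />
<br />
```python
from itertools import product

def geometric_lines(n):
    """Yield each geometric line of [3]^n once ('x' rises 1,2,3; 'y' falls 3,2,1)."""
    for t in product("123xy", repeat=n):
        wild = [ch for ch in t if ch in "xy"]
        if not wild or wild[0] != "x":
            continue
        yield tuple(
            "".join(ch if ch in "123" else ("123"[i] if ch == "x" else "321"[i])
                    for ch in t)
            for i in range(3)
        )

def antipode(p):
    """Swap 1s and 3s, i.e. reflect the point through the centre 222."""
    return "".join({"1": "3", "2": "2", "3": "1"}[ch] for ch in p)

CENTER = "222"
LINES = list(geometric_lines(3))
pairs = sorted({
    tuple(sorted((p, antipode(p))))
    for p in ("".join(t) for t in product("123", repeat=3))
    if p != CENTER
})

moser_14_sets = 0
for choice in product((0, 1), repeat=len(pairs)):
    s = {CENTER} | {pair[c] for pair, c in zip(pairs, choice)}
    if not any(all(p in s for p in line) for line in LINES):
        moser_14_sets += 1
print(len(pairs), moser_14_sets)
```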
<br />
The Pareto-optimal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(3,6,5,0),(4,4,5,0),(3,7,4,0),(4,6,4,0), (3,9,3,0),(4,7,3,0),(5,4,3,0),(4,9,2,0),(5,6,2,0),(6,3,2,0),(3,10,1,0),(5,7,1,0), (6,4,1,0),(4,12,0,0),(5,9,0,0),(6,6,0,0),(7,3,0,0),(8,0,0,0).<br />
<br />
These were found from a search of the <math>2^{27}</math> subsets of the cube.<br />
A spreadsheet containing these statistics can be [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# found here].<br />
<br />
The extremal statistics are<br />
<br />
(3,6,3,1),(4,4,3,1),(4,6,2,1),(2,6,6,0),(4,4,5,0),(4,6,4,0),(4,12,0,0),(8,0,0,0)<br />
<br />
The sharp linear bounds are:<br />
<br />
* <math>2a+b+2c+4d \leq 22</math><br />
* <math>3a+2b+3c+6d \leq 36</math><br />
* <math>7a+2b+4c+8d \leq 56</math><br />
* <math>6a+2b+3c+6d \leq 48</math><br />
* <math>b+c+3d \leq 12</math><br />
* <math>a+2c+4d \leq 14</math><br />
* <math>5a+4c+8d \leq 40</math><br />
* <math>a+4d \leq 8</math><br />
* <math>b+6d \leq 12</math><br />
* <math>c+3d \leq 6</math><br />
* <math>d \leq 1</math><br />
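<br />
As a consistency check, each of these bounds can be evaluated on the Pareto-optimal statistics listed earlier in this section. The sketch below copies both lists from the text; it confirms that no listed statistic violates a bound and records where each bound is attained.<br />
<br />
```python
PARETO = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0), (3, 6, 5, 0),
    (4, 4, 5, 0), (3, 7, 4, 0), (4, 6, 4, 0), (3, 9, 3, 0), (4, 7, 3, 0),
    (5, 4, 3, 0), (4, 9, 2, 0), (5, 6, 2, 0), (6, 3, 2, 0), (3, 10, 1, 0),
    (5, 7, 1, 0), (6, 4, 1, 0), (4, 12, 0, 0), (5, 9, 0, 0), (6, 6, 0, 0),
    (7, 3, 0, 0), (8, 0, 0, 0),
]

# (coefficients of a, b, c, d) and the right-hand side of each bound
BOUNDS = [
    ((2, 1, 2, 4), 22), ((3, 2, 3, 6), 36), ((7, 2, 4, 8), 56),
    ((6, 2, 3, 6), 48), ((0, 1, 1, 3), 12), ((1, 0, 2, 4), 14),
    ((5, 0, 4, 8), 40), ((1, 0, 0, 4), 8), ((0, 1, 0, 6), 12),
    ((0, 0, 1, 3), 6), ((0, 0, 0, 1), 1),
]

def lhs(coeffs, stats):
    return sum(k * v for k, v in zip(coeffs, stats))

violations = [(c, m, s) for c, m in BOUNDS for s in PARETO if lhs(c, s) > m]
tight = {(c, m): [s for s in PARETO if lhs(c, s) == m] for c, m in BOUNDS}

print(violations)                      # an empty list means every bound holds
for (c, m), where in tight.items():
    print(c, m, where)
```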
<br />
In normalized notation,<br />
* <math>8\alpha_0+ 6\alpha_1 + 6\alpha_2 + 2\alpha_3 \leq 11</math><br />
* <math>4\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 6</math><br />
* <math>7\alpha_0+3\alpha_1+3\alpha_2+\alpha_3 \leq 7</math><br />
* <math>8\alpha_0+4\alpha_1+3\alpha_2+\alpha_3 \leq 8</math><br />
* <math>4\alpha_1+2\alpha_2+\alpha_3 \leq 4</math><br />
* <math>4\alpha_0+6\alpha_2+2\alpha_3 \leq 7</math><br />
* <math>5\alpha_0+3\alpha_2+\alpha_3 \leq 5</math><br />
* <math>2\alpha_0+\alpha_3 \leq 2</math><br />
* <math>2\alpha_1+\alpha_3 \leq 2</math><br />
* <math>2\alpha_2+\alpha_3 \leq 2</math><br />
* <math>\alpha_3 \leq 1</math><br />
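<br />
The passage from an inequality in (a,b,c,d) to its normalized form is a change of variables followed by a division by the common factor. A minimal sketch, with the function name <code>normalize</code> chosen for illustration and the reduction by the gcd assumed as the presentation convention:<br />
<br />
```python
from functools import reduce
from math import comb, gcd

def normalize(coeffs, bound, n):
    """Rewrite sum_j coeffs[j] * s_j <= bound in terms of alpha_j = s_j / N_j.

    Here s_j is the number of points of the set with j 2s, and
    N_j = C(n, j) * 2^(n - j) is the number of such points in [3]^n.
    """
    scaled = [k * comb(n, j) * 2 ** (n - j) for j, k in enumerate(coeffs)]
    g = reduce(gcd, scaled + [bound])
    return tuple(v // g for v in scaled), bound // g

# 2a + b + 2c + 4d <= 22 in three dimensions
print(normalize((2, 1, 2, 4), 22, 3))   # ((8, 6, 6, 2), 11)
```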
<br />
The c-statistics can also be found from an exhaustive search of Moser sets. The resulting Pareto sets and linear inequalities can be found on Sheet 8 of [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en# this spreadsheet]<br />
<br />
== n=4 ==<br />
<br />
A [http://abel.math.umu.se/~klasm/extremal-moser-n=5-t=3 computer search] has obtained all extremisers to <math>c'_4=43</math>. The 42-point solutions can be found [http://abel.math.umu.se/~klasm/moser-n=4-t=3-p=42.gz here].<br />
<br />
'''Proof that <math>c'_4 \leq 43</math>''': <br />
When e=1 (i.e. the 4D set contains 2222) then we have at most 41 points (in fact at most 39) by counting antipodal points, so assume e=0.<br />
<br />
Define the score of a 3D slice to be a/4+b/3+c/2+d. Observe from double counting that the size of a 4D set is the sum of the scores of its eight side slices.<br />
<br />
But by [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuKZ2DyzO9EOg&hl=en looking at the extremals] we see that the largest score is 44/8, attained at only one point, namely when (a,b,c,d) = (2,6,6,0). So the only way one can have a 44-point set is if all side slices are (2,6,6,0), or equivalently if the whole set has statistics (a,b,c,d,e) = (4,16,24,0,0). But then we have all the points with two 2s, which means that no two of the four "a" points can be at Hamming distance 2. We conclude that we must have an antipodal pair among the "a" points with an odd number of 1s, and an antipodal pair among the "a" points with an even number of 1s. By the symmetries of the cube, we may take the a-set to then be 1111, 3333, 1113, 3331. But then the "b" set must exclude both 1112 and 3332, and so can have at most three points in the eight-point set xyz2 (with x,y,z=1,3) rather than four (to get four points one must alternate in a checkerboard pattern). Adding this to the at most four points of each of the forms xy2z, x2yz, 2xyz we see that b is at most 3+4+4+4=15, a contradiction. <math>\Box</math><br />
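<br />
The maximisation of the slice score only needs the extremal statistics of the n=3 section, since the score is linear. A short sketch using exact fractions (the list <code>EXTREMALS</code> is copied from that section):<br />
<br />
```python
from fractions import Fraction as F

EXTREMALS = [
    (3, 6, 3, 1), (4, 4, 3, 1), (4, 6, 2, 1), (2, 6, 6, 0),
    (4, 4, 5, 0), (4, 6, 4, 0), (4, 12, 0, 0), (8, 0, 0, 0),
]

def slice_score(stats):
    """Score a/4 + b/3 + c/2 + d of a three-dimensional side slice."""
    a, b, c, d = stats
    return F(a, 4) + F(b, 3) + F(c, 2) + d

ranked = sorted(EXTREMALS, key=slice_score, reverse=True)
best = ranked[0]
print(best, slice_score(best), 8 * slice_score(best))   # (2, 6, 6, 0) 11/2 44
```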
<br />
Given a subset of <math>[3]^4</math>, let a be the number of points with no 2s, b be the number of points with 1 2, and so forth. The quintuple (a,b,c,d,e) thus lies between (0,0,0,0,0) and (16,32,24,8,1).<br />
<br />
The 43-point solutions have distributions (a,b,c,d,e) as follows:<br />
<br />
* (5,20,18,0,0) [16 solutions]<br />
* (4,16,23,0,0) [768 solutions]<br />
* (3,16,24,0,0) [512 solutions]<br />
* (4,15,24,0,0) [256 solutions]<br />
<br />
The 42-point solutions are [http://abel.math.umu.se/~klasm/Moser-42-stat.pdf distributed as follows]:<br />
<br />
* (6,24,12,0,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-12 8 solutions]] <br />
* (5,20,17,0,0) [576 solutions]<br />
* (5,19,18,0,0) [384 solutions]<br />
* (6,16,18,2,0) [[http://abel.math.umu.se/~klasm/moser-n=4-42-e=2 192 solutions]]<br />
* (4,20,18,0,0) [272 solutions]<br />
* (5,17,20,0,0) [192 solutions]<br />
* (5,16,21,0,0) [3584 solutions]<br />
* (4,17,21,0,0) [768 solutions]<br />
* (4,16,22,0,0) [26880 solutions]<br />
* (5,15,22,0,0) [1536 solutions]<br />
* (4,15,23,0,0) [22272 solutions]<br />
* (3,16,23,0,0) [15744 solutions]<br />
* (4,14,24,0,0) [4224 solutions]<br />
* (3,15,24,0,0) [8704 solutions]<br />
* (2,16,24,0,0) [896 solutions]<br />
<br />
Note how c is usually quite large, and d quite low.<br />
<br />
One of the (6,24,12,0,0) solutions is <math>\Gamma_{220}+\Gamma_{202}+\Gamma_{022}+\Gamma_{112}+\Gamma_{211}</math> (i.e. the set of points containing exactly two 1s, and/or exactly two 3s). The other seven are reflections of this set.<br />
<br />
There are 2,765,200 41-point solutions, [http://abel.math.umu.se/~klasm/solutions-4-t=3-41-moser.gz listed here]. The statistics for such points can be [http://abel.math.umu.se/~klasm/Moser-41-stat.pdf found here]. Noteworthy features of the statistics:<br />
<br />
* d is at most 3 (and, apart from [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=3.gz 256 exceptional solutions] of the shape (5,15,18,3,0), d is at most 2; here are the [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=2.gz d=2 solutions] and [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-d=1.gz d=1 solutions]).<br />
* c is at least 6 (and, apart from [http://abel.math.umu.se/~klasm/moser-n=3-t=3-41-c=6.gz 16 exceptional solutions] of the shape (7,28,6,0,0), c is at least 11).<br />
<br />
Statistics for the 41-point, 42-point, and 43-point solutions can be found [http://spreadsheets.google.com/ccc?key=p5T0SktZY9DuqNcxJ171Bbw&hl=en here].<br />
<br />
If a Moser set in <math>[3]^4</math> contains 2222, then by the n=3 theory, each middle slice (i.e. 2***, *2**, **2*, or ***2) is missing at least one antipodal pair. But each antipodal pair belongs to at most three middle slices, so at least two of the 40 antipodal pairs must be completely missing. Each remaining antipodal pair forms a line with 2222 and so contributes at most one point. As a consequence, any Moser set containing 2222 can have at most 1+38 = 39 points.<br />
(A more refined analysis can be [http://www.ma.rhul.ac.uk/~elsholtz/WWW/blog/mosertablogv01.pdf found here].)<br />
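<br />
The counting in this argument can be reproduced with a short enumeration. This is only an illustrative sketch (the names <code>antipode</code>, <code>middle_slices</code> and <code>upper_bound</code> are assumptions made here); it takes the n=3 fact that every middle slice misses an antipodal pair as given and checks only the combinatorics around it.<br />

```python
from itertools import product

CENTER = (2, 2, 2, 2)

def antipode(p):
    """Reflect p through the centre 2222."""
    return tuple(4 - x for x in p)

# Unordered antipodal pairs {p, antipode(p)} among the 80 points other than 2222.
pairs = {
    frozenset((p, antipode(p)))
    for p in product((1, 2, 3), repeat=4)
    if p != CENTER
}

def middle_slices(pair):
    """Indices i such that both points of the pair lie in the middle slice x_i = 2."""
    p = next(iter(pair))
    return [i for i, x in enumerate(p) if x == 2]

max_slices = max(len(middle_slices(pair)) for pair in pairs)

# Each of the four middle slices misses at least one pair (n=3 theory, assumed),
# and a single missing pair can account for at most max_slices of them.
min_missing = -(-4 // max_slices)  # ceiling of 4 / max_slices

# 2222 itself, plus at most one point from each pair that is not missing.
upper_bound = 1 + len(pairs) - min_missing

print(len(pairs), max_slices, min_missing, upper_bound)
```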
<br />
We have the following inequalities connecting a,b,c,d,e:<br />
<br />
* <math>4a+b \leq 64</math>: There are 32 lines connecting two "a" points with a "b" point; each "a" point belongs to four of these lines, and each "b" point belongs to one. But each such line can have at most two points in the set, and the claim follows.<br />
* This can be refined to <math>4a+b+\frac{2}{3} c \leq 64</math>: There are 24 planes connecting four "a" points, four "b" points, and one "c" point; each "a" point belongs to six of these, each "b" point belongs to three, and each "c" point belongs to one. For each of these planes, we have <math>2a + b + 2c \leq 8</math> from the n=2 theory, and the claim follows.<br />
* <math>6a+2c \leq 96</math>: There are 48 lines connecting two "a" points with a "c" point; each "a" point belongs to six of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
* <math>3b+2c \leq 96</math>: There are 48 lines connecting two "b" points to a "c" point; each "b" point belongs to three of these lines, and each "c" point belongs to two. But each such line can have at most two points in the set, and the claim follows.<br />
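<br />
The line and plane counts used in these inequalities can be confirmed by enumerating <math>[3]^4</math> directly, and the 42-point statistics listed earlier can be tested against the four inequalities. The sketch below is a sanity check under stated assumptions: the names <code>line_types</code>, <code>planes</code>, <code>STATS_42</code> and <code>satisfies</code> are introduced here for illustration, and the refined inequality is multiplied by 3 to stay in integers.<br />

```python
from collections import Counter
from itertools import combinations, product

POINTS = list(product((1, 2, 3), repeat=4))

def twos(p):
    """Number of coordinates equal to 2 (0 for "a" points, 1 for "b", 2 for "c", ...)."""
    return p.count(2)

# Classify every geometric line by (class of its endpoints, class of its midpoint).
line_types = Counter()
for p, r in combinations(POINTS, 2):
    if all((x + y) % 2 == 0 for x, y in zip(p, r)):
        mid = tuple((x + y) // 2 for x, y in zip(p, r))
        line_types[(twos(p), twos(mid))] += 1

# Planes with four "a", four "b" and one "c" point: two free coordinates,
# the other two fixed to values in {1, 3}.
planes = [
    (free, fixed)
    for free in combinations(range(4), 2)
    for fixed in product((1, 3), repeat=2)
]

# Statistics (a, b, c, d, e) of the 42-point solutions, copied from the list above.
STATS_42 = [
    (6, 24, 12, 0, 0), (5, 20, 17, 0, 0), (5, 19, 18, 0, 0), (6, 16, 18, 2, 0),
    (4, 20, 18, 0, 0), (5, 17, 20, 0, 0), (5, 16, 21, 0, 0), (4, 17, 21, 0, 0),
    (4, 16, 22, 0, 0), (5, 15, 22, 0, 0), (4, 15, 23, 0, 0), (3, 16, 23, 0, 0),
    (4, 14, 24, 0, 0), (3, 15, 24, 0, 0), (2, 16, 24, 0, 0),
]

def satisfies(stat):
    """Check the four inequalities; 4a + b + (2/3)c <= 64 is scaled by 3."""
    a, b, c, d, e = stat
    return (
        4 * a + b <= 64
        and 12 * a + 3 * b + 2 * c <= 192
        and 6 * a + 2 * c <= 96
        and 3 * b + 2 * c <= 96
    )

print(line_types[(0, 1)], line_types[(0, 2)], line_types[(1, 2)], len(planes))
print(all(satisfies(s) for s in STATS_42))
```

For the (6,24,12,0,0), (4,20,18,0,0) and (2,16,24,0,0) statistics the last inequality <math>3b+2c \leq 96</math> holds with equality.<br />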
<br />
The inequalities for n=3 imply inequalities for n=4. Indeed, there are eight side slices of <math>[3]^4</math>; each "a" point belongs to four of these, each "b" point belongs to three, each "c" point belongs to two, and each "d" point belongs to one. Thus, any inequality of the form<br />
:<math> \alpha a + \beta b + \gamma c + \delta d \leq M</math>&l