# Main Page

## The Problem

Let [math][3]^n[/math] be the set of all length [math]n[/math] strings over the alphabet [math]1, 2, 3[/math]. A *combinatorial line* is a set of three points in [math][3]^n[/math], formed by taking a string with one or more wildcards [math]x[/math] in it, e.g., [math]112x1xx3\ldots[/math], and replacing those wildcards by [math]1, 2[/math] and [math]3[/math], respectively. In the example given, the resulting combinatorial line is: [math]\{ 11211113\ldots, 11221223\ldots, 11231333\ldots \}[/math]. A subset of [math][3]^n[/math] is said to be *line-free* if it contains no lines. Let [math]c_n[/math] be the size of the largest line-free subset of [math][3]^n[/math].
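The definitions above can be checked mechanically for very small [math]n[/math]. The following is a minimal sketch, not part of the project's own tooling: the function names `combinatorial_line`, `all_lines` and `largest_line_free_size` are hypothetical, strings over `"123"` stand for points of [math][3]^n[/math], and the search for [math]c_n[/math] is plain brute force, so it is only practical for [math]n \le 2[/math].

```python
from itertools import combinations, product


def combinatorial_line(template, wildcard="x"):
    """Expand a template such as '112x1xx3' into its three points."""
    if wildcard not in template:
        raise ValueError("a template needs at least one wildcard")
    # Replace every wildcard by 1, then by 2, then by 3.
    return [template.replace(wildcard, symbol) for symbol in "123"]


def all_lines(n):
    """All combinatorial lines in [3]^n, each as a frozenset of three strings."""
    lines = []
    for chars in product("123x", repeat=n):
        template = "".join(chars)
        if "x" in template:
            lines.append(frozenset(combinatorial_line(template)))
    return lines


def largest_line_free_size(n):
    """Brute-force c_n: try subsets from largest to smallest (tiny n only)."""
    points = ["".join(p) for p in product("123", repeat=n)]
    lines = all_lines(n)
    for size in range(len(points), -1, -1):
        for subset in combinations(points, size):
            chosen = set(subset)
            # A subset is line-free if no line is entirely contained in it.
            if not any(line <= chosen for line in lines):
                return size
    return 0


if __name__ == "__main__":
    print(combinatorial_line("112x1xx3"))
    print([largest_line_free_size(n) for n in (1, 2)])
```

The template `112x1xx3` reproduces the first eight symbols of the example line in the text. Already for [math]n=3[/math] there are [math]2^{27}[/math] subsets, so anything beyond the smallest cases needs a smarter search.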

**Density Hales-Jewett (DHJ) theorem:** [math]\lim_{n \rightarrow \infty} c_n/3^n = 0[/math]

The original proof of DHJ used arguments from ergodic theory. The basic problem to be considered by the Polymath project is to explore a particular combinatorial approach to DHJ, suggested by Tim Gowers.

## Threads

- (1-199) A combinatorial approach to density Hales-Jewett (inactive)
- (200-299) Upper and lower bounds for the density Hales-Jewett problem (active)
- (300-399) The triangle-removal approach (inactive)
- (400-499) Quasirandomness and obstructions to uniformity (final call)
- (500-599) TBA
- (600-699) A reading seminar on density Hales-Jewett (active)

Here is a further list of blog posts related to the Polymath1 project. Here is wordpress's list.

A spreadsheet containing the latest lower and upper bounds for [math]c_n[/math] can be found here.

## Unsolved questions

**Gowers.462:** Incidentally, it occurs to me that we as a collective are doing what I as an individual mathematician do all the time: have an idea that leads to an interesting avenue to explore, get diverted by some temporarily more exciting idea, and forget about the first one. I think we should probably go through the various threads and collect together all the unsolved questions we can find (even if they are vague ones like, “Can an approach of the following kind work?”) and write them up in a single post. If this were a more massive collaboration, then we could work on the various questions in parallel, and update the post if they got answered, or reformulated, or if new questions arose.

### IP-Szemeredi (a weaker problem than DHJ)

**Solymosi.2:** In this note I will try to argue that we should consider a variant of the original problem first. If the removal technique doesn’t work here, then it won’t work in the more difficult setting. If it works, then we have a nice result! Consider the Cartesian product of an IP_d set. (An IP_d set is generated by [math]d[/math] numbers by taking all the [math]2^d[/math] possible sums. So, if the [math]d[/math] numbers are independent then the size of the IP_d set is [math]2^d[/math]. In the following statements we will suppose that our IP_d sets have size [math]2^d[/math].)

Prove that for any [math]c\gt0[/math] there is a [math]d[/math], such that any [math]c[/math]-dense subset of the Cartesian product of an IP_d set (it is a two dimensional pointset) has a corner.

The statement is true. One can even prove that the dense subset of a Cartesian product contains a square, by using the density HJ for [math]k=4[/math]. (I will sketch the simple proof later.) What is promising here is that one can build a not-very-large tripartite graph where we can try to prove a removal lemma. The vertex sets are the vertical, horizontal, and slope -1 lines that intersect the Cartesian product. Two vertices are connected by an edge if the corresponding lines meet in a point of our [math]c[/math]-dense subset. Every point defines a triangle, and if you can find another, non-degenerate, triangle then we are done. This graph is still sparse, but maybe it is well-structured for a removal lemma.

Finally, let me prove that there is a square if [math]d[/math] is large enough compared to [math]c[/math]. Every point of the Cartesian product has two coordinates, each a 0,1 sequence of length [math]d[/math]. The product has a one-to-one mapping to [math][4]^d[/math]: given a point [math]((x_1,\ldots,x_d),(y_1,\ldots,y_d))[/math] where [math]x_i,y_j[/math] are 0 or 1, it maps to [math](z_1,\ldots,z_d)[/math], where [math]z_i=0[/math] if [math]x_i=y_i=0[/math], [math]z_i=1[/math] if [math]x_i=1[/math] and [math]y_i=0[/math], [math]z_i=2[/math] if [math]x_i=0[/math] and [math]y_i=1[/math], and finally [math]z_i=3[/math] if [math]x_i=y_i=1[/math]. Any combinatorial line in [math][4]^d[/math] defines a square in the Cartesian product, so the density HJ implies the statement.
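The encoding in this argument is simply [math]z_i = x_i + 2y_i[/math], and it may help to see a line in [math][4]^d[/math] turn into a square concretely. The sketch below is illustrative only: the helper names are hypothetical, the alphabet of [math][4]^d[/math] is taken to be [math]\{0,1,2,3\}[/math] as in the comment, and `None` marks a wildcard position in a template.

```python
def encode(x, y):
    """Map a pair of 0/1 tuples to a word over {0,1,2,3} via z_i = x_i + 2*y_i."""
    return tuple(xi + 2 * yi for xi, yi in zip(x, y))


def decode(z):
    """Inverse of encode: recover the pair of 0/1 tuples from z."""
    x = tuple(zi % 2 for zi in z)
    y = tuple(zi // 2 for zi in z)
    return x, y


def line_in_4d(template):
    """The four points of a combinatorial line; None marks a wildcard."""
    if None not in template:
        raise ValueError("a template needs at least one wildcard")
    return [tuple(s if t is None else t for t in template) for s in range(4)]


def square_from_line(template, generators):
    """Image of a line in the Cartesian product of the IP set of `generators`."""
    points = []
    for z in line_in_4d(template):
        x, y = decode(z)
        # Each 0/1 tuple selects a subset of the generators to sum.
        px = sum(g for g, xi in zip(generators, x) if xi)
        py = sum(g for g, yi in zip(generators, y) if yi)
        points.append((px, py))
    return points


if __name__ == "__main__":
    # Generators 1, 2, 4 are independent; the wildcard sits at the generator 2.
    print(square_from_line((1, None, 2), (1, 2, 4)))
```

With generators 1, 2, 4 and the template `(1, None, 2)`, the four points are the vertices of an axis-parallel square of side 2, the side being the sum of the generators at the wildcard positions.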

**Gowers.7:** With reference to Jozsef’s comment, if we suppose that the [math]d[/math] numbers used to generate the set are indeed independent, then it’s natural to label a typical point of the Cartesian product as [math](\epsilon,\eta)[/math], where each of [math]\epsilon[/math] and [math]\eta[/math] is a 01-sequence of length [math]d[/math]. Then a corner is a triple of the form [math](\epsilon,\eta), (\epsilon,\eta+\delta), (\epsilon+\delta,\eta)[/math], where [math]\delta[/math] is a [math]\{-1,0,1\}[/math]-valued sequence of length [math]d[/math] with the property that both [math]\epsilon+\delta[/math] and [math]\eta+\delta[/math] are 01-sequences. So the question is whether corners exist in every dense subset of the original Cartesian product.

This is simpler than the density Hales-Jewett problem in at least one respect: it involves 01-sequences rather than 012-sequences. But that simplicity may be slightly misleading because we are looking for corners in the Cartesian product. A possible disadvantage is that in this formulation we lose the symmetry of the corners: the horizontal and vertical lines will intersect this set in a different way from how the lines of slope -1 do.

I feel that this is a promising avenue to explore, but I would also like a little more justification of the suggestion that this variant is likely to be simpler.

**Gowers.22:** A slight variant of the problem you propose is this. Let’s take as our ground set the set of all pairs [math](U,V)[/math] of subsets of [math][n][/math], and let’s take as our definition of a corner a triple of the form [math](U,V), (U\cup D,V), (U,V\cup D)[/math], where both the unions must be disjoint unions. This is asking for more than you asked for because I insist that the difference [math]D[/math] is positive, so to speak. It seems to be a nice combination of Sperner’s theorem and the usual corners result. But perhaps it would be more sensible not to insist on that positivity and instead ask for a triple of the form [math](U,V), ((U\cup D)\setminus C,V), (U, (V\cup D)\setminus C)[/math], where [math]D[/math] is disjoint from both [math]U[/math] and [math]V[/math] and [math]C[/math] is contained in both [math]U[/math] and [math]V[/math]. That is your original problem I think.

I think I now understand better why your problem could be a good toy problem to look at first. Let’s quickly work out what triangle-removal statement would be needed to solve it. (You’ve already done that, so I just want to reformulate it in set-theoretic language, which I find easier to understand.) We let all of [math]X[/math], [math]Y[/math] and [math]Z[/math] equal the power set of [math][n][/math]. We join [math]U\in X[/math] to [math]V\in Y[/math] if [math](U,V)\in A[/math].

Ah, I see now that there’s a problem with what I’m suggesting, which is that in the normal corners problem we say that [math](x,y+d)[/math] and [math](x+d,y)[/math] lie in a line because both points have the same coordinate sum. When should we say that [math](U,V\cup D)[/math] and [math](U\cup D,V)[/math] lie in a line? It looks to me as though we have to treat the sets as 01-sequences and take the sum again. So it’s not really a set-theoretic reformulation after all.

**O'Donnell.35:** Just to confirm I have the question right…

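There is a dense subset [math]A[/math] of [math]\{0,1\}^n \times \{0,1\}^n[/math]. Is it true that it must contain three nonidentical strings [math](x,x'), (y,y'), (z,z')[/math] such that for each [math]i = 1,\ldots,n[/math], the 6 bits

[math]\begin{pmatrix} x_i & x'_i \\ y_i & y'_i \\ z_i & z'_i \end{pmatrix}[/math]

are equal to one of the following?

[math]\begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{pmatrix}[/math]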
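The six per-coordinate patterns in this question can be derived from the description of a corner in Gowers.7 as [math](\epsilon,\eta), (\epsilon,\eta+\delta), (\epsilon+\delta,\eta)[/math]. The following is a small sketch of that enumeration; `corner_patterns` is a hypothetical name, and each pattern is stored as three rows [math](x_i,x'_i), (y_i,y'_i), (z_i,z'_i)[/math].

```python
from itertools import product


def corner_patterns():
    """Enumerate the allowed 3x2 bit patterns for one coordinate of a corner."""
    patterns = set()
    for eps, eta, delta in product((0, 1), (0, 1), (-1, 0, 1)):
        # Both eps + delta and eta + delta must remain 0/1 values.
        if eps + delta in (0, 1) and eta + delta in (0, 1):
            rows = ((eps, eta), (eps, eta + delta), (eps + delta, eta))
            patterns.add(rows)
    return sorted(patterns)


if __name__ == "__main__":
    for rows in corner_patterns():
        print(rows)
```

Four patterns come from [math]\delta_i = 0[/math] (all three rows equal), one from [math]\delta_i = 1[/math] and one from [math]\delta_i = -1[/math]. The requirement that the three strings be nonidentical corresponds to [math]\delta_i \ne 0[/math] for at least one [math]i[/math].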

**McCutcheon.469:** IP Roth:

Just to be clear on the formulation I had in mind: for every [math]\delta\gt0[/math] there is an [math]n[/math] such that any [math]E\subset [n]^{[n]}\times [n]^{[n]}[/math] having relative density at least [math]\delta[/math] contains a corner of the form [math]\{a, a+(\sum_{i\in \alpha} e_i ,0),a+(0, \sum_{i\in \alpha} e_i)\}[/math]. Here [math](e_i)[/math] is the coordinate basis for [math][n]^{[n]}[/math], i.e. [math]e_i(j)=\delta_{ij}[/math].

Presumably, this should be (perhaps much) simpler than DHJ, k=3.

### High-dimensional Sperner

**Kalai.29:** There is an analogue for Sperner but with high dimensional combinatorial spaces instead of "lines" but I do not remember the details (Kleitman(?) Katona(?) those are usual suspects.)

### Fourier approach

**Kalai.29:** A sort of generic attack one can try with Sperner is to look at [math]f=1_A[/math] and express using the Fourier expansion of [math]f[/math] the expression [math]\int f(x)f(y)1_{x\lt y}[/math] where [math]x\lt y[/math] is the partial order (=containment) for 0-1 vectors. Then one may hope that if [math]f[/math] does not have a large Fourier coefficient then the expression above is similar to what we get when [math]A[/math] is random and otherwise we can raise the density for subspaces. (OK, you can try it directly for the [math]k=3[/math] density HJ problem too but Sperner would be easier;) This is not unrelated to the regularity philosophy.

**Gowers.31:** Gil, a quick remark about Fourier expansions and the [math]k=3[/math] case. I want to explain why I got stuck several years ago when I was trying to develop some kind of Fourier approach. Maybe with your deep knowledge of this kind of thing you can get me unstuck again.

The problem was that the natural Fourier basis in [math][3]^n[/math] was the basis you get by thinking of [math][3]^n[/math] as the group [math]\mathbb{Z}_3^n[/math]. And if that’s what you do, then there appear to be examples that do not behave quasirandomly, but which do not have large Fourier coefficients either. For example, suppose that [math]n[/math] is a multiple of 7, and you look at the set [math]A[/math] of all sequences where the numbers of 1s, 2s and 3s are all multiples of 7. If two such sequences lie in a combinatorial line, then the set of variable coordinates for that line must have cardinality that’s a multiple of 7, from which it follows that the third point of the line automatically lies in [math]A[/math]. So this set [math]A[/math] has too many combinatorial lines. But I’m fairly sure (perhaps you can confirm this) that [math]A[/math] has no large Fourier coefficient.

You can use this idea to produce lots more examples. Obviously you can replace 7 by some other small number. But you can also pick some arbitrary subset [math]W[/math] of [math][n][/math] and just ask that the numbers of 1s, 2s and 3s inside [math]W[/math] are multiples of 7.
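The structural claim in this example (two points of a combinatorial line in the set force the third) can be checked by brute force for small parameters. The sketch below takes up the remark that 7 can be replaced by another small number and uses modulus 2 with [math]n=4[/math] so that the search stays tiny; the function names are hypothetical. It says nothing about the Fourier coefficients of the set, which is the part of the comment left open.

```python
from itertools import product


def in_A(word, m):
    """True if the numbers of 1s, 2s and 3s in word are all multiples of m."""
    return all(word.count(symbol) % m == 0 for symbol in "123")


def line_hit_counts(n, m):
    """Count lines in [3]^n with at least two points in A, and with exactly two."""
    at_least_two = 0
    exactly_two = 0
    for chars in product("123x", repeat=n):
        template = "".join(chars)
        if "x" not in template:
            continue  # no wildcard, so this is a point and not a line
        hits = sum(in_A(template.replace("x", s), m) for s in "123")
        if hits >= 2:
            at_least_two += 1
        if hits == 2:
            exactly_two += 1
    return at_least_two, exactly_two


if __name__ == "__main__":
    print(line_hit_counts(4, 2))
```

For [math]n=4[/math] and modulus 2, no line meets the set in exactly two points, in line with the counting argument in the comment: if two points of a line lie in the set, the number of wildcards is a multiple of the modulus, and so the third point lies in the set as well.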

### DHJ for dense subsets of a random set

**Tao.18:** A sufficiently good Varnavides type theorem for DHJ may have a separate application from the one in this project, namely to obtain a “relative” DHJ for dense subsets of a sufficiently pseudorandom subset of [math][3]^n[/math], much as I did with Ben Green for the primes (and which now has a significantly simpler proof by Gowers and by Reingold-Trevisan-Tulsiani-Vadhan). There are other obstacles though to that task (e.g. understanding the analogue of “dual functions” for Hales-Jewett), and so this is probably a bit off-topic.

## Bibliography

- H. Furstenberg, Y. Katznelson, “A density version of the Hales-Jewett theorem for k=3”, Graph Theory and Combinatorics (Cambridge, 1988). Discrete Math. 75 (1989), no. 1-3, 227–241.
- R. McCutcheon, “The conclusion of the proof of the density Hales-Jewett theorem for k=3”, unpublished.
- H. Furstenberg, Y. Katznelson, “A density version of the Hales-Jewett theorem”, J. Anal. Math. 57 (1991), 64–119.