# Ergodic perspective

In a nutshell, the philosophy is this:

- Study a global dense object by looking at a random local subobject. This trades a globally dense deterministic object for a pointwise dense random object. Furthermore, this random object obeys various symmetry (or more precisely, stationarity) properties.

So, one gets much more structure on the object being studied, provided that one lets go of the idea of thinking of this object deterministically, and instead embracing its random nature.

Let me give some concrete examples of this philosophy. Firstly, Roth's theorem in [math]{\Bbb Z}/N{\Bbb Z}[/math], with N assumed prime for simplicity. To locate length three progressions in a set A, what one can do is pick a medium size number m, and look at a random arithmetic progression of length m in [math]{\Bbb Z}/N{\Bbb Z}[/math]. One can then pull back the deterministic set A by this progression to create a random subset A' of [m]. If we can find a progression of length 3 in A’ with positive probability, then we get a progression of length 3 in A (in fact, we get lots of such progressions; this is essentially the proof of Varnavides' theorem).

The original deterministic set A had global density [math]\geq \delta[/math]. Because of this, the random set A' has a better property: it has a *pointwise* density [math]\geq \delta[/math], in the sense that every j in [m] belongs to A' with probability [math]\geq \delta[/math]. From linearity of expectation this implies that the expected density of A' with respect to uniform measure of [m] is [math]\geq \delta[/math], but it is far stronger - indeed, it implies that the expected density of A' with respect to *any* probability measure on [m] is [math]\geq \delta[/math].

Similarly, the deterministic set A has no symmetry properties: A+1, for instance, may well be a completely different set from A. However, the random set A' has an important symmetry property: the random set A'+1, while being different from A', has the same probability distribution as A' (ignoring some trivial issues at the boundary of [m]; in practice one eliminates this by taking the limit [math]m \to \infty[/math]). In the language of probability theory, we say that the probability distribution of A' is *stationary*; in the language of ergodic theory, we say that the transformation is a measure-preserving transformation with respect to the probability distribution of A'.

Now, we know that the events [math]n \in A'[/math], [math]n+r \in A'[/math], [math]n+2r \in A'[/math] each occur with probability [math]\geq \delta[/math]. If they were independent, then we would have with probability and we would get Roth's theorem. But of course they could be correlated - but one can try to quantify this correlation by a variety of tools, e.g. using the spectral theory of the measure-preserving transformation T (this is the analogue of using Fourier analysis in the original combinatorial setting). This is the starting point for the ergodic theory method for solving density Ramsey theory problems.

Now we return to Hales-Jewett. Similarly to above, if we wish to study a subset A of which has density [math]\delta[/math] wrt uniform measure, we can take a random m-dimensional subspace of [math][3]^n[/math] (where I will be a little vague for now as to exactly what “random” means; there are a variety of options here) and pull back the deterministic set A to get a random subset A' of [math][3]^m[/math]. Again, finding lines in A' will generate lines in A (and there will be an appropriate Varnavides statement).

Also as before, the global density of ensures that the random subset A' has pointwise density very close to [math]\delta[/math] (there is a slight distortion here, but it disappears in the limit [math]n \to \infty[/math] Thus, the expected density of A' with respect to any particular measure on - be it uniform, slices-equal, or anything else - is basically [math]\delta[/math].

It is at this point that the three approaches to the problem (ergodic approach, Tim's approach, and Terry's approach) begin to diverge.

- The ergodic theory approach would then proceed by studying correlations between events such as [math]w^0 \in A', w^1 \in A', w^2 \in A'[/math]. If these events are independent, we are done; otherwise, we hope to discern some usable structural information on out of this. The original set A is not used at all in the ergodic theory approach; its only legacies are the pointwise density and stationary properties it bestows on A'.

- Tim's approach is proceeding by using the first moment method to locate a specific deterministic instance of which is dense wrt slices-equal measure and then proceeding from there. As with the ergodic theory method, the original set A (which was dense in the uniform measure, rather than slices-equal) is now discarded, and one is now working entirely with the slices-equal-dense A'. As I understand it, Tim is then trying to show that lack of combinatorial lines in will force some new structure on A'.

- Terry's approach starts off similarly to the ergodic theory approach - studying correlations between events such as [math]w^0 \in A', w^1 \in A', w^2 \in A'[/math] - but the difference is that once a correlation is detected, one tries to see what this tells me about the original set A, in particular to show that it has some “low-influence”, “insensitivity”, “mostly continuous”, or “monotone” like properties. I have a vague idea of trying to iterate this procedure until we have enough structure on to either pass to a denser subspace or find combinatorial lines directly, but I have not worked this part out yet.