Quasirandomness

Quasirandomness is a central concept in extremal combinatorics, and is likely to play an important role in any combinatorial proof of the density Hales-Jewett theorem. This will be particularly true if that proof is based on the density increment method or on some kind of generalization of Szemerédi's regularity lemma.

In general, one has some kind of parameter associated with a set, which in our case will be the number of combinatorial lines it contains, and one would like a deterministic definition of the word "quasirandom" with the following key property.

• Every quasirandom set $\mathcal{A}$ has roughly the same value of the given parameter as a random set of the same density.

Needless to say, this is not the only desirable property of the definition, since otherwise we could just define $\mathcal{A}$ to be quasirandom if it has roughly the same value of the given parameter as a random set of the same density. The second key property is this.

• Every set $\mathcal{A}$ that fails to be quasirandom has some other property that we can exploit.

These two properties are already discussed in some detail in the article on the density increment method: this article concentrates more on examples of quasirandomness in other contexts, and possible definitions of quasirandomness connected with the density Hales-Jewett theorem.

Examples of quasirandomness definitions

Bipartite graphs

Let X and Y be two finite sets and let $f:X\times Y\rightarrow [-1,1].$ Then f is defined to be c-quasirandom if $\mathbb{E}_{x,x'\in X}\mathbb{E}_{y,y'\in Y}f(x,y)f(x,y')f(x',y)f(x',y')\leq c.$

Since the left-hand side is equal to $\mathbb{E}_{x,x'\in X}(\mathbb{E}_{y\in Y}f(x,y)f(x',y))^2,$ it is always non-negative, and the condition that it should be small implies that $\mathbb{E}_{y\in Y}f(x,y)f(x',y)$ is small for almost every pair $x,x'.$
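
The passage from "the average of squares is small" to "small for almost every pair" is an application of Markov's inequality. The following worked step spells it out; the symbol $g$ and the threshold $c^{1/4}$ are illustrative choices, not part of the definition above.

```latex
\[
g(x,x') := \mathbb{E}_{y\in Y} f(x,y)f(x',y), \qquad
\mathbb{E}_{x,x'\in X}\, g(x,x')^2 \le c .
\]
\[
\mathbb{P}_{x,x'}\bigl(|g(x,x')| \ge c^{1/4}\bigr)
= \mathbb{P}_{x,x'}\bigl(g(x,x')^2 \ge c^{1/2}\bigr)
\le \frac{\mathbb{E}_{x,x'}\, g(x,x')^2}{c^{1/2}}
\le c^{1/2}.
\]
```

So for all but at most a proportion $c^{1/2}$ of pairs $(x,x')$, the inner average has absolute value less than $c^{1/4}.$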

If G is a bipartite graph with vertex sets X and Y and $\delta$ is the density of G, then we can define $f(x,y)$ to be $1-\delta$ if xy is an edge of G and $-\delta$ otherwise. We call f the balanced function of G, and we say that G is c-quasirandom if its balanced function is c-quasirandom.
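
As a rough numerical illustration of this definition, here is a minimal sketch using NumPy. The function names, the graph size and the two example graphs are assumptions made for the illustration, not part of the original discussion. It computes the balanced function of a bipartite graph given as a 0/1 matrix and evaluates the four-fold average through the sum-of-squares form.

```python
import numpy as np

def balanced_function(adj):
    """Balanced function of a bipartite graph given as a 0/1 matrix (rows = X, columns = Y)."""
    adj = np.asarray(adj, dtype=float)
    delta = adj.mean()      # edge density of G
    return adj - delta      # 1 - delta on edges, -delta on non-edges

def four_cycle_average(f):
    """E_{x,x'} (E_y f(x,y) f(x',y))^2, the sum-of-squares form of the four-fold average."""
    f = np.asarray(f, dtype=float)
    inner = f @ f.T / f.shape[1]    # inner[x, x'] = E_y f(x,y) f(x',y)
    return float(np.mean(inner ** 2))

rng = np.random.default_rng(0)
n = 200  # illustrative size, |X| = |Y| = n

# Random bipartite graph of density about 1/2.
random_graph = rng.random((n, n)) < 0.5

# Hypothetical structured example: disjoint union of two complete bipartite graphs,
# which also has density exactly 1/2.
half = n // 2
block_graph = np.zeros((n, n), dtype=bool)
block_graph[:half, :half] = True
block_graph[half:, half:] = True

q_random = four_cycle_average(balanced_function(random_graph))
q_block = four_cycle_average(balanced_function(block_graph))
print(q_random, q_block)
```

For the block graph the inner average is $\pm 1/4$ for every pair $x,x',$ so the quantity equals $1/16$ however large the graph is, whereas for the random graph it should be of order $1/n.$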

It can be shown that if H is any fixed graph and G is a large quasirandom graph, then the number of copies of H in G is approximately what it would be in a random graph of the same density as G.

Subsets of finite Abelian groups

If A is a subset of a finite Abelian group G and A has density $\delta,$ then we define the balanced function f of A by setting $f(x)=1-\delta$ when $x\in A$ and $f(x)=-\delta$ otherwise. Then A is c-quasirandom if and only if f is c-quasirandom, and f is defined to be c-quasirandom if $\mathbb{E}_{x,a,b\in G}f(x)f(x+a)f(x+b)f(x+a+b)\leq c.$ Again, we can prove positivity by observing that the left-hand side is a sum of squares. In this case, it is $\mathbb{E}_{a\in G}(\mathbb{E}_{x\in G}f(x)f(x+a))^2.$
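
The same quantity can be evaluated numerically in the cyclic group $\mathbb{Z}_N.$ The sketch below makes several assumptions for illustration: the function names, the choice $N=1009,$ and the two example sets. It uses the sum-of-squares form, with `np.roll` providing the shift $x\mapsto x+a$ mod N.

```python
import numpy as np

def balanced_function(indicator):
    """Balanced function of a subset of Z_N given by a 0/1 indicator vector."""
    ind = np.asarray(indicator, dtype=float)
    return ind - ind.mean()     # 1 - delta on A, -delta off A

def u2_average(f):
    """E_a (E_x f(x) f(x+a))^2 for a real function f on Z_N."""
    f = np.asarray(f, dtype=float)
    n = len(f)
    # np.roll(f, -a)[x] equals f[(x + a) % n]
    corr = np.array([np.mean(f * np.roll(f, -a)) for a in range(n)])
    return float(np.mean(corr ** 2))

N = 1009  # illustrative odd modulus
rng = np.random.default_rng(0)
random_set = rng.random(N) < 0.5        # random set of density about 1/2
interval = np.arange(N) < N // 2        # structured set: an interval of density about 1/2

q_random = u2_average(balanced_function(random_set))
q_interval = u2_average(balanced_function(interval))
print(q_random, q_interval)
```

One would expect the random set to give a value of order $1/N,$ while the interval gives a value bounded away from zero, so the interval is not c-quasirandom for small c.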

If G has odd order, then it can be shown that a quasirandom set A contains approximately the same number of triples $(x,x+d,x+2d)$ as a random subset of the same density. However, it is decidedly not the case that A must contain approximately the same number of arithmetic progressions of higher length (regardless of torsion assumptions on G). For that one must use "higher uniformity".
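
To make the statement about triples concrete, here is a small sketch that counts triples $(x,x+d,x+2d)$ in $\mathbb{Z}_N$ and compares the result with $\delta^3,$ the proportion one expects for a random set of density $\delta.$ The function name, the modulus $N=1009$ and the example sets are assumptions chosen for illustration. The degenerate case $d=0$ is included in the count.

```python
import numpy as np

def three_ap_density(indicator):
    """Fraction of pairs (x, d) in Z_N x Z_N with x, x+d, x+2d all in the set (d = 0 included)."""
    ind = np.asarray(indicator, dtype=float)
    n = len(ind)
    total = 0.0
    for d in range(n):
        # np.roll(ind, -s)[x] equals ind[(x + s) % n]
        total += np.sum(ind * np.roll(ind, -d) * np.roll(ind, -2 * d))
    return total / n**2

N = 1009  # illustrative odd modulus
rng = np.random.default_rng(0)
random_set = rng.random(N) < 0.25       # random set of density about 1/4
interval = np.arange(N) < N // 4        # structured set: an interval of density about 1/4

ap_random = three_ap_density(random_set)
ap_interval = three_ap_density(interval)
print(ap_random, random_set.mean() ** 3)
print(ap_interval, interval.mean() ** 3)
```

The random set should come out close to $\delta^3,$ while the interval, which is far from quasirandom, contains roughly twice as many triples as a random set of the same density.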

Subsets of grids

A function f from $[n]^2$ to $[-1,1]$ is c-quasirandom if the "sum over rectangles" is at most c. The sum over rectangles is $\mathbb{E}_{x,y,a,b}f(x,y)f(x+a,y)f(x,y+b)f(x+a,y+b)$. Again, it is easy to show that this sum is non-negative by expressing it as a sum of squares. And again, one defines a subset $A\subset[n]^2$ to be c-quasirandom if its balanced function is c-quasirandom.
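
The sum-of-squares expression can be written out as follows. This reads the average as a uniform average over $x,x',y,y'\in[n]$ with $x'=x+a$ and $y'=y+b;$ that reading is an assumption, since the text does not say how $x+a$ and $y+b$ are kept inside $[n].$

```latex
\begin{align*}
\mathbb{E}_{x,x',y,y'\in[n]} f(x,y)f(x',y)f(x,y')f(x',y')
&= \mathbb{E}_{x,x'}\Bigl(\mathbb{E}_{y} f(x,y)f(x',y)\Bigr)\Bigl(\mathbb{E}_{y'} f(x,y')f(x',y')\Bigr) \\
&= \mathbb{E}_{x,x'}\Bigl(\mathbb{E}_{y} f(x,y)f(x',y)\Bigr)^2 \;\ge\; 0.
\end{align*}
```

Under this reading the definition coincides with the bipartite-graph definition with $X=Y=[n].$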

If A is a c-quasirandom set of density $\delta$ and c is sufficiently small, then A contains roughly the same number of corners, that is, triples of the form $(x,y),(x+d,y),(x,y+d),$ as a random subset of $[n]^2$ of density $\delta.$
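
Here is a small sketch that counts corners, meaning triples $(x,y),(x+d,y),(x,y+d)$ with $d\geq 1$ that fit inside the grid. The function name, the grid size and the two example sets are assumptions for illustration only. A random set is compared with a "checkerboard" set ($x+y$ even), whose balanced function has rectangle sum $1/16$ and which is therefore not quasirandom.

```python
import numpy as np

def corner_density(grid):
    """Fraction of corners (x,y), (x+d,y), (x,y+d), d >= 1, inside the grid that lie wholly in the set."""
    g = np.asarray(grid, dtype=bool)
    n = g.shape[0]
    found = 0
    possible = 0
    for d in range(1, n):
        # g[:-d, :-d] is the point (x, y), g[d:, :-d] is (x+d, y), g[:-d, d:] is (x, y+d)
        found += int(np.sum(g[:-d, :-d] & g[d:, :-d] & g[:-d, d:]))
        possible += (n - d) ** 2
    return found / possible

n = 80  # illustrative grid size
rng = np.random.default_rng(0)
random_set = rng.random((n, n)) < 0.5                   # random set of density about 1/2
xs, ys = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
checkerboard = (xs + ys) % 2 == 0                       # structured set of density 1/2

corners_random = corner_density(random_set)
corners_checker = corner_density(checkerboard)
print(corners_random, random_set.mean() ** 3)
print(corners_checker, checkerboard.mean() ** 3)
```

The random set should give a proportion close to $\delta^3,$ whereas the checkerboard contains about twice as many corners as that, because every corner with even d that starts in the set lies entirely in it.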

A possible definition of quasirandom subsets of $[3]^n$

As with all the examples above, it is more convenient to give a definition for quasirandom functions. However, in this case it is not quite so obvious what should be meant by a balanced function.

Here, first, is a possible definition of a quasirandom function from $[2]^n\times [2]^n$ to $[-1,1].$ We say that f is c-quasirandom if $\mathbb{E}_{A,A',B,B'}f(A,B)f(A,B')f(A',B)f(A',B')\leq c.$ However, the expectation is not with respect to the uniform distribution over all quadruples (A,A',B,B') of subsets of $[n].$ Rather, we choose them as follows. (Several variants of what we write here are possible: it is not clear in advance what precise definition will be the most convenient to use.) First we randomly permute $[n]$ using a permutation $\pi$. Then we let A, A', B and B' be four random intervals in $\pi([n]),$ where we allow our intervals to wrap around mod n. (So, for example, a possible set A is $\{\pi(n-2),\pi(n-1),\pi(n),\pi(1),\pi(2)\}.$)
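
The sampling procedure can be sketched as follows. The text leaves the distribution of a "random interval" open, so this sketch assumes that the start point is uniform in $[n]$ and the length is uniform in $\{0,1,\dots,n\}.$ That choice, and the function names, are assumptions for illustration; other variants are equally possible.

```python
import random

def random_cyclic_interval(perm, rng):
    """A random interval of the sequence perm, allowed to wrap around its end.

    Assumption: start position uniform, length uniform in 0..n.
    """
    n = len(perm)
    start = rng.randrange(n)
    length = rng.randrange(n + 1)
    return frozenset(perm[(start + i) % n] for i in range(length))

def sample_quadruple(n, rng):
    """Sample (A, A', B, B') as four intervals in one random permutation of {1, ..., n}."""
    perm = list(range(1, n + 1))
    rng.shuffle(perm)   # the random permutation pi, shared by all four sets
    return tuple(random_cyclic_interval(perm, rng) for _ in range(4))

rng = random.Random(0)
A, A2, B, B2 = sample_quadruple(10, rng)
print(sorted(A), sorted(A2), sorted(B), sorted(B2))
```

The point of using a single permutation for all four sets is that the sets are strongly correlated: any two of them are either nested, disjoint, or overlap in an interval of the same ordering.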

As ever, it is easy to prove positivity. To apply this definition to subsets $\mathcal{A}$ of $[3]^n,$ define f(A,B) to be 0 if A and B intersect, $1-\delta$ if they are disjoint and the sequence x that is 1 on A, 2 on B and 3 elsewhere belongs to $\mathcal{A},$ and $-\delta$ otherwise. Here, $\delta$ is the probability that the sequence x corresponding to (A,B) belongs to $\mathcal{A}$ if we choose (A,B) randomly by taking two random intervals in a random permutation of $[n]$ (in other words, we take the marginal distribution of (A,B) from the distribution of the quadruple (A,A',B,B') above) and condition on their being disjoint. It follows from this definition that $\mathbb{E}f=0$ (since the expectation conditional on A and B being disjoint is 0 and f is zero whenever A and B intersect).
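
The correspondence between pairs (A,B) and sequences in $[3]^n,$ and the resulting function f, can be sketched as follows. The function names and the toy family are hypothetical, and $\delta$ is passed in as a parameter rather than computed, since its value depends on the chosen distribution of intervals.

```python
def sequence_from_sets(a, b, n):
    """Sequence in [3]^n that is 1 on A, 2 on B and 3 elsewhere (coordinates numbered 1..n)."""
    return tuple(1 if i in a else 2 if i in b else 3 for i in range(1, n + 1))

def f_value(a, b, n, family, delta):
    """Value f(A, B): 0 if A and B intersect, 1 - delta if the sequence is in the family, else -delta."""
    if a & b:
        return 0.0
    x = sequence_from_sets(a, b, n)
    return 1.0 - delta if x in family else -delta

# Hypothetical toy family of sequences in [3]^3, with an illustrative value of delta.
family = {(1, 2, 3), (3, 3, 3)}
delta = 0.25

print(f_value({1}, {2}, 3, family, delta))      # A, B disjoint, sequence (1, 2, 3) is in the family
print(f_value({2}, {1}, 3, family, delta))      # A, B disjoint, sequence (2, 1, 3) is not in the family
print(f_value({1}, {1, 2}, 3, family, delta))   # A and B intersect
```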

Nothing that one would really like to know about this definition has yet been fully established, though an argument that looks as though it might work has been proposed to show that if f is quasirandom in this sense then the expectation $\mathbb{E}f(A,B)f(A\cup D,B)f(A,B\cup D)$ is small (if the distribution on these "set-theoretic corners" is appropriately defined).