Find a good configuration of HAPs
Introduction
One of the difficulties of thinking about the Erdős discrepancy problem is that one can get hung up on HAPs with small common differences, when ultimately what should matter is the more typical HAPs, which have large common differences. This is less of a problem when one is looking at multiplicative sequences, since then all one cares about is partial sums. However, the strategy to be described on this page concerns a direct attack on the general problem.
The rough idea is this. Recall that a hypergraph H is a collection of subsets of some set X. We borrow graph terminology and call the elements of X vertices and the sets in H edges (or sometimes hyperedges if we want to make it clear that our graph is hyper). If H is a hypergraph with vertex set X and f is a function from X to {-1,1}, then the discrepancy of f is the maximum of [math]\displaystyle{ |\sum_{x\in A}f(x)| }[/math] over all edges A of H. The discrepancy of H is the minimum discrepancy of any function [math]\displaystyle{ f:X\to\{-1,1\} }[/math].
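To make the definitions concrete, here is a minimal computational sketch (the function names and the brute-force approach are illustrative choices of mine, not part of the problem): it computes the discrepancy of a given ±1 function with respect to a hypergraph, and the discrepancy of the hypergraph itself by minimising over all ±1 functions, which is of course only feasible for very small vertex sets.

<syntaxhighlight lang="python">
from itertools import product

def discrepancy_of_f(edges, f):
    """Discrepancy of a fixed +-1 function f (a dict vertex -> +-1):
    the maximum of |sum_{x in A} f(x)| over all edges A."""
    return max(abs(sum(f[x] for x in A)) for A in edges)

def discrepancy(vertices, edges):
    """Discrepancy of the hypergraph: the minimum, over all +-1
    functions on the vertex set, of the discrepancy of that function.
    Brute force, so only feasible for very small vertex sets."""
    vertices = list(vertices)
    best = float("inf")
    for signs in product((-1, 1), repeat=len(vertices)):
        f = dict(zip(vertices, signs))
        best = min(best, discrepancy_of_f(edges, f))
    return best
</syntaxhighlight>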
The Erdős discrepancy problem is asking us to prove that the discrepancy of the hypergraph whose vertex set is [math]\displaystyle{ \mathbb{N} }[/math] and whose edges are all finite HAPs is infinite. One obvious strategy for doing this is to identify within this hypergraph a collection of subhypergraphs [math]\displaystyle{ H_n }[/math] that have properties that allow us to prove arbitrarily good lower bounds for the discrepancy. What makes this a strategy, rather than a trivial reformulation of the problem (after all, one might suggest that there is no point in taking all HAPs), is that one would be aiming to find subhypergraphs of the HAP hypergraph that lent themselves particularly well to certain kinds of combinatorial arguments.
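Continuing the sketch above (and reusing its discrepancy function), one can generate the edges of the HAP hypergraph restricted to {1, ..., n} and check small cases directly; for instance, no ±1 sequence of length 12 has discrepancy 1, which a brute-force check of this kind confirms. The function name hap_edges is again just an illustrative choice.

<syntaxhighlight lang="python">
def hap_edges(n):
    """All finite HAPs {d, 2d, ..., kd} contained in {1, ..., n}."""
    edges = []
    for d in range(1, n + 1):
        for k in range(1, n // d + 1):
            edges.append([d * j for j in range(1, k + 1)])
    return edges

# Restricting the HAP hypergraph to {1, ..., 12} and minimising over all
# 2^12 sign sequences shows that discrepancy 2 is already unavoidable there.
print(discrepancy(range(1, 13), hap_edges(12)))  # expected output: 2
</syntaxhighlight>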
This point will be much clearer if we discuss an actual example of a subhypergraph that might do the trick. (I shall follow the usual practice, when talking informally about asymptotic arguments, of talking about "a subhypergraph" when strictly speaking what I mean is a sequence of larger and larger subhypergraphs. I shall picture a single subhypergraph that is very large.)
A collection of HAPs that might be useful
Let us pick three very large integers L, M and N, with N much larger than M and M much larger than L. Let us then take the set of all HAPs of length L that live inside the interval [N,N+M] and have common difference d such that dL is at most o(M). (This o(M) is shorthand for some convenient function we choose later.) I should make clear that what I am calling a HAP here is an arithmetic progression of the form (x,x+d,...,x+(L-1)d) such that d is a factor of x (so that if you continued the progression backwards it would include zero).
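Here is a sketch of how one might enumerate this collection for concrete (and, unavoidably, far smaller than intended) values of L, M and N. The parameter max_span stands in for whatever o(M) bound on dL we eventually choose, and the function name special_haps is not from the text; both are illustrative assumptions.

<syntaxhighlight lang="python">
def special_haps(N, M, L, max_span):
    """Length-L progressions (x, x+d, ..., x+(L-1)d) lying inside [N, N+M],
    whose common difference d divides x, and with d*L at most max_span
    (playing the role of the o(M) bound in the text)."""
    haps = []
    d = 1
    while L * d <= max_span:
        # x must be a multiple of d, at least N, with x + (L-1)d at most N + M
        x = ((N + d - 1) // d) * d          # smallest multiple of d that is >= N
        while x + (L - 1) * d <= N + M:
            haps.append([x + j * d for j in range(L)])
            x += d
        d += 1
    return haps

# Illustrative (and of course far too small) choices of the parameters:
example = special_haps(N=10**6, M=10**4, L=10, max_span=10**3)
print(len(example), example[0])
</syntaxhighlight>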
One might hope that if L, M and N are sufficiently large, then a general principle along the following lines should hold: almost all numbers behave like average numbers. An example of where this principle does hold is the number of prime factors. If n is a random number close to N, then not only does n have, on average, roughly log log N prime factors, but the Erdős-Kac theorem tells us that the number of prime factors is approximately normally distributed with mean log log N and standard deviation [math]\displaystyle{ \sqrt{\log\log N} }[/math]. Unfortunately, this suggests that the number of factors (as opposed to prime factors) is not concentrated: the number of divisors of a typical n is roughly [math]\displaystyle{ 2^{\omega(n)} }[/math], where [math]\displaystyle{ \omega(n) }[/math] is the number of distinct prime factors, so fluctuations of order [math]\displaystyle{ \sqrt{\log\log N} }[/math] in the exponent become large multiplicative fluctuations in the number of divisors. This could be a problem.
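The contrast between the two statistics is easy to see numerically. The following sketch (with illustrative, far-from-asymptotic choices of N and of the sample size) compares the spread of the number of distinct prime factors with the spread of the number of divisors for n near N; the helpers omega and num_divisors are naive implementations written just for this experiment.

<syntaxhighlight lang="python">
import math
from collections import Counter

def omega(n):
    """Number of distinct prime factors of n (trial division, fine for small n)."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

def num_divisors(n):
    """Number of divisors of n."""
    count = 0
    for i in range(1, math.isqrt(n) + 1):
        if n % i == 0:
            count += 2 if i * i != n else 1
    return count

N = 10**6                       # a small-scale stand-in for the "large N" of the text
sample = range(N, N + 10**4)
omegas = [omega(n) for n in sample]
divisors = [num_divisors(n) for n in sample]

print("log log N:", math.log(math.log(N)))            # about 2.63
print("mean omega:", sum(omegas) / len(omegas))       # of the same order as log log N
print("spread of omega:", Counter(omegas))            # concentrated on a few values
print("min/max divisor counts:", min(divisors), max(divisors))  # widely spread
</syntaxhighlight>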
A toy model of what is going on
Let us consider a measurable function f defined on the real interval [N,N+M]. (We could just as well take the interval [0,1], but it may be clearer to keep the same distance scale as we had in the integer case.) We ... to be continued.