The Erdős discrepancy problem
Introduction and statement of problem
The Erdős discrepancy problem is a problem that has been around since the 1930s and has resisted attack ever since. It is extremely easy to state, and a very appealing problem, so perhaps Polymath can succeed where individual mathematicians have failed. If so, then, given the notoriety of the problem, it would be a significant coup for the multiply collaborative approach to mathematics. But even if a full solution is not reached, preliminary investigations have already thrown up some interesting ideas, which could perhaps be developed into publishable results.
To state the problem, we first define a homogeneous arithmetic progression to be a set of positive integers of the form {d,2d,3d,...,nd}. (It is also reasonable to define it to be a set of non-negative integers of the form {0,d,2d,3d,...,nd}: at the time of writing it is not clear what the "official" definition should be.) Suppose that we have a sequence [math]\displaystyle{ x_1,x_2,x_3,\dots }[/math] of elements of the set [math]\displaystyle{ \{-1,1\} }[/math] (henceforth to be referred to as a [math]\displaystyle{ \pm 1 }[/math] sequence). We say that the discrepancy of the sequence [math]\displaystyle{ (x_n) }[/math] with respect to a set A of positive integers is [math]\displaystyle{ |\sum_{n\in A}x_n| }[/math]. The reason for this name is that we can think of the sequence as a red/blue-colouring of the positive integers, and the discrepancy with respect to A is then the difference between the number of red elements of A and the number of blue elements of A. The Erdős discrepancy problem is the following question.
Problem. Let [math]\displaystyle{ (x_n) }[/math] be a [math]\displaystyle{ \pm 1 }[/math] sequence. Must it be the case that for every constant C there exists a homogeneous arithmetic progression P such that the discrepancy of the sequence [math]\displaystyle{ (x_n) }[/math] with respect to P is at least C?
Some definitions and notational conventions
A homogeneous arithmetic progression, or HAP, is an arithmetic progression of the form {d,2d,3d,...,nd}.
The discrepancy of a sequence [math]\displaystyle{ (x_n) }[/math] is the supremum of [math]\displaystyle{ |\sum_{n\in P}x_n| }[/math] over all homogeneous arithmetic progressions P.
A sequence [math]\displaystyle{ (x_n) }[/math] has discrepancy at most φ(n) if for every natural number N the maximum value of [math]\displaystyle{ |\sum_{n\in P}x_n| }[/math] over all homogeneous arithmetic progressions P that are subsets of {1,2,...,N} is at most φ(N).
The EDP is the Erdős discrepancy problem. (This may conceivably be changed if enough people don't like it.)
A sequence [math]\displaystyle{ (x_n) }[/math] is completely multiplicative if [math]\displaystyle{ x_{mn}=x_mx_n }[/math] for any two positive integers m and n. We shall sometimes abbreviate this to "multiplicative", but the reader should be aware that the word "multiplicative" normally refers to the more general class of sequences such that [math]\displaystyle{ x_{mn}=x_mx_n }[/math] whenever m and n are coprime.
A [math]\displaystyle{ HAP-subsequence }[/math] of a sequence [math]\displaystyle{ (x_n) }[/math] is a subsequence of the form [math]\displaystyle{ x_d,x_{2d},x_{3d},\dots }[/math] for some d. If [math]\displaystyle{ (x_n) }[/math] is a multiplicative sequence, then every HAP-subsequence is equal to either [math]\displaystyle{ (x_n) }[/math] or [math]\displaystyle{ (-x_n) }[/math]. We shall call a sequence [math]\displaystyle{ weakly multiplicative }[/math] if it has only finitely many distinct HAP-subsequences. Let us also call a [math]\displaystyle{ \pm 1 }[/math] sequence quasi-multiplicative if it is a composition of a completely multiplicative function from [math]\displaystyle{ \mathbb{N} }[/math] to an Abelian group G with a function from G to {-1,1}. It can be shown that if [math]\displaystyle{ (x_n) }[/math] is a weakly multiplicative sequence, then it has an HAP-subsequence that is quasi-multiplicative.
Simple observations
- To answer the problem negatively, it is sufficient to find a completely multiplicative sequence (that is, a sequence [math]\displaystyle{ (x_n) }[/math] such that [math]\displaystyle{ x_{mn}=x_mx_n }[/math] for every m,n) such that its partial sums [math]\displaystyle{ x_1+\dots+x_n }[/math] are bounded. First mentioned in the proof of the theorem in this post.
- The function that takes n to 1 if the last non-zero digit of n in its ternary representation is 1 and -1 if the last non-zero digit is 2 is completely multiplicative and the partial sum up to n is easily shown to be at most [math]\displaystyle{ \log_3n +1 }[/math]. Therefore, the rate at which the worst discrepancy grows, as a function of the length of the homogeneous progression, can be as slow as logarithmic. [1]
It turns out that the base can be made significantly higher than 3, so this example is not best possible.Correction
- To answer the problem negatively, it is also sufficient to find a completely multiplicative function f from [math]\displaystyle{ \mathbb{N} }[/math] to a finite Abelian group G (meaning that f(mn)=f(m)+f(n) for every m,n) and a function h from G to {-1,1} such that for each g in G there exists d with f(d)=g and with the partial sums hf(d)+hf(2d)+...+hf(nd) bounded. In that case, one can set [math]\displaystyle{ x_n=hf(n) }[/math]. (Though this observation is simple with the benefit of hindsight, it might not have been made had it not been for the fact that this sort of structure was identified in a long sequence that had been produced experimentally). First mentioned here. Se also this post and this comment.
- It can be shown that if a [math]\displaystyle{ \pm 1 }[/math] sequence [math]\displaystyle{ (x_n) }[/math] starts with 1 and has only finitely many distinct subsequences of the form [math]\displaystyle{ (x_d,x_{2d},x_{3d},\dots) }[/math], then it must arise in the above way.
- The problem for the positive integers is equivalent to the problem for the positive rationals. [2](Sketch proof: if you have a function f from the positive integers to {-1,1} that works, then define [math]\displaystyle{ f_q(x) }[/math] to be f(q!x) whenever q!x is an integer. Then take a pointwise limit of a subsequence of the functions [math]\displaystyle{ f_q }[/math].)
Experimental evidence
A nice feature of the problem is that it lends itself well to investigation on a computer. Although the number of [math]\displaystyle{ \pm 1 }[/math] sequences grows exponentially, the constraints on these sequences are such that depth-first searches can lead quite quickly to interesting examples. For instance, at the time of writing they have yielded a sequence of length 1124 with discrepancy 2. See this page for more details about this and for further links.
Annoted Bibliography
- Mathias, A. R. D. On a conjecture of Erdős and Čudakov. Combinatorics, geometry and probability (Cambridge, 1993).
This is a short paper establishing the maximal length of sequence for the case where C=1 is 11, and is the starting point for our experimental studies.
- Borwein, Peter, and Choi, Stephen. A variant of Liouville's lambda function: some surprizing formulae.
Explicit formulas for the discrepancy of some completely multiplicative functions whose discrepancy is logarithmic. A typical example is: Let [math]\displaystyle{ \lambda_3(n) = (-1)^{\omega_3(n)} }[/math] where [math]\displaystyle{ \omega_3(n) }[/math] is the number of distinct prime factors congruent to [math]\displaystyle{ -1 \bmod 3 }[/math] in [math]\displaystyle{ n }[/math] (with multiple factors counted multiply). Then the discrepancy of the [math]\displaystyle{ \lambda_3 }[/math] sequence up to [math]\displaystyle{ N }[/math] is exactly the number of 1's in the base three expansion of [math]\displaystyle{ N }[/math].
- Vijay, Sujith. On the discrepancy of quasi-progressions. Electronic Journal of Combinatorics, R104 of Volume 15(1), 2008.
A quasi-progression is a sequence of the form [math]\displaystyle{ \lfloor k \alpha \rfloor }[/math], with [math]\displaystyle{ 1\leq k \leq K }[/math] for some K. The main theorem of interest is: If the integers from 0 to n are 2-coloured, there exists a quasi-progression that has discrepancy at least [math]\displaystyle{ (1/50)n^{1/6} }[/math].
- Granville and Soundararajan. Multiplicative functions in arithmetic progressions
In this blog post Terry Tao wrote: "Granville and Soundararajan have made some study of the discrepancy of bounded multiplicative functions. The situation is remarkably delicate and number-theoretical (and is closely tied with the Granville-Soundararajan theory of pretentious characters)."
- As this web page reveals, the Erdős discrepancy problem was a $500-dollar problem of Erdős, so it is clear that he regarded it as pretty hard.