The Erdős discrepancy problem: Difference between revisions

Latest revision as of 10:31, 21 September 2015

If you want to you can jump straight to the main experimental results page.

Introduction and statement of problem

Let [math]\displaystyle{ (x_n) }[/math] be a [math]\displaystyle{ \scriptstyle \pm 1 }[/math] sequence and let [math]\displaystyle{ C }[/math] be a constant. Must there exist positive integers [math]\displaystyle{ d,k }[/math] such that

[math]\displaystyle{ \left| \sum_{i=1}^k x_{id} \right| \gt C }[/math]?

Known colloquially as "The Erdős discrepancy problem", this question has remained unanswered since the 1930s (Erdős, 1957) and Erdős offered $500 for an answer. It was also asked by Chudakov (1956). It is extremely easy to state, and a very appealing problem, so perhaps Polymath can succeed where individual mathematicians have failed. If so, then, given the notoriety of the problem, it would be a significant coup for the multiply collaborative approach to mathematics. But even if a full solution is not reached, preliminary investigations have already thrown up some interesting ideas which could perhaps be developed into publishable results.

It seems likely that the answer to Erdős's question is yes. If it is, then an easy compactness argument tells us that for every [math]\displaystyle{ C }[/math] there exists [math]\displaystyle{ n }[/math] such that for every [math]\displaystyle{ \scriptstyle \pm 1 }[/math] sequence of length [math]\displaystyle{ n }[/math] there exist [math]\displaystyle{ d,k }[/math] with [math]\displaystyle{ dk\leq n }[/math] such that [math]\displaystyle{ \left| \sum_{i=1}^k x_{id} \right| \gt C }[/math]. In view of this, we make some definitions that allow one to talk about the dependence between [math]\displaystyle{ C }[/math] and [math]\displaystyle{ n }[/math]. For any finite set [math]\displaystyle{ A }[/math] of integers, we define the error on [math]\displaystyle{ A }[/math] by

[math]\displaystyle{ E(A) := \sum_{a\in A} x_a }[/math],

and for a set [math]\displaystyle{ \mathcal{A} }[/math] of finite sets of integers, we define the discrepancy

[math]\displaystyle{ \delta(\mathcal{A},x) := \sup_{A\in \mathcal{A}} |E(A)|. }[/math]

We can think of the values taken by the sequence [math]\displaystyle{ (x_n) }[/math] as a red/blue colouring of the integers that tries to make the number of reds and blues in each [math]\displaystyle{ \scriptstyle A\in\mathcal{A} }[/math] as equal as possible. The discrepancy measures the extent to which the sequence fails in this attempt. Taking [math]\displaystyle{ \scriptstyle \mathcal{HAP}(N) }[/math] to be the set of homogeneous arithmetic progressions [math]\displaystyle{ \{d, 2d, 3d, ..., nd\} }[/math] contained in [math]\displaystyle{ \{1, 2, ..., N\} }[/math], we can restate the question as whether [math]\displaystyle{ \scriptstyle \delta(\mathcal{HAP}(N),x) \to \infty }[/math] for every sequence [math]\displaystyle{ x }[/math].

Two related questions have already been solved in the literature. Letting [math]\displaystyle{ \scriptstyle \mathcal{AP}(N) }[/math] be the collection of all (not necessarily homogeneous) arithmetic progressions in [math]\displaystyle{ \{1, 2, ..., N\} }[/math], Roth (1964) proved that [math]\displaystyle{ \scriptstyle \delta(\mathcal{AP}(N),x) \geq c n^{1/4} }[/math], independent of the sequence [math]\displaystyle{ x }[/math]. Letting [math]\displaystyle{ \scriptstyle \mathcal{HQAP}(N) }[/math] be the collection of all homogeneous quasi-arithmetic progressions [math]\displaystyle{ \scriptstyle\{\lfloor \alpha \rfloor,\lfloor 2\alpha \rfloor,\dots,\lfloor k\alpha \rfloor\} }[/math] contained in [math]\displaystyle{ \{1, 2, ..., N\} }[/math], Vijay (2008) proved that [math]\displaystyle{ \scriptstyle \delta(\mathcal{HQAP}(N),x) \geq 0.02 n^{1/6} }[/math], independent of the sequence [math]\displaystyle{ x }[/math].

Some definitions and notational conventions

A homogeneous arithmetic progression, or HAP, is an arithmetic progression of the form [math]\displaystyle{ \{d,2d,3d,...,nd\} }[/math].

When [math]\displaystyle{ x }[/math] is clear from context, we write [math]\displaystyle{ \delta(N) }[/math] in place of [math]\displaystyle{ \delta(\mathcal{HAP}(N),x) }[/math].

The discrepancy of a sequence [math]\displaystyle{ (x_n) }[/math] is the supremum of [math]\displaystyle{ |\sum_{n\in P}x_n| }[/math] over all homogeneous arithmetic progressions P. (Strictly speaking, we should call this something like the HAP-discrepancy, but since we will almost always be talking about HAPs, we adopt the convention that "discrepancy" always means "HAP-discrepancy" unless it is stated otherwise.)

A sequence [math]\displaystyle{ (x_n) }[/math] has discrepancy at most φ(n) if for every natural number N the maximum value of [math]\displaystyle{ |\sum_{n\in P}x_n| }[/math] over all homogeneous arithmetic progressions P that are subsets of {1,2,...,N} is at most φ(N).

The EDP is the Erdős discrepancy problem.

A sequence [math]\displaystyle{ (x_n) }[/math] is completely multiplicative if [math]\displaystyle{ x_{mn}=x_mx_n }[/math] for any two positive integers m and n. We shall sometimes abbreviate this to "multiplicative", but the reader should be aware that the word "multiplicative" normally refers to the more general class of sequences such that [math]\displaystyle{ x_{mn}=x_mx_n }[/math] whenever m and n are coprime. A completely multiplicative function is determined by the values it takes at primes. The Liouville function λ is the unique completely multiplicative function that takes the value -1 at every prime: if the prime factorization of n is [math]\displaystyle{ \prod p_i^{a_i} }[/math] then λ(n) equals [math]\displaystyle{ (-1)^{\sum a_i} }[/math]. Another important multiplicative function is [math]\displaystyle{ \mu_3 }[/math], the multiplicative function that’s -1 at 3, 1 at numbers that are 1 mod 3 and -1 at numbers that are 2 mod 3. This function has mean-square partial sums which grow logarithmically, see this calculation.

A HAP-subsequence of a sequence [math]\displaystyle{ (x_n) }[/math] is a subsequence of the form [math]\displaystyle{ x_d,x_{2d},x_{3d},... }[/math] for some d. If [math]\displaystyle{ (x_n) }[/math] is a multiplicative [math]\displaystyle{ \pm 1 }[/math] sequence, then every HAP-subsequence is equal to either [math]\displaystyle{ (x_n) }[/math] or [math]\displaystyle{ (-x_n) }[/math]. We shall call a sequence weakly multiplicative if it has only finitely many distinct HAP-subsequences. Let us also call a [math]\displaystyle{ \pm 1 }[/math] sequence quasi-multiplicative if it is a composition of a completely multiplicative function from [math]\displaystyle{ \mathbb{N} }[/math] to an Abelian group G with a function from G to {-1,1} (This definition is too general. See this and the next comment). It can be shown that if [math]\displaystyle{ (x_n) }[/math] is a weakly multiplicative sequence, then it has an HAP-subsequence that is quasi-multiplicative.

It is convenient to define the maps [math]\displaystyle{ T_k }[/math] for [math]\displaystyle{ k \geq 1 }[/math] by [math]\displaystyle{ T_k(x)_n = x_{kn} }[/math], where [math]\displaystyle{ x = (x_1, x_2, ...) }[/math] is a sequence.

Simple observations

To answer the problem negatively, it is sufficient to find a completely multiplicative sequence (that is, a sequence [math]\displaystyle{ (x_n) }[/math] such that [math]\displaystyle{ x_{mn}=x_mx_n }[/math] for every m,n) such that its partial sums [math]\displaystyle{ x_1+\dots+x_n }[/math] are bounded. First mentioned in the proof of the theorem in this post.

The function that takes n to 1 if the last non-zero digit of n in its ternary representation is 1 and -1 if the last non-zero digit is 2 is completely multiplicative and the partial sum up to n is easily shown to be at most [math]\displaystyle{ \log_3n +1 }[/math]. Therefore, the rate at which the worst discrepancy grows, as a function of the length of the homogeneous progression, can be as slow as logarithmic. More examples of this sort are here. [1] ~~It turns out that the base can be made significantly higher than 3, so this example is not best possible.~~ Correction

To answer the problem negatively, it is also sufficient to find a completely multiplicative function f from [math]\displaystyle{ \mathbb{N} }[/math] to a finite Abelian group G (meaning that f(mn)=f(m)+f(n) for every m,n) and a function h from G to {-1,1} such that for each g in G there exists d with f(d)=g and with the partial sums hf(d)+hf(2d)+...+hf(nd) bounded. In that case, one can set [math]\displaystyle{ x_n=hf(n) }[/math]. (Though this observation is simple with the benefit of hindsight, it might not have been made had it not been for the fact that this sort of structure was identified in a long sequence that had been produced experimentally). First mentioned here. Se also this post and this comment.

It can be shown that if a [math]\displaystyle{ \pm 1 }[/math] sequence [math]\displaystyle{ (x_n) }[/math] starts with 1 and has only finitely many distinct subsequences of the form [math]\displaystyle{ (x_d,x_{2d},x_{3d},\dots) }[/math], then it must have a subsequence that arises in the above way. (This was mentioned just above in the terminology section.)

The problem for the positive integers is equivalent to the problem for the positive rationals. [2](Sketch proof: if you have a function f from the positive integers to {-1,1} that works, then define [math]\displaystyle{ f_q(x) }[/math] to be f(q!x) whenever q!x is an integer. Then take a pointwise limit of a subsequence of the functions [math]\displaystyle{ f_q }[/math]. Click here for a more detailed discussion of this construction and what it is good for.

Define the drift of a sequence to be the maximal value of |f(md)+...+f(nd)|. The discrepancy problem is equivalent to showing that the drift is always infinite. It is obvious that it is at least 2 (because |f(2)+f(4)|, |f(2)+f(3)|, |f(3)+f(4)| cannot all be at most 1); one can show that it is at least 3 (which implies as a corollary that the discrepancy is at least 2). One can also show that the upper and lower discrepancy are each at least 2.

One can formulate the problem using topological dynamics.

Using Fourier analysis, one can reduce the problem to one about completely multiplicative functions.

Here is an algorithm for finding multiplicative sequences with bounded discrepancy.

Here is a page whose aim is to record a human proof that completely multiplicative sequences have discrepancy at least 2.

Here is a page whose aim is to record a Human proof that completely multiplicative sequences have discrepancy greater than 3

Here is an argument that shows that bounded discrepancy multiplicative functions do not correlate with characters.

The answer to a function field version of the problem seems to be negative.

The problem can be generalized to pseudointegers. Here we have found problems similar to EDP with a negative answer.

Experimental evidence

Main page: Experimental results

A nice feature of the problem is that it lends itself well to investigation on a computer. Although the number of [math]\displaystyle{ \pm 1 }[/math] sequences grows exponentially, the constraints on these sequences are such that depth-first searches can lead quite quickly to interesting examples. For instance, at the time of writing they have yielded several examples of sequences of length 1124 with discrepancy 2. (This is the current record.) See this page for more details and further links.

Another sort of evidence is to bound the discrepancy for specific sequences. For example, setting [math]\displaystyle{ x_1=-x_2=1 }[/math] and [math]\displaystyle{ x_n=-x_{n/3} }[/math] or [math]\displaystyle{ x_n=x_{n-3} }[/math] according to whether [math]\displaystyle{ n }[/math] is a multiple of 3 or not, yields a sequence with

[math]\displaystyle{ \delta(N) \leq \log_9(N)+1 }[/math]

for sufficiently large [math]\displaystyle{ N }[/math]. Currently, this is the record-holder for slow growing discrepancy.

The Thue-Morse sequence has discrepancy [math]\displaystyle{ \delta(N) \gg N^{\log_4(3)} }[/math].

A random sequence (each [math]\displaystyle{ x_i }[/math] chosen independently) has discrepancy at least (asymptotically) [math]\displaystyle{ \sqrt{2N \log\log N} }[/math] by the law of the iterated logarithm.

Determining if the discrepancy of Liouville's lambda function is [math]\displaystyle{ O(n^{1/2+\epsilon}) }[/math] is equivalent to solving the Riemann hypothesis. However, this growth cannot be bounded above by [math]\displaystyle{ n^{1/2-\epsilon} }[/math] for any positive ε.

Interesting subquestions

Given the length of time that the Erdős discrepancy problem has been open, the chances that it will be solved by Polymath5 are necessarily small. However, there are a number of interesting questions that we do not know the answers to, several of which have arisen naturally from the experimental evidence. Good answers to some of these would certainly constitute publishable results. Here is a partial list -- further additions are very welcome. (When the more theoretical part of the project starts, this section will probably grow substantially and become a separate page.)

Is there an infinite sequence of discrepancy 2? (Given how long a finite sequence can be, it seems unlikely that we could answer this question just by a clever search of all possibilities on a computer.)

The long sequences of low discrepancy discovered by computer all have some kind of approximate weak multiplicativity. If we take a hypothetical counterexample [math]\displaystyle{ (x_n) }[/math] (which we could even assume has discrepancy 2), can we prove that it, or some other counterexample derived from it by passing to HAP-subsequences and taking pointwise limits, has some kind of interesting multiplicative structure?

Closely related to the previous question. If [math]\displaystyle{ (x_n) }[/math] is a hypothetical counterexample, must it satisfy a compactness property of the following kind: for every positive c there exists M such that if you take any M HAP-subsequences, then there must be two of them, [math]\displaystyle{ (y_n) }[/math] and [math]\displaystyle{ (z_n) }[/math], such that [math]\displaystyle{ \lim\inf N^{-1}\sum_{n=1}^Ny_nz_n\geq 1-c }[/math]?

An even weaker question. If [math]\displaystyle{ (x_n) }[/math] is a hypothetical counterexample, must it have two HAP-subsequences [math]\displaystyle{ (y_n) }[/math] and [math]\displaystyle{ (z_n) }[/math] such that the lim inf of [math]\displaystyle{ N^{-1}\sum_{n=1}^N y_nz_n }[/math] is greater than 0?

A similar question, perhaps equivalent to the previous one (this should be fairly easy to check). Given a sequence [math]\displaystyle{ (x_n) }[/math] define f(N) to be the average of [math]\displaystyle{ x_ax_bx_cx_d }[/math] over all quadruples (a,b,c,d) such that ab=cd and [math]\displaystyle{ a,b,c,d\leq N }[/math]. If [math]\displaystyle{ (x_n) }[/math] is a counterexample, does that imply that [math]\displaystyle{ \lim\inf f(N) }[/math] is greater than 0? See here for a computation of this average for the first 1124 sequence.

Is there any completely multiplicative counterexample? (This may turn out to be a very hard question. If so, then answering the previous questions could turn out to be the best we can hope for without making a substantial breakthrough in analytic number theory.)

Does there exist a [math]\displaystyle{ \pm 1 }[/math] sequence such that the corresponding discrepancy function grows at a rate that is slower than [math]\displaystyle{ c\log n }[/math] for any positive c?

Does the number of sequences of length n and discrepancy at most C grow exponentially with n or more slowly than exponentially? (Obviously if the conjecture is true then it must be zero for large enough n, but the hope is that this question is a more realistic initial target.)

General proof strategies

This section contains links to other pages in which potential approaches to solving the problem are described.

First obtain multiplicative structure and then obtain a contradiction

Find a different parameter, show that it tends to infinity, and show that that implies that the discrepancy tends to infinity

Prove the result for shifted HAPs instead of HAPs

Find a good configuration of HAPs

Generalize to a graph-theoretic formulation

Representation of the diagonal

Bounding the discrepancy in terms of the common difference

Annotated Bibliography

Blog Discussion on Gowers's Weblog
- Zeroth Post (Dec 17 2009 - Jan 6 2010).
- First Post (Jan 6 - Jan 12).
- Second Post (Jan 9 - Jan 11).
- Third Post (Jan 11 - Jan 14).
- Fourth Post (Jan 14 - 16).
- Fifth Post (Jan 16-19).
- First Theoretical Post (Jan 19-21)
- Second Theoretical Post (Jan 21-26)
- Third Theoretical Post (Jan 26 -?)
- Fourth Theoretical Post (Jan 30- Feb 2)
- Fifth Theoretical Post (Feb 2 - Feb 5)
- Sixth Theoretical Post (Feb 5- Feb 8)
- Seventh Theoretical Post (Feb 8 - 19)
- Eighth Theoretical Post (Feb 19-Feb 24)
- Ninth Theoretical Most (Feb 24-Mar 2)
- Tenth Theoretical Post (Mar 2-Mar 7)
- Eleventh Theoretical Post (Mar 7-Mar 12)
- Twelfth Theoretical Post (Mar 13-Mar 22)
- Thirteenth Theoretical Post (Mar 23-Apr 24)
- Fourteenth Theoretical Post (Apr 25-Jun 20)
- Fifteenth Theoretical Post (Jun 21-Jul 3)
- Sixteenth theoretical Post (Jul 4-Jul 17)
- Seventeenth Theoretical Post (Jul 18-Sep 2)
- Eighteenth Theoretical Post (Sep 3-Sep 5)
- Nineteenth Theoretical Post (Sep 6-Sep 9)
- Twentieth Theoretical Post (Sep 10-Sep 20)
- Twenty-first Theoretical Post (Sep 21 - Aug 21, 2012)
- A possible revival (Aug 22)
- First guest post from Gil Kalai (Aug 22-26)
- Second guest post from Gil Kalai (Aug 27-30)
- An attempt to get back into the diagonal decomposition approach (Aug 30-Sep 3)
- Third guest post from Gil Kalai (Sep 4-Sep 5)
- Three generalizations (Sep 6-Sep 18)
- The modular version of Roth's AP discrepancy theorem (Sep 19-)

Several questions on MathOverflow have been given the `polymath5' tag.

Mathias, A. R. D. On a conjecture of Erdős and Čudakov. Combinatorics, geometry and probability (Cambridge, 1993).

This one page paper establishes that the maximal length of sequence for the case where C=1 is 11, and is the starting point for our experimental studies.

Tchudakoff, N. G. Theory of the characters of number semigroups. J. Indian Math. Soc. (N.S.) 20 (1956), 11--15. MR0083515 (18,719e)

The Mathias paper states that one of Erdős's problem lists states that this paper "studies related questions". The Math Review for this paper states that it summarizes results from seven other papers that are in Russian. The Math Review leaves the impression that the topic concerns characters of modulus 1.

Borwein, Peter, and Choi, Stephen. A variant of Liouville's lambda function: some surprizing formulae.

Explicit formulas for the discrepancy of some completely multiplicative functions whose discrepancy is logarithmic. A typical example is: Let [math]\displaystyle{ \lambda_3(n) = (-1)^{\omega_3(n)} }[/math] where [math]\displaystyle{ \omega_3(n) }[/math] is the number of distinct prime factors congruent to [math]\displaystyle{ -1 \bmod 3 }[/math] in [math]\displaystyle{ n }[/math] (with multiple factors counted multiply). Then the discrepancy of the [math]\displaystyle{ \lambda_3 }[/math] sequence up to [math]\displaystyle{ N }[/math] is exactly the number of 1's in the base three expansion of [math]\displaystyle{ N }[/math]. (This turns out not to be so surprising after all, since [math]\displaystyle{ \lambda_3(n) }[/math] is precisely the same as the ternary function defined in the second item of the Simple Observations section.)

Borwein, Peter, Choi, Stephen and Coons, Michael. Completely multiplicative functions taking values in {-1,1}. The published version of the above paper, which explains the results and proofs more clearly than the preprint, and is more explicit about the relationship with Erdős's question.

Granville and Soundararajan. Multiplicative functions in arithmetic progressions

In this blog post Terry Tao wrote: "Granville and Soundararajan have made some study of the discrepancy of bounded multiplicative functions. The situation is remarkably delicate and number-theoretical (and is closely tied with the Granville-Soundararajan theory of pretentious characters)."

Wirsing, E. Das asymptotische Verhalten von Summen über multiplikative Funktionen. II. (German) [The asymptotic behavior of sums of multiplicative functions. II.] Acta Math. Acad. Sci. Hungar. 18 1967 411--467. MR0223318 (36 #6366) (partial) English translation

Newman, D. J. On the number of binary digits in a multiple of three. Proc Amer Math Soc. 21(1969): 719--721.

Proves that the Thue-Morse sequence has discrepancy [math]\displaystyle{ \gg N^{\log_4(3)} }[/math]

Halász, G. On random multiplicative functions. Hubert Delange colloquium (Orsay, 1982), 74–96, Publ. Math. Orsay, 83-4, Univ. Paris XI, Orsay, 1983.

The discrepancy of a random multiplicative function is close to [math]\displaystyle{ \sqrt{N} }[/math].

Problem lists on which this problem appears

Erdős, P. and Graham, R. Old and New Problems and Results in Combinatorial Number Theory: van der Waerden's Theorem and Related Topics, L'Enseignement Math. 25 (1979), 325-344.

Our problem is mentioned at the bottom of page 331, where they indicate knowledge of a coloring with logarithmic discrepancy. On pages 330-1, they review work on the non-homogeneous problem.

Erdős, Paul, Some of my favourite unsolved problems. A tribute to Paul Erdős, 467--478, Cambridge Univ. Press, Cambridge, 1990. MR1117038 (92f:11003)

"Let [math]\displaystyle{ f(n)=\pm1 }[/math]. Is it true that for every [math]\displaystyle{ c }[/math] there is a [math]\displaystyle{ d }[/math] and an [math]\displaystyle{ m }[/math] so that

[math]\displaystyle{ \left| \sum_{k=1}^m f(kd) \right| \gt c }[/math]?

I will pay $500 for an answer. Let [math]\displaystyle{ f(n)=\pm1 }[/math] and [math]\displaystyle{ f(ab)=f(a)f(b) }[/math]. Is it then true that [math]\displaystyle{ \left| \sum_{k=1}^m f(kd) \right| }[/math] cannot be bounded?"

P. Erdős. Some unsolved problems, Michigan Math. J. 4 (1957), 291--300 MR20 #5157; Zentralblatt 81,1.

Problem 9 of this paper is ours. Erdős dates the problem to around 1932, and notes what we know about Liouville's function (lower and upper bound).

Finch, S. Two-colorings of positive integers. Dated May 27, 2008.

Summarizes knowledge of discrepancy of two colorings when the discrepancy is restricted to homogeneous progressions, nonhomogeneous progressions, and homogeneous quasi-progressions. Contains bibliography with 17 entries, including most of those above.

non-homogeneous AP, quasi-AP, and other related discrepancy papers

Hochberg, Robert. Large Discrepancy In Homogenous Quasi-Arithmetic Progressions. Combinatorica, Volume 26 (2006), Number 1.

Roth's method for the [math]\displaystyle{ n^{1/4} }[/math] lower bound on nonhomogeneous AP discrepancy is adapted to give a lower bound for homogeneous quasi-AP discrepancy. This result is weaker than the Vijay result, but uses a different method.

Vijay, Sujith. On the discrepancy of quasi-progressions. Electronic Journal of Combinatorics, R104 of Volume 15(1), 2008.

A quasi-progression is a sequence of the form [math]\displaystyle{ \lfloor k \alpha \rfloor }[/math], with [math]\displaystyle{ 1\leq k \leq K }[/math] for some K. The main theorem of interest is: If the integers from 0 to n are 2-coloured, there exists a quasi-progression that has discrepancy at least [math]\displaystyle{ (1/50)n^{1/6} }[/math].

Doerr, Benjamin; Srivastav, Anand; Wehr, Petra. Discrepancy of Cartesian products of arithmetic progressions. Electron. J. Combin. 11 (2004), no. 1, Research Paper 5, 16 pp. (electronic).

Abstract: We determine the combinatorial discrepancy of the hypergraph H of cartesian products of d arithmetic progressions in the [math]\displaystyle{ [N]^d }[/math] –lattice. The study of such higher dimensional arithmetic progressions is motivated by a multi-dimensional version of van derWaerden’s theorem, namely the Gallai-theorem (1933). We solve the discrepancy problem for d–dimensional arithmetic progressions by proving [math]\displaystyle{ disc(H) = \Theta(N^{d/4}) }[/math] for every fixed positive integer d. This extends the famous lower bound of [math]\displaystyle{ \Omega(N^{1/4}) }[/math] of Roth (1964) and the matching upper bound [math]\displaystyle{ O(N^{1/4}) }[/math] of Matoušek and Spencer (1996) from d = 1 to arbitrary, fixed d. To establish the lower bound we use harmonic analysis on locally compact abelian groups. For the upper bound a product coloring arising from the theorem of Matoušek and Spencer is sufficient. We also regard some special cases, e.g., symmetric arithmetic progressions and infinite arithmetic progressions.

Cilleruelo, Javier; Hebbinghaus, Nils Discrepancy in generalized arithmetic progressions. European J. Combin. 30 (2009), no. 7, 1607--1611. MR 2547932

Valkó, Benedek. [Discrepancy of arithmetic progressions in higher dimensions]. (English summary)

J. Number Theory 92 (2002), no. 1, 117--130. MR1880588 (2003b:11071)

Abstract: K. F. Roth (1964, Acta. Arith.9, 257–260) proved that the discrepancy of arithmetic progressions contained in [N] is at least [math]\displaystyle{ cN^{1/4} }[/math], and later it was proved that this result is sharp. We consider the d-dimensional version of this problem. We give a lower estimate for the discrepancy of arithmetic progressions on [math]\displaystyle{ [N]^d }[/math] and prove that this result is nearly sharp. We use our results to give an upper estimate for the discrepancy of lines on an N×N lattice, and we also give an estimate for the discrepancy of a related random hypergraph.

Roth, K. F., Remark concerning integer sequences. Acta Arith. 9 1964 257--260. MR0168545 (29 #5806)

Shows that the discrepancy on APs is at least [math]\displaystyle{ cN^{1/4} }[/math].

T. Gowers, Erdős and Arithmetic Progressions, in Erdos and arithmetic progressions, in Erdos Centennial, Bolyai Society Mathematical Studies, 25, L. Lovasz, I. Z. Ruzsa, V. T. Sos eds., Springer 2013, pp. 265-287.

A survey of the Polymath5 project.

The Erdős discrepancy problem: Difference between revisions

Latest revision as of 10:31, 21 September 2015

Contents

Introduction and statement of problem

Some definitions and notational conventions

Simple observations

Experimental evidence

Interesting subquestions

General proof strategies

Annotated Bibliography

Problem lists on which this problem appears

non-homogeneous AP, quasi-AP, and other related discrepancy papers

Navigation menu

Page actions

Page actions

Personal tools

Navigation

Search

Tools

@@ Line 3: / Line 3: @@
 ==Introduction and statement of problem==
-Is there a <math>\scriptstyle \pm 1</math> sequence <math>(x_n)</math> and constant <math>C</math>, such that for all positive integers <math> d,k </math>
+Let <math>(x_n)</math> be a <math>\scriptstyle \pm 1</math> sequence and let <math>C</math> be a constant. Must there exist positive integers <math> d,k </math> such that
-:<math> \left| \sum_{i=1}^k x_{id} \right| \leq C </math>?
+:<math> \left| \sum_{i=1}^k x_{id} \right| > C </math>?
-Known colloquially as "The Erd&#337;s discrepancy problem", this question has remained unanswered since the 1930s [citation needed]. It was also asked by Chudakov [again a citation needed]. It is extremely easy to state, and a very appealing problem, so perhaps Polymath can succeed where individual mathematicians have failed. If so, then, given the notoriety of the problem, it would be a significant coup for the multiply collaborative approach to mathematics. But even if a full solution is not reached, preliminary investigations have already thrown up some interesting ideas which could perhaps be developed into publishable results.
+Known colloquially as "The Erd&#337;s discrepancy problem", this question has remained unanswered since the 1930s (Erd&#337;s, 1957) and Erd&#337;s offered $500 for an answer. It was also asked by Chudakov (1956). It is extremely easy to state, and a very appealing problem, so perhaps Polymath can succeed where individual mathematicians have failed. If so, then, given the notoriety of the problem, it would be a significant coup for the multiply collaborative approach to mathematics. But even if a full solution is not reached, preliminary investigations have already thrown up some interesting ideas which could perhaps be developed into publishable results.
-Anticipating a negative answer to Erd&#337;s's question, we now quantify the problem. For any finite set <math>A</math> of integers, we define the error <math>A</math> by
+It seems likely that the answer to Erd&#337;s's question is yes. If it is, then an easy compactness argument tells us that for every <math>C</math> there exists <math>n</math> such that for every <math>\scriptstyle \pm 1</math> sequence of length <math>n</math> there exist <math>d,k</math> with <math>dk\leq n</math> such that <math> \left| \sum_{i=1}^k x_{id} \right| > C </math>. In view of this, we make some definitions that allow one to talk about the dependence between <math>C</math> and <math>n</math>. For any finite set <math>A</math> of integers, we define the <em>error</em> on <math>A</math> by
 :<math> E(A) := \sum_{a\in A} x_a </math>,
-and for a set <math>\mathcal{A}</math> of finite sets of integers, we define the discrepancy
+and for a set <math>\mathcal{A}</math> of finite sets of integers, we define the <em>discrepancy</em>
 :<math> \delta(\mathcal{A},x) := \sup_{A\in \mathcal{A}} |E(A)|. </math>
 We can think of the values taken by the sequence <math>(x_n)</math> as a red/blue colouring of the integers that tries to make the number of reds and blues in each <math>\scriptstyle A\in\mathcal{A}</math> as equal as possible. The discrepancy measures the extent to which the sequence fails in this attempt. Taking <math>\scriptstyle \mathcal{HAP}(N)</math> to be the set of <em>homogeneous arithmetic progressions</em> <math>\{d, 2d, 3d, ..., nd\}</math> contained in <math>\{1, 2, ..., N\}</math>, we can restate the question as whether <math>\scriptstyle \delta(\mathcal{HAP}(N),x) \to \infty</math> for every sequence <math>x</math>.
-Two related questions have already been solved in the literature. Letting <math>\scriptstyle \mathcal{AP}(N) </math> be the collection of all (not necessarily homogeneous) arithmetic progressions in <math>\{1, 2, ..., N\}</math>, Roth [citation needed] proved that <math> \scriptstyle \delta(\mathcal{AP}(N),x) \geq c n^{1/4}</math>, independent of the sequence <math>x</math>. Letting <math> \scriptstyle \mathcal{HQAP}(N) </math> be the collection of all homogeneous quasi-arithmetic progressions <math>\scriptstyle\{\lfloor \alpha \rfloor,\lfloor 2\alpha \rfloor,\dots,\lfloor k\alpha \rfloor\}</math> contained in <math>\{1, 2, ..., N\}</math>, Vijay [citation needed] proved that <math>\scriptstyle \delta(\mathcal{HQAP}(N),x) \geq 0.02 n^{1/6}</math>, independent of the sequence <math>x</math>.
+Two related questions have already been solved in the literature. Letting <math>\scriptstyle \mathcal{AP}(N) </math> be the collection of all (not necessarily homogeneous) arithmetic progressions in <math>\{1, 2, ..., N\}</math>, Roth (1964) proved that <math> \scriptstyle \delta(\mathcal{AP}(N),x) \geq c n^{1/4}</math>, independent of the sequence <math>x</math>. Letting <math> \scriptstyle \mathcal{HQAP}(N) </math> be the collection of all homogeneous quasi-arithmetic progressions <math>\scriptstyle\{\lfloor \alpha \rfloor,\lfloor 2\alpha \rfloor,\dots,\lfloor k\alpha \rfloor\}</math> contained in <math>\{1, 2, ..., N\}</math>, Vijay (2008) proved that <math>\scriptstyle \delta(\mathcal{HQAP}(N),x) \geq 0.02 n^{1/6}</math>, independent of the sequence <math>x</math>.
 ==Some definitions and notational conventions==
@@ Line 24: / Line 24: @@
 The <em>discrepancy</em> of a sequence <math>(x_n)</math> is the supremum of <math>|\sum_{n\in P}x_n|</math> over all homogeneous arithmetic progressions P. (Strictly speaking, we should call this something like the HAP-discrepancy, but since we will almost always be talking about HAPs, we adopt the convention that "discrepancy" always means "HAP-discrepancy" unless it is stated otherwise.)
-<math>Insert formula here</math>
 A sequence <math>(x_n)</math> <em>has discrepancy at most</em> &phi;(n) if for every natural number N the maximum value of <math>|\sum_{n\in P}x_n|</math> over all homogeneous arithmetic progressions P that are subsets of {1,2,...,N} is at most &phi;(N).
-The EDP is the Erd&#337;s discrepancy problem. (This may conceivably be changed if enough people don't like it.)
+The EDP is the Erd&#337;s discrepancy problem.
-A sequence <math>(x_n)</math> is <em>completely multiplicative</em> if <math>x_{mn}=x_mx_n</math> for any two positive integers m and n. We shall sometimes abbreviate this to "multiplicative", but the reader should be aware that the word "multiplicative" normally refers to the more general class of sequences such that <math>x_{mn}=x_mx_n</math> whenever m and n are coprime. A completely multiplicative function is determined by the values it takes at primes. The <em>Liouville function</em> &lambda; is the unique completely multiplicative function that takes the value -1 at every prime: if the prime factorization of n is <math>\prod p_i^{a_i}</math> then &lambda;(n) equals <math>(-1)^{\sum a_i}</math>.
+A sequence <math>(x_n)</math> is <em>completely multiplicative</em> if <math>x_{mn}=x_mx_n</math> for any two positive integers m and n. We shall sometimes abbreviate this to "multiplicative", but the reader should be aware that the word "multiplicative" normally refers to the more general class of sequences such that <math>x_{mn}=x_mx_n</math> whenever m and n are coprime. A completely multiplicative function is determined by the values it takes at primes. The <em>Liouville function</em> &lambda; is the unique completely multiplicative function that takes the value -1 at every prime: if the prime factorization of n is <math>\prod p_i^{a_i}</math> then &lambda;(n) equals <math>(-1)^{\sum a_i}</math>.  Another important multiplicative function is <math>\mu_3</math>, the multiplicative function that’s -1 at 3, 1 at numbers that are 1 mod 3 and -1 at numbers that are 2 mod 3. This function has mean-square partial sums which grow logarithmically, [http://gowers.wordpress.com/2010/01/26/edp3-a-very-brief-report-on-where-we-are/#comment-5585 see this calculation].
 A <em>HAP-subsequence</em> of a sequence <math>(x_n)</math> is a subsequence of the form <math>x_d,x_{2d},x_{3d},...</math> for some d. If <math>(x_n)</math> is a multiplicative <math>\pm 1</math> sequence, then every HAP-subsequence is equal to either <math>(x_n)</math> or <math>(-x_n)</math>. We shall call a sequence <em>weakly multiplicative</em> if it has only finitely many distinct HAP-subsequences. Let us also call a <math>\pm 1</math> sequence <em>quasi-multiplicative</em> if it is a composition of a completely multiplicative function from <math>\mathbb{N}</math> to an Abelian group G with a function from G to {-1,1} (This definition is too general. See [http://gowers.wordpress.com/2010/01/09/erds-discrepancy-problem-continued/#comment-4843 this and the next comment]). [http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/#comment-4705 It can be shown] that if <math>(x_n)</math> is a weakly multiplicative sequence, then it has an HAP-subsequence that is quasi-multiplicative.
@@ Line 39: / Line 40: @@
 *To answer the problem negatively, it is sufficient to find a completely multiplicative sequence (that is, a sequence <math>(x_n)</math> such that <math>x_{mn}=x_mx_n</math> for every m,n) such that its partial sums <math>x_1+\dots+x_n</math> are bounded. First mentioned in the proof of the theorem in [http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/ this] post.
-*The function that takes n to 1 if the last non-zero digit of n in its ternary representation is 1 and -1 if the last non-zero digit is 2 is completely multiplicative and the partial sum up to n is easily shown to be at most <math>\log_3n +1</math>. Therefore, the rate at which the worst discrepancy grows, as a function of the length of the homogeneous progression, can be as slow as logarithmic. [http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/] <s>[http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/#comment-4553 It turns out] that the base can be made significantly higher than 3, so this example is not best possible.</s> [http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4757 Correction]
+*The function that takes n to 1 if the last non-zero digit of n in its ternary representation is 1 and -1 if the last non-zero digit is 2 is completely multiplicative and the partial sum up to n is easily shown to be at most <math>\log_3n +1</math>. Therefore, the rate at which the worst discrepancy grows, as a function of the length of the homogeneous progression, can be as slow as logarithmic. [[Matryoshka_Sequences|More examples of this sort are here.]] [http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/] <s>[http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/#comment-4553 It turns out] that the base can be made significantly higher than 3, so this example is not best possible.</s> [http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4757 Correction]
 *To answer the problem negatively, it is also sufficient to find a completely multiplicative function f from <math>\mathbb{N}</math> to a finite Abelian group G (meaning that f(mn)=f(m)+f(n) for every m,n) and a function h from G to {-1,1} such that for each g in G there exists d with f(d)=g and with the partial sums hf(d)+hf(2d)+...+hf(nd) bounded. In that case, one can set <math>x_n=hf(n)</math>. (Though this observation is simple with the benefit of hindsight, it might not have been made had it not been for the fact that this sort of structure was identified in a long sequence that had been produced experimentally). First mentioned [http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/#comment-4696 here]. Se also [http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/ this post] and [http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4717 this comment].
@@ Line 46: / Line 47: @@
 *The problem for the positive integers is equivalent to the problem for the positive rationals. [http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/#comment-4676](Sketch proof: if you have a function f from the positive integers to {-1,1} that works, then define <math>f_q(x)</math> to be f(q!x) whenever q!x is an integer. Then take a pointwise limit of a subsequence of the functions <math>f_q</math>. [[Limits with better properties|Click here for a more detailed discussion of this construction and what it is good for]].
+*Define the [[drift]] of a sequence to be the maximal value of |f(md)+...+f(nd)|.  The discrepancy problem is equivalent to showing that the drift is always infinite.  It is obvious that it is at least 2 (because |f(2)+f(4)|, |f(2)+f(3)|, |f(3)+f(4)| cannot all be at most 1); one can show that [[drift|it is at least 3]] (which implies as a corollary that the discrepancy is at least 2).  One can also show that the [[upper and lower discrepancy]] are each at least 2.
+*One can [[topological dynamics formulation|formulate the problem using topological dynamics]].
+*Using Fourier analysis, one can [[Fourier reduction|reduce the problem to one about completely multiplicative functions]].
+*Here is an [[algorithm for finding multiplicative sequences with bounded discrepancy]].
+*Here is a page whose aim is to record a [[human proof that completely multiplicative sequences have discrepancy at least 2]].
+*Here is a page whose aim is to record a [[Human proof that completely multiplicative sequences have discrepancy greater than 3]]
+*Here is an argument that shows that [[bounded discrepancy multiplicative functions do not correlate with characters]].
+* The answer to a [[function field version]] of the problem seems to be negative.
+* The problem can be generalized to [[pseudointegers]]. Here we have found [[EDP on pseudointegers|problems similar to EDP]] with a negative answer.
 ==Experimental evidence==
+''Main page: [[Experimental results]]''
 A nice feature of the problem is that it lends itself well to investigation on a computer. Although the number of <math>\pm 1</math> sequences grows exponentially, the constraints on these sequences are such that depth-first searches can lead quite quickly to interesting examples. For instance, at the time of writing they have yielded several examples of sequences of length 1124 with discrepancy 2. (This is the current record.) See [[Experimental results|this page]] for more details and further links.
@@ Line 57: / Line 78: @@
 for sufficiently large <math> N </math>. Currently, this is the record-holder for slow growing discrepancy.
-The Thue-Morse sequence has discrepancy <math> \delta(N) \gg \sqrt{N} </math>. (See the discussion following [http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4775 this comment] and the next one.)
+The [[Thue-Morse-Hedlund Sequence|Thue-Morse sequence]] has discrepancy <math> \delta(N) \gg N^{\log_4(3)} </math>.
 A random sequence (each <math> x_i </math> chosen independently) has discrepancy at least (asymptotically) <math> \sqrt{2N \log\log N} </math> by the law of the iterated logarithm.
@@ Line 74: / Line 95: @@
 *An even weaker question. If <math>(x_n)</math> is a hypothetical counterexample, must it have two HAP-subsequences <math>(y_n)</math> and <math>(z_n)</math> such that the lim inf of <math>N^{-1}\sum_{n=1}^N y_nz_n</math> is greater than 0?
+*A similar question, perhaps equivalent to the previous one (this should be fairly easy to check). Given a sequence <math>(x_n)</math> define f(N) to be the average of <math>x_ax_bx_cx_d</math> over all quadruples (a,b,c,d) such that ab=cd and <math>a,b,c,d\leq N</math>. If <math>(x_n)</math> is a counterexample, does that imply that <math>\lim\inf f(N)</math> is greater than 0? [http://thomas1111.wordpress.com/2010/01/15/average-over-quadruples-of-the-first-1124-sequence/ See here for a computation of this average for the first 1124 sequence].
 *Is there any completely multiplicative counterexample? (This may turn out to be a very hard question. If so, then answering the previous questions could turn out to be the best we can hope for without making a substantial breakthrough in analytic number theory.)
 *Does there exist a <math>\pm 1</math> sequence such that the corresponding discrepancy function grows at a rate that is slower than <math>c\log n</math> for any positive c?
+*Does the number of sequences of length n and discrepancy at most C grow exponentially with n or more slowly than exponentially? (Obviously if the conjecture is true then it must be zero for large enough n, but the hope is that this question is a more realistic initial target.)
+==General proof strategies==
+This section contains links to other pages in which potential approaches to solving the problem are described.
+[[First obtain multiplicative structure and then obtain a contradiction]]
+[[Find a different parameter, show that it tends to infinity, and show that that implies that the discrepancy tends to infinity]]
+[[Prove the result for shifted HAPs instead of HAPs]]
+[[Find a good configuration of HAPs]]
+[[Generalize to a graph-theoretic formulation]]
+[[Representation of the diagonal]]
+[[Bounding the discrepancy in terms of the common difference]]
 ==Annotated Bibliography==
 *Blog Discussion on Gowers's Weblog
-**[http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/ First Thread].
+**[http://gowers.wordpress.com/2009/12/17/erdoss-discrepancy-problem/ Zeroth Post] (Dec 17 2009 - Jan 6 2010).
-**[http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/ Second thread].
+**[http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/ First Post] (Jan 6 - Jan 12).
-**[http://gowers.wordpress.com/2010/01/09/erds-discrepancy-problem-continued/ Third thread].
+**[http://gowers.wordpress.com/2010/01/09/erds-discrepancy-problem-continued/ Second Post] (Jan 9 - Jan 11).
-**[http://gowers.wordpress.com/2010/01/11/the-erds-discrepancy-problem-iii/ Fourth thread].
+**[http://gowers.wordpress.com/2010/01/11/the-erds-discrepancy-problem-iii/ Third Post] (Jan 11 - Jan 14).
+**[http://gowers.wordpress.com/2010/01/14/the-erds-discrepancy-problem-iv/ Fourth Post] (Jan 14 - 16).
+**[http://gowers.wordpress.com/2010/01/16/the-erds-discrepancy-problem-v/ Fifth Post] (Jan 16-19).
+**[http://gowers.wordpress.com/2010/01/19/edp1-the-official-start-of-polymath5/ First Theoretical Post] (Jan 19-21)
+**[http://gowers.wordpress.com/2010/01/21/edp2-a-few-lessons-from-edp1/ Second Theoretical Post] (Jan 21-26)
+**[http://gowers.wordpress.com/2010/01/26/edp3-a-very-brief-report-on-where-we-are/ Third Theoretical Post] (Jan 26 -?)
+**[http://gowers.wordpress.com/2010/01/30/edp4-focusing-on-multiplicative-functions/ Fourth Theoretical Post] (Jan 30- Feb 2)
+**[http://gowers.wordpress.com/2010/02/02/edp5-another-very-brief-summary/ Fifth Theoretical Post] (Feb 2 - Feb 5)
+**[http://gowers.wordpress.com/2010/02/05/edp6-what-are-the-chances-of-success/ Sixth Theoretical Post] (Feb 5- Feb 8)
+**[http://gowers.wordpress.com/2010/02/08/edp7-emergency-post/ Seventh Theoretical Post] (Feb 8 - 19)
+**[http://gowers.wordpress.com/2010/02/19/edp8-what-next/ Eighth Theoretical Post] (Feb 19-Feb 24)
+**[http://gowers.wordpress.com/2010/02/24/edp9-a-change-of-focus/ Ninth Theoretical Most] (Feb 24-Mar 2)
+**[http://gowers.wordpress.com/2010/03/02/edp10-a-new-and-very-promising-approach/ Tenth Theoretical Post] (Mar 2-Mar 7)
+**[http://gowers.wordpress.com/2010/03/07/edp11-the-search-continues/ Eleventh Theoretical Post] (Mar 7-Mar 12)
+**[http://gowers.wordpress.com/2010/03/13/edp12-representing-diagonal-maps/ Twelfth Theoretical Post] (Mar 13-Mar 22)
+**[http://gowers.wordpress.com/2010/03/23/edp13-quick-summary/ Thirteenth Theoretical Post] (Mar 23-Apr 24)
+**[http://gowers.wordpress.com/2010/04/25/edp14-strategic-questions/ Fourteenth Theoretical Post] (Apr 25-Jun 20)
+**[http://gowers.wordpress.com/2010/06/21/edp15-finding-a-diagonal-matrix/ Fifteenth Theoretical Post] (Jun 21-Jul 3)
+**[http://gowers.wordpress.com/2010/07/04/edp16-from-ap-discrepancy-to-hap-discrepancy/ Sixteenth theoretical Post] (Jul 4-Jul 17)
+**[http://gowers.wordpress.com/2010/07/18/edp17-are-we-nearly-there/ Seventeenth Theoretical Post] (Jul 18-Sep 2)
+**[http://gowers.wordpress.com/2010/09/03/edp18-apparently-p-does-not-equal-np/ Eighteenth Theoretical Post] (Sep 3-Sep 5)
+**[http://gowers.wordpress.com/2010/09/06/edp19-removing-some-vagueness/ Nineteenth Theoretical Post] (Sep 6-Sep 9)
+**[http://gowers.wordpress.com/2010/09/10/edp20-squares-and-fly-traps/ Twentieth Theoretical Post] (Sep 10-Sep 20)
+**[http://gowers.wordpress.com/2010/09/21/edp21-restrictions-on-possible-proofs/ Twenty-first Theoretical Post] (Sep 21 - Aug 21, 2012)
+**[http://gowers.wordpress.com/2012/08/22/edp-a-possible-revival/ A possible revival] (Aug 22)
+**[http://gowers.wordpress.com/2012/08/22/edp22-first-guest-post-from-gil-kalai/ First guest post from Gil Kalai] (Aug 22-26)
+**[http://gowers.wordpress.com/2012/08/27/edp23-second-guest-post-by-gil-kalai/ Second guest post from Gil Kalai] (Aug 27-30)
+**[http://gowers.wordpress.com/2012/08/31/edp24-an-attempt-to-get-back-into-the-diagonal-decomposition-approach/ An attempt to get back into the diagonal decomposition approach] (Aug 30-Sep 3)
+**[http://gowers.wordpress.com/2012/09/04/edp25-third-guest-post-by-gil-kalai/ Third guest post from Gil Kalai] (Sep 4-Sep 5)
+**[http://gowers.wordpress.com/2012/09/06/edp26-three-generalizations/ Three generalizations] (Sep 6-Sep 18)
+**[http://gowers.wordpress.com/2012/09/19/edp27-the-modular-version-of-roths-ap-discrepancy-theorem/ The modular version of Roth's AP discrepancy theorem] (Sep 19-)
+*[http://mathoverflow.net/questions/tagged/polymath5 Several questions on MathOverflow have been given the `polymath5' tag.]
 *Mathias, A. R. D. [http://www.dpmms.cam.ac.uk/~ardm/erdoschu.pdf On a conjecture of Erdős and Čudakov]. Combinatorics, geometry and probability (Cambridge, 1993).
-This is a short paper establishing the maximal length of sequence for the case where C=1 is 11, and is the starting point for our experimental studies.
+This one page paper establishes that the maximal length of sequence for the case where C=1 is 11, and is the starting point for our experimental studies.
+*Tchudakoff, N. G. Theory of the characters of number semigroups. J. Indian Math. Soc. (N.S.) 20 (1956), 11--15. MR0083515 (18,719e)
+The Mathias paper states that one of Erdős's problem lists states that this paper "studies related questions". The Math Review for this paper states that it summarizes results from seven other papers that are in Russian. The Math Review leaves the impression that the topic concerns characters of modulus 1.
 *Borwein, Peter, and Choi, Stephen. [http://www.cecm.sfu.ca/personal/pborwein/PAPERS/P204.pdf A variant of Liouville's lambda function: some surprizing formulae].
-Explicit formulas for the discrepancy of some completely multiplicative functions whose discrepancy is logarithmic. A typical example is: Let <math>\lambda_3(n) = (-1)^{\omega_3(n)}</math> where <math>\omega_3(n)</math> is the number of distinct prime factors congruent to <math>-1 \bmod 3</math> in <math>n</math> (with multiple factors counted multiply). Then the discrepancy of the <math>\lambda_3</math> sequence up to <math>N</math> is exactly the number of 1's in the base three expansion of <math>N</math>.
+Explicit formulas for the discrepancy of some completely multiplicative functions whose discrepancy is logarithmic. A typical example is: Let <math>\lambda_3(n) = (-1)^{\omega_3(n)}</math> where <math>\omega_3(n)</math> is the number of distinct prime factors congruent to <math>-1 \bmod 3</math> in <math>n</math> (with multiple factors counted multiply). Then the discrepancy of the <math>\lambda_3</math> sequence up to <math>N</math> is exactly the number of 1's in the base three expansion of <math>N</math>. (This turns out not to be so surprising after all, since <math>\lambda_3(n)</math> is precisely the same as the ternary function defined in the second item of the Simple Observations section.)
-*Vijay, Sujith. [http://www.combinatorics.org/Volume_15/PDF/v15i1r104.pdf On the discrepancy of quasi-progressions]. Electronic Journal of Combinatorics, R104 of Volume 15(1), 2008.
+*Borwein, Peter, Choi, Stephen and Coons, Michael. [http://www.cecm.sfu.ca/personal/pborwein/PAPERS/P225.pdf  Completely multiplicative functions taking values in {-1,1}]. The published version of the above paper, which explains the results and proofs more clearly than the preprint, and is more explicit about the relationship with Erd&#337;s's question.
-A quasi-progression is a sequence of the form <math>\lfloor k \alpha \rfloor </math>, with <math>1\leq k \leq K</math> for some K. The main theorem of interest is: If the integers from 0 to n are 2-coloured, there exists a quasi-progression that has discrepancy at least <math> (1/50)n^{1/6}</math>.
 *Granville and Soundararajan. [http://arxiv.org/abs/math/0702389 Multiplicative functions in arithmetic progressions]
@@ Line 103: / Line 180: @@
 In this [http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4729 blog post] Terry Tao wrote: "Granville and Soundararajan have made some study of the discrepancy of bounded multiplicative functions. The situation is remarkably delicate and number-theoretical (and is closely tied with the Granville-Soundararajan theory of pretentious characters)."
-*[http://www.math.niu.edu/~rusin/known-math/93_back/prizes.erd As this web page reveals], the Erd&#337;s discrepancy problem was a $500-dollar problem of Erd&#337;s, so it is clear that he regarded it as pretty hard.
+*Wirsing, E. Das asymptotische Verhalten von Summen über multiplikative Funktionen. II. (German) [The asymptotic behavior of sums of multiplicative functions. II.] Acta Math. Acad. Sci. Hungar. 18 1967 411--467. MR0223318 (36 #6366) [http://michaelnielsen.org/polymath1/index.php?title=Wirsing_translation (partial) English translation]
+*Newman, D. J. [http://www.jstor.org/stable/2036455 On the number of binary digits in a multiple of three]. Proc Amer Math Soc. 21(1969): 719--721.
+Proves that the Thue-Morse sequence has discrepancy <math> \gg N^{\log_4(3)}</math>
+*Halász, G. [http://gowers.wordpress.com/2010/02/02/edp5-another-very-brief-summary/#comment-5806 On random multiplicative functions]. Hubert Delange colloquium (Orsay, 1982), 74–96, Publ. Math. Orsay, 83-4, Univ. Paris XI, Orsay, 1983.
+The discrepancy of a random multiplicative function is close to <math>\sqrt{N}</math>.
+=== Problem lists on which this problem appears ===
+*Erd&#337;s, P. and Graham, R. [http://www.math.ucsd.edu/~fan/ron/papers/79_09_combinatorial_number_theory.pdf Old and New Problems and Results in Combinatorial Number Theory: van der Waerden's Theorem and Related Topics], L'Enseignement Math. 25 (1979), 325-344.
+Our problem is mentioned at the bottom of page 331, where they indicate knowledge of a coloring with logarithmic discrepancy. On pages 330-1, they review work on the non-homogeneous problem.
+*Erdős, Paul, Some of my favourite unsolved problems. A tribute to Paul Erdős, 467--478, Cambridge Univ. Press, Cambridge, 1990. MR1117038 (92f:11003)
+"Let <math>f(n)=\pm1</math>. Is it true that for every <math>c</math> there is a <math>d</math> and an <math>m</math> so that
+:<math> \left| \sum_{k=1}^m f(kd) \right| > c</math>?
+I will pay $500 for an answer. Let <math>f(n)=\pm1</math> and <math>f(ab)=f(a)f(b)</math>. Is it then true that <math>\left| \sum_{k=1}^m f(kd) \right|</math> cannot be bounded?"
 *P. Erd&#337;s. [http://www.renyi.hu/~p_erdos/1957-13.pdf Some unsolved problems], Michigan Math. J. 4 (1957), 291--300 MR20 #5157; Zentralblatt 81,1.
@@ Line 109: / Line 205: @@
 Problem 9 of this paper is ours. Erd&#337;s dates the problem to around 1932, and notes what we know about Liouville's function (lower and upper bound).
-*Erd&#337;s, P. and Graham, R. [http://www.math.ucsd.edu/~fan/ron/papers/79_09_combinatorial_number_theory.pdf Old and New Problems and Results in Combinatorial Number Theory: van der Waerden's Theorem and Related Topics], L'Enseignement Math. 25 (1979), 325-344.
+*Finch, S. [http://algo.inria.fr/csolve/ec.pdf Two-colorings of positive integers]. Dated May 27, 2008.
+Summarizes knowledge of discrepancy of two colorings when the discrepancy is restricted to homogeneous progressions, nonhomogeneous progressions, and homogeneous quasi-progressions. Contains bibliography with 17 entries, including most of those above.
-Our problem is mentioned at the bottom of page 331, where they indicate knowledge of a coloring with logarithmic discrepancy. On pages 330-1, they review work on the non-homogeneous problem.
+=== non-homogeneous AP, quasi-AP, and other related discrepancy papers ===
 *Hochberg, Robert. [http://www.springerlink.com/content/q02424284373qw45/ Large Discrepancy In Homogenous Quasi-Arithmetic Progressions]. Combinatorica, Volume 26 (2006), Number 1.
@@ Line 117: / Line 215: @@
 Roth's method for the <math> n^{1/4}</math> lower bound on nonhomogeneous AP discrepancy is adapted to give a lower bound for homogeneous quasi-AP discrepancy. This result is weaker than the Vijay result, but uses a different method.
-*Finch, S. [http://algo.inria.fr/csolve/ec.pdf Two colorings of positive integers]. Dated May 27, 2008.
+*Vijay, Sujith. [http://www.combinatorics.org/Volume_15/PDF/v15i1r104.pdf On the discrepancy of quasi-progressions]. Electronic Journal of Combinatorics, R104 of Volume 15(1), 2008.
+A quasi-progression is a sequence of the form <math>\lfloor k \alpha \rfloor </math>, with <math>1\leq k \leq K</math> for some K. The main theorem of interest is: If the integers from 0 to n are 2-coloured, there exists a quasi-progression that has discrepancy at least <math> (1/50)n^{1/6}</math>.
-Summarizes knowledge of discrepancy of two colorings when the discrepancy is restricted to homogeneous progressions, nonhomogeneous progressions, and homogeneous quasi-progressions. Contains bibliography with 17 entries, including most of those above.
+* Doerr, Benjamin; Srivastav, Anand; Wehr, Petra. [http://www.combinatorics.org/Volume_11/Abstracts/v11i1r5.html Discrepancy of Cartesian products of arithmetic progressions]. Electron. J. Combin. 11 (2004), no. 1, Research Paper 5, 16 pp. (electronic).
+Abstract: We determine the combinatorial discrepancy of the hypergraph H of cartesian
+products of d arithmetic progressions in the <math> [N]^d</math> –lattice.
+The study of such higher dimensional arithmetic progressions is motivated by a
+multi-dimensional version of van derWaerden’s theorem, namely the Gallai-theorem
+(1933). We solve the discrepancy problem for d–dimensional arithmetic progressions
+by proving <math>disc(H) = \Theta(N^{d/4})</math> for every fixed positive integer d. This extends the famous
+lower bound of <math> \Omega(N^{1/4})</math> of Roth (1964) and the matching upper bound <math> O(N^{1/4})</math>
+of Matoušek and Spencer (1996) from d = 1 to arbitrary, fixed d. To establish
+the lower bound we use harmonic analysis on locally compact abelian groups. For
+the upper bound a product coloring arising from the theorem of Matoušek and
+Spencer is sufficient. We also regard some special cases, e.g., symmetric arithmetic
+progressions and infinite arithmetic progressions.
+*Cilleruelo, Javier; Hebbinghaus, Nils [http://www.ams.org/mathscinet-getitem?mr=2547932 Discrepancy in generalized arithmetic progressions]. European J. Combin. 30 (2009), no. 7, 1607--1611. MR 2547932
+*Valkó, Benedek. [Discrepancy of arithmetic progressions in higher dimensions]. (English summary)
+J. Number Theory 92 (2002), no. 1, 117--130. MR1880588 (2003b:11071)
+Abstract:
+K. F. Roth (1964, Acta. Arith.9, 257–260) proved that the discrepancy of arithmetic progressions contained in [N] is at least <math> cN^{1/4} </math>, and later it was proved that this result is sharp. We consider the d-dimensional version of this problem. We give a lower estimate for the discrepancy of arithmetic progressions on <math> [N]^d </math> and prove that this result is nearly sharp. We use our results to give an upper estimate for the discrepancy of lines on an N×N lattice, and we also give an estimate for the discrepancy of a related random hypergraph.
+*Roth, K. F., Remark concerning integer sequences. Acta Arith. 9 1964 257--260. MR0168545 (29 #5806)
+Shows that the discrepancy on APs is at least <math>cN^{1/4}</math>.
+*T. Gowers, [http://arxiv.org/abs/1509.03421 Erdős and Arithmetic Progressions], in Erdos and arithmetic progressions, in Erdos Centennial, Bolyai Society Mathematical Studies, 25, L. Lovasz, I. Z. Ruzsa, V. T. Sos eds., Springer 2013, pp. 265-287.
+A survey of the Polymath5 project.