Experimental results
Latest revision as of 12:09, 14 February 2014
To return to the main Polymath5 page, click here.
Perhaps we should have two kinds of subpages to this page: Pages about finding examples, and pages about analyzing them?
Experimental data
- A maximal discrepancy-2 sequence (of length 1160)
- The first 1124-sequence with discrepancy 2. Some more description
- Other length 1124 sequences with discrepancy 2. Some more description
- Analysis of a large set of length 1120 sequences with discrepancy 2
- A sequence of length 1091 with discrepancy 2.
- A sequence of length 1112 derived from one with nice multiplicative properties.
- Sequences given by modulated Sturmian functions.
- Some data about the problem with different upper and lower bound. Let N(a,b) be the largest N such that there is a sequence [math]\displaystyle{ x_1,\dots,x_N }[/math] all of whose HAP-errors are between -a and b, inclusive.
- Sequences taking values in [math]\displaystyle{ \mathbb{T} }[/math]:
  - 4th roots of unity
  - 6th roots of unity
- A sequence of length 407 with discrepancy 2 such that [math]\displaystyle{ x_n=x_{32 n} }[/math] for every n. The HAP-subsequence structure of that sequence.
- More T32-invariant sequences.
- Long multiplicative sequences.
- Long sequences satisfying T2(x) = -x.
- Long sequences satisfying T2(x) = T5(x) = -x
- Long sequences satisfying T2(x) = -T3(x).
- Long sequences satisfying constraints of the form T_m(x) = (+/-)T_n(x).
- Table of longest constrained sequences.
- Table of short sequences statistics.
- Dirichlet inverses of good sequences.
- Sequences with bounded Dirichlet inverse.
- Shifts and signs related to an interesting structure of the first 1124-sequence.
- Long sequences with low discrepancy on PAPs.
- Forced Drifts in Multiplicative Sequences. The tree of possible assignments to primes is traversed, and for each possible assignment an interval is found where the function must have substantive drift. Data includes drifts of 2, 3, 4, and 5.
- Bounds for the numbers Omega(N)
- Tables of vectors for the dual SDP
- Long sequences with low discrepancy on primes and powers of 2
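The quantity N(a,b) defined in the list above can be explored by brute force for very small bounds. The following is only a minimal sketch, not the code behind the linked data pages: the function name `longest_hap_bounded` and the `limit` cap are assumptions introduced here, and the search is a plain backtracking search that becomes impractical long before the bounds relevant to the 1124- and 1160-sequences.

```python
def longest_hap_bounded(a, b, limit=200):
    """Largest N (capped at `limit`) for which some +/-1 sequence x_1..x_N
    has every HAP partial sum in the interval [-a, b].

    Hypothetical helper for illustration; plain backtracking search.
    """
    # sums[d] holds x_d + x_{2d} + ... over the multiples of d chosen so far.
    sums = [0] * (limit + 1)
    best = 0

    def extend(n):
        nonlocal best
        # Reaching position n means x_1..x_{n-1} is a valid sequence.
        best = max(best, n - 1)
        if n > limit:
            return
        divisors = [d for d in range(1, n + 1) if n % d == 0]
        for value in (1, -1):
            # Only the HAPs whose difference divides n gain a new term.
            if all(-a <= sums[d] + value <= b for d in divisors):
                for d in divisors:
                    sums[d] += value
                extend(n + 1)
                for d in divisors:
                    sums[d] -= value

    extend(1)
    return best


if __name__ == "__main__":
    # Symmetric bounds a = b = 1 correspond to discrepancy 1.
    print(longest_hap_bounded(1, 1))
```

With a = b = 1 this reproduces the well-known fact that a discrepancy-1 sequence cannot be longer than 11 terms.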
Source code
- Convert raw input string into CSV table
- Create tables in an HTML file from an input sequence
- Verify the bounded discrepancy of an input sequence
- Depth-first search
- Depth-first search for multiplicative sequences
- Search for completely multiplicative sequences
- Refined greedy computation of multiplicative sequences
- Computing a HAP basis
- Estimate the number of discrepancy 2 sequences
- Updating partial sums with Fenwick tree
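As a rough illustration of what verifying the discrepancy of an input sequence involves, here is a minimal sketch. It is not the code on the linked page; the function name `discrepancy` and the convention that the list entry at index 0 is x_1 are assumptions made for this example.

```python
def discrepancy(x):
    """Maximum of |x_d + x_{2d} + ... + x_{md}| over all d >= 1 and all m
    with m*d <= len(x). The list is 0-indexed, so x[0] plays the role of x_1.
    """
    n = len(x)
    worst = 0
    for d in range(1, n + 1):
        partial = 0
        # Walk along the HAP d, 2d, 3d, ... and track the running sum.
        for i in range(d, n + 1, d):
            partial += x[i - 1]
            worst = max(worst, abs(partial))
    return worst


if __name__ == "__main__":
    # A length-11 sequence of discrepancy 1: + - - + - + + - - + +
    seq = [1, -1, -1, 1, -1, 1, 1, -1, -1, 1, 1]
    print(discrepancy(seq))
```

The double loop visits about n log n terms in total, so it is fast enough for sequences of the lengths listed on this page.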
Wish list
There is a separate page for proposals for finding long low-discrepancy sequences. It goes without saying that implementing any of these proposals belongs to the wish list.
- Find long/longest quasi-multiplicative sequences with some fixed group G, function [math]\displaystyle{ G\to \{-1,1\} }[/math] and maximal discrepancy C
  - [math]\displaystyle{ G=C_6 }[/math] and the function that sends 0,1 and 2 to 1 (because this seems to be a good choice)
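- Do a "Mark-Bennet-style analysis" of one of the new 1124-sequences (see http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4827). Also done (by Mark Bennet): http://gowers.wordpress.com/2010/01/06/erdss-discrepancy-problem-as-a-forthcoming-polymath-project/#comment-4842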
- Take a moderately large k and search for the longest sequence of discrepancy 2 that's constructed as follows. First, pick a completely multiplicative function f to the group [math]\displaystyle{ C_{2k} }[/math]. Then set [math]\displaystyle{ x_n }[/math] to be 1 if f(n) lies between 0 and k-1, and -1 if f(n) lies between k and 2k-1. Alec has already done this for k=1 and partially done it for k=3.
- Search for the longest sequence of discrepancy 2 with the property that [math]\displaystyle{ x_n=x_{32n} }[/math] for every n. The motivation for this is to produce a fundamentally different class of examples (different because their group structure would include an element of order 5). It's not clear that it will work, since 32 is a fairly large number. However, if you've chosen [math]\displaystyle{ x_{32n} }[/math] then that will have some influence on several other choices, such as [math]\displaystyle{ x_{4n},x_{8n} }[/math] and [math]\displaystyle{ x_{16n} }[/math], so maybe it will lead to something interesting. Alec has made a start on this and an initial investigation suggests that the sequence he has found does indeed have some [math]\displaystyle{ C_{10} }[/math]-related structure.
- Here's another experiment that should be pretty easy to program and might yield something interesting. It's to look at how the discrepancy appears to grow when you define a sequence using a greedy algorithm. I say "a" greedy algorithm because there are various algorithms that could reasonably be described as greedy. Here are a few.
1. For each n let [math]\displaystyle{ x_n }[/math] be chosen so as to minimize the discrepancy so far, given the choices already made for [math]\displaystyle{ x_1,\dots,x_{n-1} }[/math]. (If this does not uniquely determine [math]\displaystyle{ x_n }[/math] then choose it arbitrarily, or randomly, or according to some simple rule like always equalling 1.)
2. Same as 1 but with additional constraints, in the hope that these make the sequence more likely to be good. For instance, one might insist that [math]\displaystyle{ x_{2k}=x_{3k} }[/math] for every k. Here, when choosing [math]\displaystyle{ x_n }[/math] one would probably want to minimize the discrepancy up to [math]\displaystyle{ x_{n+k} }[/math] if [math]\displaystyle{ x_{n+1},\dots,x_{n+k} }[/math] had already been chosen. Another obvious constraint to try is complete multiplicativity.
3. A greedy algorithm of sorts, but this time trying to minimize a different parameter. The first algorithm will do this: when you pick [math]\displaystyle{ x_n }[/math] you look, for each factor d of n, at the partial sum along the multiples of d up to but not including n. This will give you a set A of numbers (the possible partial sums). If max(A) is greater than max(-A) then you set [math]\displaystyle{ x_n=-1 }[/math], if max(-A) is greater than max(A) then you let [math]\displaystyle{ x_n=1 }[/math], and if they are equal then you make the decision according to some rule that seems sensible. But it might be that you would end up with a slower-growing discrepancy if you regarded A as a multiset and made the decision on some other basis. For instance, you could take the sum of [math]\displaystyle{ 2^k }[/math] over all positive elements [math]\displaystyle{ k\in A }[/math] (with multiplicity) and the sum of [math]\displaystyle{ 2^{-k} }[/math] over all negative elements and choose [math]\displaystyle{ x_n }[/math] according to which was bigger. Although that wouldn't minimize the discrepancy at each stage, it might make the sequence better for future development because it wouldn't sacrifice the needs of an overwhelming majority to those of a few rogue elements.
4. A greedy algorithm to choose a good completely multiplicative low-discrepancy sequence. Now you are free only to choose the values at primes. If you have chosen the values up to but not including p, then fill in all the values that are forced by multiplicativity and then make whatever seems to be the best choice for the value at p. Again, there are several approaches that could be reasonable here. One is simply to ensure that the partial sum of the sequence up to p is as small (in modulus) as you can make it. But that would be foolish if you've already filled in the values at p+1,...,p+k. So an only slightly less greedy algorithm is to look at the effect of your choice at p on the partial sums all the way up to the next prime and choose the best value accordingly. If you do that, then at what rate do the partial sums grow? In particular, do they grow at least logarithmically? This is being addressed at http://michaelnielsen.org/polymath1/index.php?title=Multiplicative_sequences#Minimizing_D_up_to_the_next_prime
The motivation for these experiments is to see whether they, or some variants, appear to lead to sublogarithmic growth. If they do, then we could start trying to prove rigorously that sublogarithmic growth is possible. I still think that a function that arises in nature and satisfies f(1124)=2 ought to be sublogarithmic.
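The greedy rules in items 1 and 3 above can be sketched as follows. This is only one possible reading of the description, not a record of any experiment actually run: the function name `greedy_sequence`, the rule labels "minimax" and "weighted", and the choice to break ties with +1 are all assumptions made for illustration.

```python
def greedy_sequence(length, rule="minimax"):
    """Build x_1..x_length greedily and record the discrepancy after each step.

    rule="minimax":  compare max(A) with max(-A), as in item 1 / the start of item 3.
    rule="weighted": compare the sum of 2^k over positive k in A with the
                     sum of 2^(-k) over negative k in A (A taken as a multiset).
    Ties are broken by choosing +1 (an arbitrary assumption).
    """
    sums = [0] * (length + 1)   # sums[d] = x_d + x_{2d} + ... chosen so far
    x = []
    history = []                # history[n-1] = discrepancy of x_1..x_n
    disc = 0
    for n in range(1, length + 1):
        divisors = [d for d in range(1, n + 1) if n % d == 0]
        # A: partial sums along multiples of d, up to but not including n.
        A = [sums[d] for d in divisors]
        if rule == "minimax":
            up, down = max(A), max(-s for s in A)
        else:
            up = sum(2 ** s for s in A if s > 0)
            down = sum(2 ** (-s) for s in A if s < 0)
        value = -1 if up > down else 1
        x.append(value)
        for d in divisors:
            sums[d] += value
            disc = max(disc, abs(sums[d]))
        history.append(disc)
    return x, history


if __name__ == "__main__":
    for rule in ("minimax", "weighted"):
        seq, hist = greedy_sequence(200, rule)
        print(rule, hist[-1])
```

Plotting `history` against log n for each rule is one way to look for the logarithmic or sublogarithmic growth that the paragraph above asks about. The divisor search here is quadratic overall, which is acceptable for short runs but would need a sieve for long ones.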
- What happens if one applies a backtracking algorithm to try to extend the following discrepancy-2 sequence, which satisfies [math]\displaystyle{ x_{2n}=-x_n }[/math] for every n, to a much longer discrepancy-2 sequence: + - - + - + + - - + + - + - + + - + - - + - - + + - - + + - + - - + - - + + + + - - - + + + - - + - + + - + - - + - ? This question has been answered in the comments following the asking of the question on the blog.
- Investigate what happens if our HAPs are restricted to allow differences divisible only by 2 or 3 [and then other sets of primes including 2]; {2,3,5,7} would be interesting. Is there an infinite sequence of discrepancy 2 in these simple cases? Is it easy to find an infinite sequence with finite discrepancy in these cases? [For sets of odd primes, take a sequence which is 1 on odd numbers, -1 on even numbers. Including 2 is the non-trivial case.] It is possible that completely multiplicative sequences could be found for some of these cases.
- Compute the Dirichlet series [math]\displaystyle{ f(s) = \sum x_n n^{-s} }[/math] for some of our long low-discrepancy sequences, and see what this function looks like in the vicinity of [math]\displaystyle{ s=1 }[/math], and elsewhere. Alec has now looked at this.
- Take a long sequence [math]\displaystyle{ (x_n) }[/math] of discrepancy 2 and try to create a new long sequence [math]\displaystyle{ (y_n) }[/math] subject to the constraint that [math]\displaystyle{ y_{2n}=x_n }[/math]. How far does one typically get before getting stuck? And how much further does one get if one uses the resulting sequence as a seed for the usual algorithm? One does not get too far, as Alec showed.
- Take some of the good sequences and calculate the following parameter, which is supposed to measure the amount of multiplicativity. If the sequence is defined up to n, then the parameter is the expected value of [math]\displaystyle{ x_ax_bx_cx_d }[/math] over all quadruples (a,b,c,d) such that a,b,c,d are at most n and ab=cd. I'm expecting that the answer will be significantly greater than zero -- perhaps something like 0.3 -- but would like to have this confirmed. It has now been confirmed.
- ... you are welcome to add more.
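The parameter in the wish-list item about quadruples with ab=cd can be computed by grouping pairs (a,b) by their product. The sketch below assumes that ordered quadruples are meant and that trivial ones such as (a,b)=(c,d) are included; the function name `multiplicativity_parameter` is hypothetical.

```python
from collections import defaultdict


def multiplicativity_parameter(x):
    """Average of x_a * x_b * x_c * x_d over all ordered quadruples
    (a, b, c, d) with 1 <= a, b, c, d <= n and a*b == c*d, where n = len(x)
    and x[0] plays the role of x_1.
    """
    n = len(x)
    pair_sum = defaultdict(int)    # pair_sum[p] = sum of x_a * x_b over a*b == p
    pair_count = defaultdict(int)  # pair_count[p] = number of pairs with a*b == p
    for a in range(1, n + 1):
        for b in range(1, n + 1):
            pair_sum[a * b] += x[a - 1] * x[b - 1]
            pair_count[a * b] += 1
    # A quadruple with ab = cd = p is a pair of pairs with product p, so the
    # total over quadruples is sum_p pair_sum[p]^2 and the number of
    # quadruples is sum_p pair_count[p]^2.
    total = sum(s * s for s in pair_sum.values())
    count = sum(c * c for c in pair_count.values())
    return total / count


if __name__ == "__main__":
    # Liouville function on 1..10, which is completely multiplicative.
    liouville = [1, -1, -1, 1, -1, 1, -1, -1, 1, 1]
    print(multiplicativity_parameter(liouville))
```

For a completely multiplicative sequence every product x_a x_b depends only on ab, so the parameter equals 1; values well below 1 but clearly above 0 would indicate partial multiplicative structure.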