Updating partial sums with Fenwick tree: Difference between revisions
mNo edit summary |
No edit summary |
||
Line 3: | Line 3: | ||
The idea is to have two actions on positive integers, say <math>A</math> and <math>B</math>, which satisfy the following: | The idea is to have two actions on positive integers, say <math>A</math> and <math>B</math>, which satisfy the following: | ||
'''Property 1''' | '''Property 1''' For any two positive integers <math>x</math> and <math>y</math>, orbit of <math>x</math> under action of <math>A</math>, and orbit of <math>y</math> under action of <math>B</math> intersect in exactly one point if <math>x \leq y</math>, and nowhere otherwise. | ||
The general algorithm when we have <math>A</math> and <math>B</math> that satisfy the property above is as follows. We will maintain an array of size <math>N</math> with all elements initialized to 0. By updating something we mean adding a quantity to it, whether positive or negative. On each update of the <math>i^{th}</math> sequence element, update all the elements of the array with indices in the <math>A</math> orbit of <math>i</math> by that same amount. To retrieve the partial sum of elements <math>1</math> thru <math>j</math>, sum up all the elements from the array with indices in the <math>B</math> orbit of <math>j</math>. | The general algorithm when we have <math>A</math> and <math>B</math> that satisfy the property above is as follows. We will maintain an array of size <math>N</math> with all elements initialized to 0. By updating something we mean adding a quantity to it, whether positive or negative. On each update of the <math>i^{th}</math> sequence element, update all the elements of the array with indices in the <math>A</math> orbit of <math>i</math> by that same amount. To retrieve the partial sum of elements <math>1</math> thru <math>j</math>, sum up all the elements from the array with indices in the <math>B</math> orbit of <math>j</math>. |
Revision as of 16:19, 1 February 2010
The two straightforward ways to code updating and retrieval of partial sums of sequences are: (a) maintain the array of sequence elements - on each update of a sequence element update just the sequence element, and each time you need a partial sum you compute it from scratch, (b) maintain the array of partial sums - on each update of a sequence element update all the partial sums this element contributes to. In the case (a) update takes [math]\displaystyle{ O(1) }[/math] time and retrieval takes [math]\displaystyle{ O(N) }[/math], in the case (b) update takes [math]\displaystyle{ O(N) }[/math] and retrieval [math]\displaystyle{ O(1) }[/math] time, where [math]\displaystyle{ N }[/math] is the length of the sequence. The Fenwick tree data structure has both update and retrieval taking [math]\displaystyle{ O(\log N) }[/math] time in the worst case, while still using the same [math]\displaystyle{ O(N) }[/math] amount of space.
The idea is to have two actions on positive integers, say [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math], which satisfy the following:
Property 1 For any two positive integers [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math], orbit of [math]\displaystyle{ x }[/math] under action of [math]\displaystyle{ A }[/math], and orbit of [math]\displaystyle{ y }[/math] under action of [math]\displaystyle{ B }[/math] intersect in exactly one point if [math]\displaystyle{ x \leq y }[/math], and nowhere otherwise.
The general algorithm when we have [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] that satisfy the property above is as follows. We will maintain an array of size [math]\displaystyle{ N }[/math] with all elements initialized to 0. By updating something we mean adding a quantity to it, whether positive or negative. On each update of the [math]\displaystyle{ i^{th} }[/math] sequence element, update all the elements of the array with indices in the [math]\displaystyle{ A }[/math] orbit of [math]\displaystyle{ i }[/math] by that same amount. To retrieve the partial sum of elements [math]\displaystyle{ 1 }[/math] thru [math]\displaystyle{ j }[/math], sum up all the elements from the array with indices in the [math]\displaystyle{ B }[/math] orbit of [math]\displaystyle{ j }[/math].
The two straightforward algorithms we discussed in the first paragraph already fit into this scheme. In the case (a) above, [math]\displaystyle{ A }[/math] is identity and [math]\displaystyle{ B(j) = j-1 }[/math], for (b) it's [math]\displaystyle{ A(i) = i+1 }[/math] and [math]\displaystyle{ B }[/math] is the identity. It's obvious that in both cases the pair [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] satisfy Property 1.
Here's the punchline: for a positive integer [math]\displaystyle{ n }[/math], if we call [math]\displaystyle{ r(n) }[/math] the integer we get when we isolate the rightmost, i.e. the least significant, set bit in the binary representation of [math]\displaystyle{ n }[/math] (so e.g. [math]\displaystyle{ r(6) = 2 }[/math] and [math]\displaystyle{ r(8) = 8 }[/math]), then [math]\displaystyle{ A(i) = i + r(i) }[/math] and [math]\displaystyle{ B(i) = i - r(i) }[/math] satisfy Property 1.
Proof: [math]\displaystyle{ A }[/math] increases its argument while [math]\displaystyle{ B }[/math] decreases it, so if [math]\displaystyle{ x=y }[/math] then [math]\displaystyle{ A^0(x) = B^0(y) }[/math], and that's the only intersection. If [math]\displaystyle{ 0 \lt x \lt y }[/math], then binary representations of [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math] are [math]\displaystyle{ y = a1b_y }[/math] and [math]\displaystyle{ x = a0b_x }[/math] where [math]\displaystyle{ a }[/math], [math]\displaystyle{ b_x }[/math], [math]\displaystyle{ b_y }[/math] are (possibly empty) binary sequences, and [math]\displaystyle{ b_x }[/math] and [math]\displaystyle{ b_y }[/math] are of same length [math]\displaystyle{ |b| }[/math]. We may have had to pad the representation of [math]\displaystyle{ x }[/math] by zeros to the left as needed for this. The action [math]\displaystyle{ B }[/math] unsets the rightmost set bit, so eventually [math]\displaystyle{ y }[/math] will under action of [math]\displaystyle{ B }[/math] arrive at [math]\displaystyle{ a10^{|b|} }[/math], if it's not of that form to begin with. If [math]\displaystyle{ b_x }[/math] has no set bits, then one more action of [math]\displaystyle{ B }[/math] on [math]\displaystyle{ y }[/math] will make it equal to [math]\displaystyle{ x }[/math]. If [math]\displaystyle{ b_x }[/math] has set bits, then [math]\displaystyle{ A }[/math], which turns a number of the form [math]\displaystyle{ z01^{m}0^{n} }[/math] into [math]\displaystyle{ z10^{m+n} }[/math] will eventually bring [math]\displaystyle{ x }[/math] to [math]\displaystyle{ a10^{|b|} }[/math], which is where we left [math]\displaystyle{ y }[/math]. By the argument from the beginning, this is the only place the two orbits intersect. QED
What's more, in C and its ilk, [math]\displaystyle{ r(i) }[/math] is coded as (-i)&i
. (Actually, -i&i
will parse just fine since unary minus has higher precedence in C than bitwise and.)
Here's a simple implementation in C++, add error checking as needed
#include <vector> class fenwick_tree { std::vector<int> v_; public: fenwick_tree(int N) : v_(N+1) {} // neither add() nor get() will work // for k outside of [1, N] range // add val to the k-th element void add(int k, int val) { for (int i=k; i<int(v_.size()); i += -i&i) v_[i] += val; } // get sum of elements 1 thru k int get(int k) const { int r=0; for (int i=k; i>0; i -= -i&i) r += v_[i]; return r; } };