Updating partial sums with Fenwick tree

From Polymath1Wiki

Jump to: navigation, search

The two straightforward ways to code updating and retrieval of partial sums of sequences are: (a) maintain the array of sequence elements - on each update of a sequence element update just the sequence element, and each time you need a partial sum you compute it from scratch, (b) maintain the array of partial sums - on each update of a sequence element update all the partial sums this element contributes to. In the case (a) update takes O(1) time and retrieval takes O(N), in the case (b) update takes O(N) and retrieval O(1) time, where N is the length of the sequence. The Fenwick tree data structure has both update and retrieval taking O(logN) time in the worst case, while still using the same O(N) amount of space.

The idea is to have two actions on positive integers, say A and B, which satisfy the following:

Property 1 For any two positive integers x and y, orbit of x under action of A, and orbit of y under action of B intersect in exactly one point if x \leq y, and nowhere otherwise.

The general algorithm when we have A and B that satisfy the property above is as follows. We will maintain an array of size N with all elements initialized to 0. By updating something we mean adding a quantity to it, whether positive or negative. On each update of the ith sequence element, update all the elements of the array with indices in the A orbit of i by that same amount. To retrieve the partial sum of elements 1 thru j, sum up all the elements from the array with indices in the B orbit of j.

The two straightforward algorithms we discussed in the first paragraph already fit into this scheme. In the case (a) above, A is identity and B(j) = j − 1, for (b) it's A(i) = i + 1 and B is the identity. It's obvious that in both cases the pair A and B satisfy Property 1.

Here's the punchline: for a positive integer n, if we call r(n) the integer we get when we isolate the rightmost, i.e. the least significant, set bit in the binary representation of n (so e.g. r(6) = 2 and r(8) = 8), then A(i) = i + r(i) and B(i) = ir(i) satisfy Property 1.

Proof: A increases its argument while B decreases it, so if x = y then A0(x) = B0(y), and that's the only intersection. If 0 < x < y, then binary representations of x and y are y = a1by and x = a0bx where a, bx, by are (possibly empty) binary sequences, and bx and by are of same length | b | . We may have had to pad the representation of x by zeros to the left as needed for this. The action B unsets the rightmost set bit, so eventually y will under action of B arrive at a10 | b | , if it's not of that form to begin with. If bx has no set bits, then one more action of B on y will make it equal to x. If bx has set bits, then A, which turns a number of the form z01m0n into z10m + n will eventually bring x to a10 | b | , which is where we left y. By the argument from the beginning, this is the only place the two orbits intersect. QED

What's more, in C and its ilk, r(i) is coded as (-i)&i. (Actually, -i&i will parse just fine since unary minus has higher precedence in C than bitwise and.)

Here's a simple implementation in C++, add error checking as needed

#include <vector>

class fenwick_tree {
    std::vector<int> v_;

  public:
    fenwick_tree(int N) : v_(N+1) {}

    // neither add() nor get() will work 
    // for k outside of [1, N] range

    // add val to the k-th element
    void add(int k, int val) {
        for (int i=k, e=v_.size(); i<e; i += -i&i)
            v_[i] += val;
    }

    // get sum of elements 1 thru k
    int get(int k) const {
        int r=0;
        for (int i=k; i>0; i -= -i&i)
            r += v_[i];
        return r;
    }
};
Personal tools