Getting to Know Binary Indexed Trees with LeetCode Examples

Getting to Know Binary Indexed Trees with LeetCode Examples

This post introduces Binary Indexed Trees (BITs), discusses their applications to LeetCode problems, and provides code samples in C#.

Binary Background

BITs rely on the binary (base 2) representation of numbers. Consider, for example, x = 22. It can be represented as a sum of powers of 2: x = 1*16 + 0*8 + 1*4 + 1*2 + 0*1. Thus its binary representation is 10110. Each one and zero in the representation is called a "bit".

int x = 22;
Console.WriteLine(Convert.ToString(x, toBase: 2));
// Output: 10110

The representation of a negative number can be deduced by recalling that x + (-x) = 0. The representation of -x matches the rightmost 1 in the representation of x and has the opposite for all bits to the left of it.

int x = 22;
Console.WriteLine(Convert.ToString(-x, toBase: 2));
// Output: 11111111111111111111111111101010

The operator & represents the bitwise logical AND. Since the only matching non-zero bit in the representations of x and -x is the rightmost 1, it can be accessed with x & -x. This bit is called the least significant set bit. Let's define LSB(x) = x & -x.

int x = 22;
Console.WriteLine(Convert.ToString(x & -x, toBase: 2));
// Output: 10

Consider intervals of the form (x - LSB(x), x]. The range from 0 to x can be decomposed into such intervals. For example, 22 = 16+4+2, so the intervals are (0, 16], (16, 20], (20, 22]. The intervals can be iterated by repeatedly subtracting the least significant set bit.

int x = 22;
for (int i = x; i > 0; i -= i & -i)
{
    Console.WriteLine(Convert.ToString(i, toBase: 2));
}
// Output:
// 10110
// 10100
// 10000

Conversely, repeatedly adding the least significant set bit results in all intervals that cover the interval (x - LSB(x), x].

int x = 22;
for (int i = x; i < 100; i += i & -i)
{
    Console.WriteLine($"{i - (i & -i)} to {i}");
}
// Output:
// 20 to 22
// 16 to 24
// 0 to 32
// 0 to 64

Binary Indexed Trees

Given a range 1, 2, ..., n and weights w(i) for each i = 1, ... , n, the BIT is a data structures that supports two operations in O(log(n)) time:

  1. Updating the weight w(i) for any i
  2. Finding the sum of weights from i to j for any i and j

BIT works by maintaining pre-computed weights over all intervals of the form (x - LSB(x), x] for x = 1, 2, ..., n.

Consider, for example, x = 22. The weight over the interval (0, 22] can be computed by summing the weights over the intervals found by repeatedly subtracting the least significant set bit: (0, 16], (16, 20], (20, 22]. Updating the weight w(22) requires updating the weights over all intervals found by repeatedly adding the least significant set bit: (20, 22], (16, 24], (0, 32], (0, 64], ....

Classic Application

The LeetCode problem 307. Range Sum Query - Mutable is a classic application of BIT.s

Given an integer array nums, handle multiple queries of the following types:

  1. Update the value of an element in nums.
  2. Calculate the sum of the elements of nums between indices left and right inclusive where left <= right.

Implement the NumArray class:

  • NumArray(int[] nums) Initializes the object with the integer array nums.
  • void update(int index, int val) Updates the value of nums[index] to be val.
  • int sumRange(int left, int right) Returns the sum of the elements of nums between indices left and right inclusive (i.e. nums[left] + nums[left + 1] + ... + nums[right]).

Constraints:

  • 1 <= nums.length <= 3 * 10^4
  • -100 <= nums[i] <= 100
  • 0 <= index < nums.length
  • -100 <= val <= 100
  • 0 <= left <= right < nums.length
  • At most 3 * 10^4 calls will be made to update and sumRange.

Below is a solution.

Note that the range of indexes of nums is 0 to n while BIT operates on 1-based ranges. To account for that, each call to tree is first incremented by 1.

Additionally, note that the Update operation sets the value instead of increasing/decreasing it. This necessitates maintaining the nums array.

public class NumArray {
    private int[] nums;
    private int[] tree;

    public NumArray(int[] nums) {
        int n = nums.Length;
        this.nums = nums;
        tree = new int[n+1];
        for (int i = 1; i <= n; i++)
        {
            tree[i] += nums[i-1];
            int parent = i + (i & -i);
            if (parent <= n)
            {
                tree[parent] += tree[i];
            }
        }
    }
    
    public void Update(int index, int val) {
        int increment = val - nums[index];
        if (increment == 0)
        {
            return;
        }
        nums[index] = val;
        for (int i = index+1; i < tree.Length; i += i & -i)
        {
            tree[i] += increment;
        }
    }
    
    public int SumRange(int left, int right) {
        return SumPrefix(right) - SumPrefix(left - 1);
    }

    private int SumPrefix(int x)
    {
        int result = 0;
        for (int i = x+1; i > 0; i -= i & -i)
        {
            result += tree[i];
        }
        return result;
    }
}

Values Counting Application

Recognising that a problem can be solved with BIT is not always as straightforward as with the classic problem above.

A common application of BIT is for counting the number of values in a specific range when values can be repeatedly added to the collection.

In this case, the weight w(x) represents the number of times the value x has appeared. Adding a new value to the collection involves incrementing its weight by 1. Finding the count of values in a range can be achieved by summing the weights in that range.

The problem 1649. Create Sorted Array through Instructions demonstrates this concept.

Given an integer array instructions, you are asked to create a sorted array from the elements in instructions. You start with an empty container nums. For each element from left to right in instructions, insert it into nums. The cost of each insertion is the minimum of the following:

  • The number of elements currently in nums that are strictly less than instructions[i].
  • The number of elements currently in nums that are strictly greater than instructions[i].

For example, if inserting element 3 into nums = [1,2,3,5], the cost of insertion is min(2, 1)(elements 1 and 2 are less than 3, element 5 is greater than 3) and nums will become [1,2,3,3,5].

Return the total cost to insert all elements from instructions into nums. Since the answer may be large, return it modulo 10^9 + 7

Constraints:

  • 1 <= instructions.length <= 10^5
  • 1 <= instructions[i] <= 10^5

Here is a solution. Note that the ranges of values, [1, 100000], makes it straightforward to use BIT. Also note that the BIT interface is simpler than in the classic problem: a number can be added with void Add(int x), and the count of values in the range [1, x] can be found with int Count(int x).

public class Solution {
    public int CreateSortedArray(int[] instructions) {
        int result = 0;
        int p = 1_000_000_007;
        BIT tree = new(100_000);
        for (int i = 1; i < instructions.Length; i++)
        {
            tree.Add(instructions[i-1]);
            int lessCount = tree.Count(instructions[i]-1);
            int greaterCount = i - tree.Count(instructions[i]);
            result += Math.Min(lessCount, greaterCount);
            result %= p;
        }
        return result;
    }

    public class BIT
    {
        private int[] tree;
        public BIT(int n)
        {
            tree = new int[n+1];
        }
        public void Add(int x)
        {
            for (int i = x; i < tree.Length; i += i & -i)
            {
                tree[i]++;
            }
        }
        public int Count(int x)
        {
            int result = 0;
            for (int i = x; i > 0; i -= i & -i)
            {
                result += tree[i];
            }
            return result;
        }
    }
}

Normalisation

In the above problem, the counted values lie in the range [1, 100000], making it straightforward to apply BIT. In other cases, where values can be negative or very large, preprocessing is necessary. This section discusses these preprocessing techniques.

Offsetting

Assume the range is relatively small but spans both positive and negative numbers. An offset can be added to each number to shift the range. For example, adding 1001 shifts the range [-1000, 1000] to [1, 2001], making it suitable for BIT.

Here are two problems that can be solved with BIT plus offsetting:

Relabelling

When using BIT for values counting, the difference between values does not matter, only their order (greater than, less than, or equal to) matters. This enables relabelling of values with labels in the range 1, ..., n while maintaining the order.

For example, consider the list of values 8, -50, -1000000000, 2000000000, 5. After relabelling, it becomes 4, 2, 1, 5, 3.

Relabelling works best when the number of values (in this case, 5) is much smaller than the range of values (in this case, 3000000000). A limitation to be aware of is that C# does not support arrays longer than int.MaxValue, which is approximately just over 2 * 10^9. When exceeding this range, relabelling is necessary.

Here is an example implementation of relabelling:

public Dictionary<int, int> Normalise(int[] nums)
{
	Dictionary<int, int> result = new();
	int label = 1;
	foreach (int num in nums.Order())
	{
		result[num] = label;
		label++;
	}
	return result;
}

Note that this implementation does not modify the nums array and is able to handle duplicates:

int[] nums = [1, 2, 1, -10, -10, 20];
Dictionary<int, int> labels = Normalise(nums);
for (int i = 0; i < nums.Length; i++)
{
    Console.WriteLine($"{nums[i]} -> {labels[nums[i]]}");
}
// Output:
// 1-> 4
// 2-> 5
// 1-> 4
// - 10-> 2
// - 10-> 2
// 20-> 6

Here are two problems that can be solved with BIT plus relabelling.

Properties Tracking Application

Another application of BIT is tracking properties of numbers in a list. Consider the list x_1, x_2, ..., x_n. BIT operates on the range [1, n], with the weight w(i) being 1 if x_i satisfies a certain property, and 0 otherwise. This allows for modifying the values in the list and counting the number of values in a range that satisfy the property.

The problem 3187. Peaks in Array is a straightforward application of this idea. The property in question is whether x_i is greater than both x_{i-1} and x_{i+1}.

How does properties tracking differ from values counting? In this case, BIT operates on the indexes rather than the values, so normalisation is unnecessary. In a sense, properties tracking is what values counting becomes after normalisation.

Another example of a trackable property is whether the value is still present. This allows tracking the indexes of values in the list when values can be repeatedly deleted. Consider the list a, b, c, d, e. At the start, all the values are present, so the weights are 1, 1, 1, 1, 1. If b and c are deleted, the weights become 1, 0, 0, 1, 1. The indexes of a, d, and e in the original list are 1, 4, and 5. After deletions, the new indexes can be found by summing up the weights to be 1, 2, and 3.

Here is a problem that demonstrates the idea of tracking indexes after deletions: 2659. Make Array Empty

You are given an integer array nums containing distinct numbers, and you can perform the following operations until the array is empty:

  • If the first element has the smallest value, remove it
  • Otherwise, put the first element at the end of the array.

Return an integer denoting the number of operations it takes to make nums empty.

Constraints:

  • 1 <= nums.length <= 10^5
  • -10^9 <= nums[i] <= 10^9
  • All values in nums are distinct.

Here is a solution. The idea is to construct the indexes array such that indexes[i] represents the index in nums of the ith smallest element in nums. Then, the ith smallest number will be deleted after tree.Count(indexes[i]) - tree.Count(indexes[i-1]) operations following the deletion of the i-1th smallest number.

public class Solution {
    public long CountOperationsToEmptyArray(int[] nums) {
        int n = nums.Length;
        int[] indexes = new int[n];
        for (int i = 0; i < n; i++)
        {
            indexes[i] = i+1;
        }
        Array.Sort(nums, indexes);
        BIT tree = new(n);
        long result = indexes[0];
        tree.Remove(indexes[0]);
        for (int i = 1; i < n; i++)
        {
            if (indexes[i] > indexes[i-1])
            {
                result += tree.Count(indexes[i]) - tree.Count(indexes[i-1]);
            }
            else
            {
                result += (n - i) - tree.Count(indexes[i-1]) + tree.Count(indexes[i]);
            }
            tree.Remove(indexes[i]);
        }
        return result;
    }

    public class BIT
    {
        private int[] tree;
        public BIT(int n)
        {
            tree = new int[n+1];
            for (int i = 1; i <= n; i++)
            {
                tree[i]++;
                int parent = i + (i & -i);
                if (parent <= n)
                {
                    tree[parent] += tree[i];
                }
            }
        }
        public void Remove(int x)
        {
            for (int i = x; i < tree.Length; i += i & -i)
            {
                tree[i]--;
            }
        }
        public int Count(int x)
        {
            int result = 0;
            for (int i = x; i > 0; i -= i & -i)
            {
                result += tree[i];
            }
            return result;
        }
    }
}

Here is another problem that can be solved by tracking indexes after deletions:

Conclusion

This post has covered several applications of Binary Indexed Trees, including a classic problem, values counting, and properties tracking. Two normalisation techniques - offsetting and relabelling - have also been mentioned. This post by no means covers all the capabilities of BITs but highlights a few commonalities observed in BIT-solvable LeetCode problems.

Here is the full list of mentioned problems for further practice:

Read more