Getting to Know Binary Indexed Trees with LeetCode Examples
This post introduces Binary Indexed Trees (BITs), discusses their applications to LeetCode problems, and provides code samples in C#.
Binary Background
BITs rely on the binary (base 2) representation of numbers. Consider, for example, x = 22
. It can be represented as a sum of powers of 2: x = 1*16 + 0*8 + 1*4 + 1*2 + 0*1
. Thus its binary representation is 10110
. Each one and zero in the representation is called a "bit".
int x = 22;
Console.WriteLine(Convert.ToString(x, toBase: 2));
// Output: 10110
The representation of a negative number can be deduced by recalling that x + (-x) = 0
. The representation of -x
matches the rightmost 1
in the representation of x
and has the opposite for all bits to the left of it.
int x = 22;
Console.WriteLine(Convert.ToString(-x, toBase: 2));
// Output: 11111111111111111111111111101010
The operator &
represents the bitwise logical AND. Since the only matching non-zero bit in the representations of x
and -x
is the rightmost 1
, it can be accessed with x & -x
. This bit is called the least significant set bit. Let's define LSB(x) = x & -x
.
int x = 22;
Console.WriteLine(Convert.ToString(x & -x, toBase: 2));
// Output: 10
Consider intervals of the form (x - LSB(x), x]
. The range from 0
to x
can be decomposed into such intervals. For example, 22 = 16+4+2
, so the intervals are (0, 16], (16, 20], (20, 22]
. The intervals can be iterated by repeatedly subtracting the least significant set bit.
int x = 22;
for (int i = x; i > 0; i -= i & -i)
{
Console.WriteLine(Convert.ToString(i, toBase: 2));
}
// Output:
// 10110
// 10100
// 10000
Conversely, repeatedly adding the least significant set bit results in all intervals that cover the interval (x - LSB(x), x]
.
int x = 22;
for (int i = x; i < 100; i += i & -i)
{
Console.WriteLine($"{i - (i & -i)} to {i}");
}
// Output:
// 20 to 22
// 16 to 24
// 0 to 32
// 0 to 64
Binary Indexed Trees
Given a range 1, 2, ..., n
and weights w(i)
for each i = 1, ... , n
, the BIT is a data structures that supports two operations in O(log(n))
time:
- Updating the weight
w(i)
for anyi
- Finding the sum of weights from
i
toj
for anyi
andj
BIT works by maintaining pre-computed weights over all intervals of the form (x - LSB(x), x]
for x = 1, 2, ..., n
.
Consider, for example, x = 22
. The weight over the interval (0, 22]
can be computed by summing the weights over the intervals found by repeatedly subtracting the least significant set bit: (0, 16], (16, 20], (20, 22]
. Updating the weight w(22)
requires updating the weights over all intervals found by repeatedly adding the least significant set bit: (20, 22], (16, 24], (0, 32], (0, 64], ...
.
Classic Application
The LeetCode problem 307. Range Sum Query - Mutable is a classic application of BIT.s
Given an integer array
nums
, handle multiple queries of the following types:
- Update the value of an element in
nums
.- Calculate the sum of the elements of
nums
between indicesleft
andright
inclusive whereleft <= right
.Implement the
NumArray
class:
NumArray(int[] nums)
Initializes the object with the integer arraynums
.void update(int index, int val)
Updates the value ofnums[index]
to beval
.int sumRange(int left, int right)
Returns the sum of the elements ofnums
between indicesleft
andright
inclusive (i.e.nums[left] + nums[left + 1] + ... + nums[right]
).Constraints:
1 <= nums.length <= 3 * 10^4
-100 <= nums[i] <= 100
0 <= index < nums.length
-100 <= val <= 100
0 <= left <= right < nums.length
- At most
3 * 10^4
calls will be made toupdate
andsumRange
.
Below is a solution.
Note that the range of indexes of nums
is 0
to n
while BIT operates on 1-based ranges. To account for that, each call to tree
is first incremented by 1
.
Additionally, note that the Update
operation sets the value instead of increasing/decreasing it. This necessitates maintaining the nums
array.
public class NumArray {
private int[] nums;
private int[] tree;
public NumArray(int[] nums) {
int n = nums.Length;
this.nums = nums;
tree = new int[n+1];
for (int i = 1; i <= n; i++)
{
tree[i] += nums[i-1];
int parent = i + (i & -i);
if (parent <= n)
{
tree[parent] += tree[i];
}
}
}
public void Update(int index, int val) {
int increment = val - nums[index];
if (increment == 0)
{
return;
}
nums[index] = val;
for (int i = index+1; i < tree.Length; i += i & -i)
{
tree[i] += increment;
}
}
public int SumRange(int left, int right) {
return SumPrefix(right) - SumPrefix(left - 1);
}
private int SumPrefix(int x)
{
int result = 0;
for (int i = x+1; i > 0; i -= i & -i)
{
result += tree[i];
}
return result;
}
}
Values Counting Application
Recognising that a problem can be solved with BIT is not always as straightforward as with the classic problem above.
A common application of BIT is for counting the number of values in a specific range when values can be repeatedly added to the collection.
In this case, the weight w(x)
represents the number of times the value x
has appeared. Adding a new value to the collection involves incrementing its weight by 1
. Finding the count of values in a range can be achieved by summing the weights in that range.
The problem 1649. Create Sorted Array through Instructions demonstrates this concept.
Given an integer array
instructions
, you are asked to create a sorted array from the elements ininstructions
. You start with an empty containernums
. For each element from left to right ininstructions
, insert it intonums
. The cost of each insertion is the minimum of the following:
- The number of elements currently in
nums
that are strictly less thaninstructions[i]
.- The number of elements currently in
nums
that are strictly greater thaninstructions[i]
.For example, if inserting element
3
intonums = [1,2,3,5]
, the cost of insertion ismin(2, 1)
(elements1
and2
are less than3
, element5
is greater than3
) andnums
will become[1,2,3,3,5]
.Return the total cost to insert all elements from
instructions
intonums
. Since the answer may be large, return it modulo10^9 + 7
Constraints:
1 <= instructions.length <= 10^5
1 <= instructions[i] <= 10^5
Here is a solution. Note that the ranges of values, [1, 100000]
, makes it straightforward to use BIT. Also note that the BIT interface is simpler than in the classic problem: a number can be added with void Add(int x)
, and the count of values in the range [1, x]
can be found with int Count(int x)
.
public class Solution {
public int CreateSortedArray(int[] instructions) {
int result = 0;
int p = 1_000_000_007;
BIT tree = new(100_000);
for (int i = 1; i < instructions.Length; i++)
{
tree.Add(instructions[i-1]);
int lessCount = tree.Count(instructions[i]-1);
int greaterCount = i - tree.Count(instructions[i]);
result += Math.Min(lessCount, greaterCount);
result %= p;
}
return result;
}
public class BIT
{
private int[] tree;
public BIT(int n)
{
tree = new int[n+1];
}
public void Add(int x)
{
for (int i = x; i < tree.Length; i += i & -i)
{
tree[i]++;
}
}
public int Count(int x)
{
int result = 0;
for (int i = x; i > 0; i -= i & -i)
{
result += tree[i];
}
return result;
}
}
}
Normalisation
In the above problem, the counted values lie in the range [1, 100000]
, making it straightforward to apply BIT. In other cases, where values can be negative or very large, preprocessing is necessary. This section discusses these preprocessing techniques.
Offsetting
Assume the range is relatively small but spans both positive and negative numbers. An offset can be added to each number to shift the range. For example, adding 1001
shifts the range [-1000, 1000]
to [1, 2001]
, making it suitable for BIT.
Here are two problems that can be solved with BIT plus offsetting:
Relabelling
When using BIT for values counting, the difference between values does not matter, only their order (greater than, less than, or equal to) matters. This enables relabelling of values with labels in the range 1, ..., n
while maintaining the order.
For example, consider the list of values 8, -50, -1000000000, 2000000000, 5
. After relabelling, it becomes 4, 2, 1, 5, 3
.
Relabelling works best when the number of values (in this case, 5) is much smaller than the range of values (in this case, 3000000000). A limitation to be aware of is that C# does not support arrays longer than int.MaxValue
, which is approximately just over 2 * 10^9
. When exceeding this range, relabelling is necessary.
Here is an example implementation of relabelling:
public Dictionary<int, int> Normalise(int[] nums)
{
Dictionary<int, int> result = new();
int label = 1;
foreach (int num in nums.Order())
{
result[num] = label;
label++;
}
return result;
}
Note that this implementation does not modify the nums
array and is able to handle duplicates:
int[] nums = [1, 2, 1, -10, -10, 20];
Dictionary<int, int> labels = Normalise(nums);
for (int i = 0; i < nums.Length; i++)
{
Console.WriteLine($"{nums[i]} -> {labels[nums[i]]}");
}
// Output:
// 1-> 4
// 2-> 5
// 1-> 4
// - 10-> 2
// - 10-> 2
// 20-> 6
Here are two problems that can be solved with BIT plus relabelling.
Properties Tracking Application
Another application of BIT is tracking properties of numbers in a list. Consider the list x_1, x_2, ..., x_n
. BIT operates on the range [1, n]
, with the weight w(i)
being 1
if x_i
satisfies a certain property, and 0
otherwise. This allows for modifying the values in the list and counting the number of values in a range that satisfy the property.
The problem 3187. Peaks in Array is a straightforward application of this idea. The property in question is whether x_i
is greater than both x_{i-1}
and x_{i+1}
.
How does properties tracking differ from values counting? In this case, BIT operates on the indexes rather than the values, so normalisation is unnecessary. In a sense, properties tracking is what values counting becomes after normalisation.
Another example of a trackable property is whether the value is still present. This allows tracking the indexes of values in the list when values can be repeatedly deleted. Consider the list a, b, c, d, e
. At the start, all the values are present, so the weights are 1, 1, 1, 1, 1
. If b
and c
are deleted, the weights become 1, 0, 0, 1, 1
. The indexes of a
, d
, and e
in the original list are 1
, 4
, and 5
. After deletions, the new indexes can be found by summing up the weights to be 1
, 2
, and 3
.
Here is a problem that demonstrates the idea of tracking indexes after deletions: 2659. Make Array Empty
You are given an integer array
nums
containing distinct numbers, and you can perform the following operations until the array is empty:
- If the first element has the smallest value, remove it
- Otherwise, put the first element at the end of the array.
Return an integer denoting the number of operations it takes to make
nums
empty.Constraints:
1 <= nums.length <= 10^5
-10^9 <= nums[i] <= 10^9
- All values in
nums
are distinct.
Here is a solution. The idea is to construct the indexes
array such that indexes[i]
represents the index in nums
of the i
th smallest element in nums
. Then, the i
th smallest number will be deleted after tree.Count(indexes[i]) - tree.Count(indexes[i-1])
operations following the deletion of the i-1
th smallest number.
public class Solution {
public long CountOperationsToEmptyArray(int[] nums) {
int n = nums.Length;
int[] indexes = new int[n];
for (int i = 0; i < n; i++)
{
indexes[i] = i+1;
}
Array.Sort(nums, indexes);
BIT tree = new(n);
long result = indexes[0];
tree.Remove(indexes[0]);
for (int i = 1; i < n; i++)
{
if (indexes[i] > indexes[i-1])
{
result += tree.Count(indexes[i]) - tree.Count(indexes[i-1]);
}
else
{
result += (n - i) - tree.Count(indexes[i-1]) + tree.Count(indexes[i]);
}
tree.Remove(indexes[i]);
}
return result;
}
public class BIT
{
private int[] tree;
public BIT(int n)
{
tree = new int[n+1];
for (int i = 1; i <= n; i++)
{
tree[i]++;
int parent = i + (i & -i);
if (parent <= n)
{
tree[parent] += tree[i];
}
}
}
public void Remove(int x)
{
for (int i = x; i < tree.Length; i += i & -i)
{
tree[i]--;
}
}
public int Count(int x)
{
int result = 0;
for (int i = x; i > 0; i -= i & -i)
{
result += tree[i];
}
return result;
}
}
}
Here is another problem that can be solved by tracking indexes after deletions:
Conclusion
This post has covered several applications of Binary Indexed Trees, including a classic problem, values counting, and properties tracking. Two normalisation techniques - offsetting and relabelling - have also been mentioned. This post by no means covers all the capabilities of BITs but highlights a few commonalities observed in BIT-solvable LeetCode problems.
Here is the full list of mentioned problems for further practice:
- 307. Range Sum Query - Mutable
- 1649. Create Sorted Array through Instructions
- 315. Count of Smaller Numbers After Self
- 2426. Number of Pairs Satisfying Inequality
- 3072. Distribute Elements Into Two Arrays II
- 493. Reverse Pairs
- 3187. Peaks in Array
- 2659. Make Array Empty
- 1505. Minimum Possible Integer After at Most K Adjacent Swaps On Digits