This problem is a follow-up of the post titled: Generate the next integer with the same number of bits. If we list all n bit integers which have exactly k bits set to 1, what will be the rank of a given number in that list ? For example, if n=4, k=2, then there are 6 (n choose 2) such numbers, and after listing them by increasing order, the ranking is as follows.
The question is given the bit pattern, as an array of booleans, what is the efficient way to directly compute the rank ? This problem can be handled by examining one element at a time. To formulate the solution, let the notation C(n,k) indicates n choose k.
Examine the most significant bit. If it is zero that bit can be dropped and the problem reduces to a sub-problem of size (n-1,k), and its solution to the original problem is the same as the solution to this sub-problem. If the most significant bit is one, then the problem reduces to a sub-problem of size (n-1,k-1), but in this case we have to add C(n-1,k) to the solution of the subproblem to get the solution to the original problem.
In general the C(n,k) combinations consists of C(n-1,k) patterns which start with zero, and C(n-1,k-1) patterns which start with ones. Recall the identity :
C(n,k) = C(n-1,k) + C(n-1,k-1)
This translates to a recursive algorithm, and since the recursion is only tail-recursion it is readily converted into an iterative algorithm, as follows :
0011 => 0 0101 => 1 0110 => 2 1001 => 3 1010 => 4 1100 => 5
The question is given the bit pattern, as an array of booleans, what is the efficient way to directly compute the rank ? This problem can be handled by examining one element at a time. To formulate the solution, let the notation C(n,k) indicates n choose k.
Examine the most significant bit. If it is zero that bit can be dropped and the problem reduces to a sub-problem of size (n-1,k), and its solution to the original problem is the same as the solution to this sub-problem. If the most significant bit is one, then the problem reduces to a sub-problem of size (n-1,k-1), but in this case we have to add C(n-1,k) to the solution of the subproblem to get the solution to the original problem.
In general the C(n,k) combinations consists of C(n-1,k) patterns which start with zero, and C(n-1,k-1) patterns which start with ones. Recall the identity :
C(n,k) = C(n-1,k) + C(n-1,k-1)
This translates to a recursive algorithm, and since the recursion is only tail-recursion it is readily converted into an iterative algorithm, as follows :
1 2 3 4 5 6 7 8 9 10 | long GetSequenceId(vector<bool>& array, int k) { long rank = 0; for (int i = array.size()-1; i >= 0; --i) { if (array[i] == true) { rank += Combination(i,k); --k; } } return rank; } |
The combination can be pre-computed as a table, using Pascal's triangle. However this algorithm can be made even faster by reversing the direction of the loop, and computing the combinations on-the-fly. Here goes the C++ code.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 | uint64_t GetSequenceId(vector<bool>& array) { uint64_t rank = 0, nck = 1; uint32_t numones = 0; for (int i = 0; i < array.size(); ++i) { if (array[i]) { ++numones; nck = (numones == i) ? 1 : (nck * i / numones); rank += nck; } else { nck = (numones >= i) ? 1 : nck * i / (i - numones); } } return rank; } |