Searching for bits in C

Some years ago, a friend asked me how to (efficiently) get the position of some bit with value 1 inside an (unsigned) integer. For example, taking 32 bit integers and taking the least significant bit (LSB) as the bit 0, these are valid answers:

0xa9e7da24 --> 5
0x1d56b8b0 --> 28
0x9459ffbb --> 0
0x9f0c2a38 --> 24

If we reduce the problem to a more specific one, finding the lowest bit with value 1 inside an integer, it’s easy to get a solution just by doing a bitwise AND against a moving mask:

size_t bitscan(uint32_t a)
{
  size_t i;
  uint32_t mask = 1;
  for (i = 0; i < 32 && !(a & mask); i++)
    mask <<= 1;
  return i;
}

but it’s relatively slow. Can we do it faster?

The answer is clearly yes if we are using x86 assembly, as this instruction set has a specific instruction for this purpose: Bit Scan Forward (BSF). But, even in C, we can do better by a three step process:

  • First we isolate the least significant bit that is set by using: a & (-a).
  • Then we take the residue of dividing this power of two by 37.
  • Finally, we look up the associated bit position, using the residue obtained in the previous step as an index.

Why the first step works? We can start with the zero: doing a bitwise AND against -0 gives us just zero. As the zero doesn’t have any 1 bits to start with, we can say we’ve “isolated” the least significant one, though in a somewhat vacuous sense.

To check the nonzero integers, let’s define the bit expansion of an unsigned integer a:

\displaystyle a = \sum_{i=0}^N a_i\,2^i

Then the result of doing –a (remember we are dealing with unsigned integers) will be:

\displaystyle -a = \sum_{i=0}^N (1-a_i)\,2^i + 1

If the number is nonzero, there must a least significant nonzero bit. Let’s call its position M. Writing a & (-a) now:

\displaystyle a\,\&\, (-a) = \left(\sum_{i=0}^N a_i\,2^i\right)\&\left(\sum_{i=0}^N (1-a_i)\,2^i + 1\right)

\displaystyle = \left(\sum_{i=M+1}^N a_i\,2^i + 1\cdot2^M + \sum_{i=0}^{M-1}0\cdot2^i\right)\&

\displaystyle \left(\sum_{i=M+1}^N (1-a_i)\,2^i + (1-1)\cdot2^M + \sum_{i=0}^{M-1}(1-0)\cdot2^i + 1\right) (definition of M)

\displaystyle = \left(\sum_{i=M+1}^N a_i\,2^i + 2^M\right)\&\left(\sum_{i=M+1}^N (1-a_i)\,2^i + \sum_{i=0}^{M-1}2^i + 1\right)

\displaystyle = \left(\sum_{i=M+1}^N a_i\,2^i + 2^M\right)\&\left(\sum_{i=M+1}^N (1-a_i)\,2^i + 2^M\right) (geometric series sum)

\displaystyle = \sum_{i=M+1}^N a_i(1-a_i)\,2^i + 2^M (AND bit by bit)

\displaystyle = \sum_{i=M+1}^N 0\cdot 2^i + 2^M = 2^M

we get the isolated Mth bit, as desired.

We can look for the smallest divisor giving different residues for 20, 21, …, 231 via a small piece of code:

def get_modulo(n):
    m = 1
    while True:
        if len(set(2**i % m for i in range(n))) == n:
            return m
        m += 1

if __name__ == '__main__':
    print(get_modulo(32))

Now we just have to put the bit numbers in their associated positions inside the lookup table. We can calculate the table using the following Python code:

def get_lut(max_exp, modulo):
    lut = [-1] * modulo
    for i in range(max_exp + 1):
        assert lut[2 ** i % modulo] == -1
        lut[2 ** i % modulo] = i
    return lut

def print_table(table):
    print('{%s}' % ', '.join('%d' % t for t in table))

if __name__ == '__main__':
    print_table(get_lut(32, 37))

Then, putting these two pieces together, we get the following C function:

int lut_bit_pos(uint32_t a)
{
  const int bit_pos_lut[37] =
  {
    -1, 0, 1, 26, 2, 23, 27, 32, 3, 16,
    24, 30, 28, 11, -1, 13, 4, 7, 17,
    -1, 25, 22, 31, 15, 29, 10, 12, 6, 
    -1, 21, 14, 9, 5, 20, 8, 19, 18
  };
  return bit_pos_lut[(a & (-a)) % 37];
}

As we are working with 32 bit integers, we can do a full test… though not using Ideone (it exceeds the runtime limits).

Edited 5/30/11 – Bit sets

The original problem was to handle a set of task slots, each slot having two possible states: occupied or empty. This set required an efficient support of the following operations:

  • Checking if a slot is occupied.
  • Occupying/freeing a slot.
  • Getting a free slot if one is available.

As this problem doesn’t ask us for the slot number, we can use an opaque type and internally codify the slot as its associated power of two:

typedef uint32_t set_t;
typedef int32_t set_aux_t;
typedef uint32_t slot_t;

#define NUM_SLOTS 32

#define EMPTY_SET ((set_t)-1)
#define FULL_SET ((set_t)0)

#define IS_VALID_SLOT(slot) ((slot)&&!((slot)&((slot)-1)))
#define SLOT_FROM_INDEX(slot_idx) (1<<(slot_idx))

#define GET_EMPTY_SLOT(set) ((set)&(-(set_aux_t)(set)))

#define SET_SLOT(set, slot) ((void)((set)&=~(slot)))
#define CLEAR_SLOT(set, slot) ((void)((set)|=(slot)))

#define IS_SLOT_SET(set, slot) (!((set)&(slot)))

We can check the implementation using Ideone.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s