Home > Software engineering >  How can I write a function like getc() to read one bit at a time?
How can I write a function like getc() to read one bit at a time?

Time:07-07

I have a function that reads each bit in an unsigned long long and stores the value in an array, and now I want to convert this function into one that returns the next bit in the number and leave handling the data to the caller.

Ideally, it would work similar to getc() where I can do something like:

int n;
while((n = getbit()) != NULL)
    ...

I know a static index variable is an option, but that isn't very elegant. Is there a more practical way to do this?

CodePudding user response:

Much like a FILE structure wraps up various state around input, such as buffering, error states etc, you can wrap up the required state for extracting bits with a simple struct:

// Struct used to manage reading bits from a file
struct bitstream
{
    FILE *fp;
    unsigned char ch;     // current char
    unsigned char mask;   // next bit mask
    unsigned char mask0;  // first bit mask
};

This maintains the FILE being read from, keeps track of the last character read from that file, and manages a bit mask that will select one bit at a time. Because different applications may require a specific bit order (LSB->MSB or MSB->LSB), one extra member mask0 is used for that. You'll see how this is used in a moment.

First, you'll want a function to set up this struct, ready for action. Let's have two functions, which explicitly choose the desired bit ordering:

// Initialize a bitstream where bit ordering is LSB -> MSB
void bitstream_init_lsb(struct bitstream* bs, FILE *fp)
{
    struct bitstream bs_init = { fp };
    bs_init.mask0 = 1u;
    *bs = bs_init;
}

// Initialize a bitstream where bit ordering is MSB -> LSB
void bitstream_init_msb(struct bitstream* bs, FILE *fp)
{
    struct bitstream bs_init = { fp };
    bs_init.mask0 = 1u << (CHAR_BIT - 1);
    *bs = bs_init;
}

Note the difference in mask0 in these two functions. For LSB-priority, mask0 is the lowest-order bit. For MSB-priority, mask0 is the highest-order bit. In both cases, ch and mask are zero-initialized. We can rely on a zero value for mask to indicate that all bits have been read out of ch, and that we must read the next byte from the file.

The way this will work is that every time you read a bit, you shift the mask in the appropriate direction (left or right, depending on selected ordering). In either case, the mask will become zero again when the bit is shifted beyond its storage capabilities. You then know that it's time to read the next byte.

Note that the above logic explicitly relies on mask being an unsigned char type. That's important.

// Get next bit (0 or 1) from bitstream, or EOF
int bitstream_get(struct bitstream *bs)
{
    // Get next character when next bit is zero
    if (bs->mask == 0) {
        int ch = fgetc(bs->fp);
        if (ch == EOF) {
            return EOF;
        }
        bs->ch = (unsigned char) ch;
        bs->mask = bs->mask0;
    }

    // Get next bit
    int result = (bs->ch & bs->mask) ? 1 : 0;
    if (bs->mask0 == 1u) {
        bs->mask <<= 1;
    } else {
        bs->mask >>= 1;
    }
    return result;
}

Thanks to chux - Reinstate Monica for their suggested improvements to my original answer. The above is a revision, incorporating those changes and also implementing bit ordering explicitly.

To use this, you would simply initialize with a valid FILE* and then do a read loop just the same as you would with fgetc:

int bit;
struct bitstream bs;

bitstream_init_msb(&bs, stdin);

while (EOF != (bit = bitstream_get(&bs)))
{
    // ...
}

Here is a live demo: https://godbolt.org/z/KdEMoTavh

CodePudding user response:

Like @paddy's good answer but with alternative (IMO better) EOF handling.

struct bitstream {
  FILE *fp;
  unsigned char ch;    // current char
  unsigned char mask;  // next bit mask
};

int bitstream_get(struct bitstream *bs) {
  // Get next character when next bit is zero
  if (bs->mask == 0) {
    int ch = fgetc(bs->fp);
    if (ch == EOF) {
      return EOF;
    }
    bs->ch = (unsigned char) ch;
    bs->mask = 1u << (CHAR_BIT - 1);
  }

  // Get next bit
  int result = (bs->ch & bs->mask) ? 1 : 0;
  bs->mask >>= 1;
  return result;
}
  •  Tags:  
  • c
  • Related