Home > Blockchain >  Extracting particular bits and packing them into a payload
Extracting particular bits and packing them into a payload

Time:09-26

I need to write a function that uses interface functions from other components to get individual values for year, month, day, hour, minute, second, and then pack those values into a 5-byte message payload and provide it to another component by using an unsigned char pointer as function parameter. The payload structure is strictly defined and must look like this:

|        | bit 7 | bit 6 | bit 5 | bit 4 | bit 3 | bit 2 | bit 1 | bit 0 |
| -------------- | -------------- | -------------- | --------------| --------------| --------------| --------------|-------------- |-------------- |
| byte 0 |      0 |      0 |      0 |      0 |      0 |       0 |     0 | year  |
| byte 1 | year   | year   | year   | year   | year   | year    | month | month |
| byte 2 | month  | month  | day    | day    | day    | day     | day   | hour  |
| byte 3 | hour   | hour   | hour   | hour   | minute | minute | minute | minute |
| byte 4 | minute | minute | second | second | second | second | second | second |

My current approach is:

void prepareDateAndTimePayload(unsigned char * message)
{
    unsigned char payload[5] = {0};

    unsigned char year = getYear();
    unsigned char month = getMonth();
    unsigned char day = getDay();
    unsigned char hour = getHour();
    unsigned char minute = getMinute();
    unsigned char second = getSecond();

    payload[0] = (year & 0x40) >> 6u; //get 7th bit of year and shift it
    payload[1] = ((year & 0x3F) << 2u) | ((month & 0xC) >> 2u); //remaining 6 bits of year and starting with month 
    payload[2] = ((month & 0x3) << 6u) | ((day & 0x1F) << 1u) | ((hour & 0x10) >> 4u); //...
    payload[3] = ((hour & 0xF) << 4u) | ((minute & 0x3C) >> 2u); 
    payload[4] = ((minute & 0x3) << 6u) | (second & 0x3F); //...

    memcpy(message, payload, sizeof(payload));
}

I'm wondering how I should approach extracting the particular bits and packing them into a payload, so they match the required message structure. I find my version with bit masks and bit shifting to be messy and not elegant. Is there any better way to do it?

CodePudding user response:

If you want to write a portable program, then your current implementation is pretty much the way you have to do it. You can make it a bit easier on the eyes by defining some macros to do handle the shifting and masking for you, but that's about it.

Using bitfields, you can offload most of that work to the compiler. Beware though that the compiler is free to implement bitfields in pretty much any way it wants. The resulting memory layout is not always portable between compilers - or even between ISAs using the same compiler.

As an example from the Linux kernel, here you can see how care must be taken to make sure that the CPU's endianness is taken into consideration, for example:

struct iphdr {
#if defined(__LITTLE_ENDIAN_BITFIELD)
    __u8    ihl:4,
        version:4;
#elif defined (__BIG_ENDIAN_BITFIELD)
    __u8    version:4,
        ihl:4;
#else
#error  "Please fix <asm/byteorder.h>"
#endif
    __u8    tos;
    __be16  tot_len;
    __be16  id;
    __be16  frag_off;
    __u8    ttl;
    __u8    protocol;
    __sum16 check;
    __be32  saddr;
    __be32  daddr;
    /*The options start here. */
};

CodePudding user response:

Look at your code with its various magic numbers. Now look at this code below. The compiler will optimise to use registers, so the extra clarity is for the human readers able to see and check that all makes sense.

void prepareDateAndTimePayload(unsigned char msg[5])
{
    unsigned char yr = 0x7F & getYear() -  bias; // 7 bits
    unsigned char mn = 0x0F & getMonth(); // 4 bits
    unsigned char dy = 0x1F & getDay(); // 5 bits
    unsigned char hr = 0x1F & getHour(); // 5 bits
    unsigned char mi = 0x3F & getMinute(); // 6 bits
    unsigned char sc = 0x3F & getSecond(); // 6 bits

//     [4]    mmss ssss
    msg[4]  = sc;      //    6/6 bit sc (0-59)
    msg[4] |= mi << 6; // lo 2/6 bit mi (0-59)

//     [3]    hhhh mmmm
    msg[3]  = mi >> 2; // hi 4/6 bit mi (0-59)
    msg[3] |= hr << 4; // lo 4/5 bit hr (0-23)

//     [2]    MMDD DDDh
    msg[2]  = hr >> 4; // hi 1/5 bit hr (0-23)
    msg[2] |= dy << 1; //    5/5 bit dy (0-31)
    msg[2] |= mn << 6; // lo 2/4 bit mn (1-12)

//     [1]    YYYY YYMM
    msg[1]  = mn >> 2; // hi 2/4 bit mn (1-12)
    msg[1] |= yr << 2; // lo 6/7 bit yr (0-127)

//     [0]    0000 000Y
    msg[0]  = yr >> 6; // hi 1/7 bit yr (0-127)
}

The OP refers to one side of a send/receive operation. This proposal is based on the idea that both sides of that translation are amenable to revision. It does not address the OP directly, but provides an alternative if both sides are still under development. This requires only single byte data (suitable for narrow processors.)

Observation: Oddly packing 6 values into 5 bytes with error prone jiggery-pokery. Hallmark of a badly thought-out design.

Here is a reasonable (and cleaner) alternative.

|        |  b7  |  b6  |  b5  |  b4  |  b3  |  b2  |  b1  |  b0  |
| ------ | ---- | ---- | ---- | ---- | ---- | ---- | ---- | ---- | yr max 127 from bias
| byte 0 |  yr3 |  yr2 |  yr1 |  yr0 |  mo  |  mo  |  mo  |  mo  | max 12
| byte 1 |  yr6 |  yr5 |  yr4 |  dy  |  dy  |  dy  |  dy  |  dy  | max 31
| byte 2 |      |      |      |  hr  |  hr  |  hr  |  hr  |  hr  | max 23
| byte 3 |      |      |  mi  |  mi  |  mi  |  mi  |  mi  |  mi  | max 59
| byte 4 |      |      |  sc  |  sc  |  sc  |  sc  |  sc  |  sc  | max 59

And, in code:

typedef union {
    uint8_t yr; // 7 bits valid. bias   [0-127]
    uint8_t mo; // 1-12. 4 bits
    uint8_t dy; // 1-31. 5 bits
    uint8_t hr; // 0-23. 5 bits
    uint8_t mn; // 0-59. 6 bits
    uint8_t sc; // 0-59. 6 bits
} ymdhms_t;

void pack(unsigned char pyld[5], ymdhms_t *dt)
{
    // year biased into range 0-127 (eg 2022 - 1980 = 42 )
    dt.yr -= bias;
    pyld[0] = dt->mo | (dt->yr & 0x0F) << 4; // mask unnecessary
    pyld[1] = dt->dy | (dt->yr & 0x70) << 1;
    pyld[2] = dt->hr;
    pyld[3] = dt->mn;
    pyld[4] = dt->sc;
}

void unpack(unsigned char pyld[5], ymdhms_t *dt)
{
    dt->mo = pyld[0] & 0x0F;
    dt->dy = pyld[1] & 0x1F;
    dt->hr = pyld[2];
    dt->mn = pyld[3];
    dt->sc = pyld[4];

    dt->yr =  bias   pyld[0] >> 4    pyld[1] >> 1;
}

One might even ask if "shrinking 6 bytes to 5" is worth the effort when the useful bit density only rises from 69% to 82%.

CodePudding user response:

Whatever is chosen, code for 1) portable correctness, 2) clarity and 3) ease of maintenance.

Avoid premature optimization.


OP's code looks like a reasonable way to solve the issue - especially for narrow processors, yet it is challenging to review.

Perhaps use a wider type for the in-between part and let the compiler emit efficient code:

uint64_t datetime = 
    ((uint64_t) getYear()   << 26) | 
    ((uint32_t) getMonth()  << 22) |
    ((uint32_t) getDay()    << 17) |
    ((uint32_t) getHour()   << 12) |
    (           getMinute() <<  6) |
    (           getSecond() <<  0);

payload[0] = datetime >> 32;
payload[1] = datetime >> 24;
payload[2] = datetime >> 16;
payload[3] = datetime >> 8;
payload[4] = datetime >> 0;

or maybe end with

datetime = htob64(datetime);
memcpy(message, &datetime, 5);

Could add mask in like getMonth() --> (getMonth() && 0x0Flu) if concerned about out of range get function values.

  • Related