I am using C++17.
Let's imagine I have two variables a and b. These variables are of type uint8_t. I would like to be able to access them as uint8_t but also as uint16_t.
For example:
#include <cstdint>
int main()
{
uint8_t a = 0xFF;
uint8_t b = 0x00;
uint16_t ab; // Should be 0xFF00
}
I thought that using an array would be a good solution, as the two variables should be next to each other in memory. So I did this:
#include <cstdint>
#include <iostream>
int main()
{
uint8_t data[] = {0xFF, 0x00};
uint8_t * a = data;
uint8_t * b = data + sizeof(uint8_t);
uint16_t * ab = reinterpret_cast<uint16_t*>(data);
std::cout << std::hex << (int) *a << "\n";
std::cout << std::hex << (int) *b << "\n";
std::cout << std::hex << (int) *ab << "\n";
}
Output:
ff
0
ff
But I would expect:
ff
0
ff00
Can you explain what I am doing wrong here? Any red flags or better ways of doing this?
Thanks!
CodePudding user response:
There are a few ways to convert between two 8-bit values and one 16-bit value.
But be aware that the result of every solution that directly addresses a single byte within the 16-bit value depends on the byte order of the machine executing it. Intel, for example, uses 'little endian', where the least significant byte is stored first. Other machines may use 'big endian' and store the most significant byte first.
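For illustration, here is a minimal sketch (my addition, not part of the original answer) that makes the dependence visible: it copies the raw bytes {0xFF, 0x00} into a uint16_t with std::memcpy (which, unlike a pointer cast, does not break aliasing rules) and prints a value that differs by byte order.
#include <cstdint>
#include <cstring>
#include <iostream>
int main()
{
    const uint8_t bytes[] = {0xFF, 0x00};
    uint16_t value;
    std::memcpy(&value, bytes, sizeof(value)); // copy both bytes into the 16-bit value
    // Prints ff on a little-endian machine (first byte is the low byte)
    // and ff00 on a big-endian machine (first byte is the high byte).
    std::cout << std::hex << value << "\n";
}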
Use bit-shift and OR to calculate the 16-bit value
const uint8_t a = 0xff;
const uint8_t b = 0x00;
const uint16_t ab = a | (b << 8); // works because b is promoted to int before being shifted
Use bit-shift and AND to calculate the 8-bit values
const uint16_t ab = 0xff;
const uint8_t a = ab & 0xff;
const uint8_t b = ab >> 8;
Directly address the bytes of the word
uint16_t ab;
auto& a = reinterpret_cast<uint8_t*>(&ab)[0];
auto& b = reinterpret_cast<uint8_t*>(&ab)[1];
Using a union
Reading from a union member other than the one most recently written (type punning) is explicitly not allowed by the C++ standard, but it is done everywhere and major compilers support it.
Declare the following union:
union conv
{
struct {
uint8_t a, b;
};
uint16_t ab;
};
You can now use it to combine two 8-bit values into a single 16-bit value:
conv c;
c.a = 0xFF;
c.b = 0x00;
std::cout << c.ab << std::endl;
On Intel machines this will output 255 (0xff) because Intel uses "little endian", where the least significant byte is stored first. So a is the low byte of ab and b is the high byte.
If you redefine the union as
union conv
{
struct {
uint8_t b, a;
};
uint16_t ab;
};
The example above would output 65280 (0xff00) on Intel machines because now b represents the least significant 8 bits of ab and a represents the most significant.
Combining unions and bitfields, you can also access each single bit of the 16-bit value:
union bitconv
{
struct {
uint16_t
b0 : 1, b1 : 1, b2 : 1, b3 : 1, b4 : 1, b5 : 1, b6 : 1, b7 : 1,
b8 : 1, b9 : 1, b10 : 1, b11 : 1, b12 : 1, b13 : 1, b14 : 1, b15 : 1;
};
uint16_t word;
};
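As a minimal usage sketch (my addition, not from the answer), assuming the bitconv union above: zero the union, set a couple of individual bits, and read the whole word back. Two caveats apply on top of the union-punning one: anonymous structs are a compiler extension, and the allocation order of bit-fields is implementation-defined, so which bit of word each member maps to can differ between compilers.
// bitconv as defined above
#include <cstdint>
#include <iostream>
int main()
{
    bitconv bc{};  // value-initializes all bit-fields to 0
    bc.b0 = 1;     // set bits 0 and 8
    bc.b8 = 1;
    // With the common allocation order this prints 101 (i.e. 0x0101).
    std::cout << std::hex << bc.word << "\n";
}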
CodePudding user response:
This is because of the byte order (endianness) of your machine.
This code determines the byte order in memory:
uint16_t x = 0x0001;
std::cout << (*((uint8_t*)&x) ? "little" : "big") << "-endian\n";
Try swapping the numbers:
#include <cstdint>
#include <iostream>
int main()
{
uint16_t x = 0x0001;
std::cout << (*((uint8_t*)&x) ? "little" : "big") << "-endian\n";
uint8_t data[] = { 0x00, 0xFF };
uint8_t* a = data;
uint8_t* b = data + sizeof(uint8_t);
uint16_t* ab = reinterpret_cast<uint16_t*>(data);
std::cout << std::hex << (int)*a << "\n";
std::cout << std::hex << (int)*b << "\n";
std::cout << std::hex << (int)*ab << "\n";
}
Result:
little-endian
0
ff
ff00
CodePudding user response:
A portable way with no undefined behaviour to pack two uint8_t into a uint16_t and back:
#include <cstdint>
int main() {
uint8_t a = 0xFF;
uint8_t b = 0x00;
// from a and b to ab
uint16_t ab = a * 0x100 + b;
// from ab to a and b
a = ab / 0x100 & 0xff;
b = ab & 0xff;
}
Note that all methods that rely on casting uint16_t* to uint8_t* only happen to work because uint8_t is a type alias for unsigned char, and the char types are special in that they can alias any other type. Such casts break strict aliasing and result in undefined behaviour for any other combination of types, e.g. when you cast a uint64_t* to a uint32_t* or uint16_t*.
See What is the Strict Aliasing Rule and Why do we care? for more details.
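If you do need to reinterpret the bytes of wider types, a well-defined alternative (my addition, not part of this answer) is to copy the bytes with std::memcpy instead of casting pointers:
#include <cstdint>
#include <cstring>
#include <iostream>
int main()
{
    uint64_t big = 0x1122334455667788ull;
    // uint32_t* p = reinterpret_cast<uint32_t*>(&big); // breaks strict aliasing: UB
    uint32_t part;
    std::memcpy(&part, &big, sizeof(part)); // well-defined: copies the first 4 bytes
    // Which half you get still depends on the byte order:
    // 55667788 on little-endian, 11223344 on big-endian machines.
    std::cout << std::hex << part << "\n";
}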
CodePudding user response:
In your case, a union and a struct are simpler and easier to understand. Here is an example:
#include <cstdint>
#include <iostream>

struct mystruct {
    uint8_t a;
    uint8_t b;
};

union my_union {
    mystruct x;
    uint16_t ab;
} data;

int main() {
    data.x.a = 0x1;
    data.x.b = 0xff;
    std::cout << std::hex << (int)data.x.a << std::endl;
    std::cout << std::hex << data.ab << std::endl;
    return 0;
}
Results:
1
ff01