I am using C++17.
Let's imagine I have two variables a and b. These variables are of type uint8_t. I would like to be able to access them as uint8_t but also as uint16_t.
For example:
#include <cstdint>
int main()
{
uint8_t a = 0xFF;
uint8_t b = 0x00;
uint16_t ab; // Should be 0xFF00
}
I thought that using an array would be a good solution, as the two variables should be next to each other in memory. So I did this:
#include <cstdint>
#include <iostream>
int main()
{
uint8_t data[] = {0xFF, 0x00};
uint8_t * a = data;
uint8_t * b = data + sizeof(uint8_t);
uint16_t * ab = reinterpret_cast<uint16_t*>(data);
std::cout << std::hex << (int) *a << "\n";
std::cout << std::hex << (int) *b << "\n";
std::cout << std::hex << (int) *ab << "\n";
}
Output:
ff
0
ff
But I would expect:
ff
0
ff00
Can you explain what I am doing wrong here? Any red flags or better ways of doing this?
Thanks!
CodePudding user response:
There are a few ways to convert between two 8-bit values and one 16-bit value.
But be aware that the result of every solution that directly addresses a single byte within the 16-bit value depends on the byte order of the machine executing it. Intel, for example, uses 'little endian', where the least significant byte is stored first. Other machines may use 'big endian' and store the most significant byte first.
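For illustration, here is a minimal sketch (my addition, not part of the original answer) that makes the dependence visible: it copies the raw bytes {0xFF, 0x00} into a uint16_t with std::memcpy (which, unlike a pointer cast, does not break aliasing rules) and prints a value that differs by byte order.
#include <cstdint>
#include <cstring>
#include <iostream>
int main()
{
    const uint8_t bytes[] = {0xFF, 0x00};
    uint16_t value;
    std::memcpy(&value, bytes, sizeof(value)); // copy both bytes into the 16-bit value
    // Prints ff on a little-endian machine (first byte is the low byte)
    // and ff00 on a big-endian machine (first byte is the high byte).
    std::cout << std::hex << value << "\n";
}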
Use bit-shift and OR to calculate the 16-bit value
const uint8_t a = 0xff;
const uint8_t b = 0x00;
const uint16_t ab = a | (b << 8); // works because b is promoted to int before being shifted
Use bit-shift and AND to calculate the 8-bit values
const uint16_t ab = 0xff;
const uint8_t a = ab & 0xff;
const uint8_t b = ab >> 8;
Directly address the bytes of the word
uint16_t ab;
auto& a = reinterpret_cast<uint8_t*>(&ab)[0];
auto& b = reinterpret_cast<uint8_t*>(&ab)[1];
Using a union
Reading from a union member other than the one most recently written (type punning) is explicitly not allowed by the C++ standard, but it is done everywhere and major compilers support it.
Declare the following union:
union conv
{
struct {
uint8_t a, b;
};
uint16_t ab;
};
You can now use it to combine two 8-bit values into a single 16-bit value:
conv c;
c.a = 0xFF;
c.b = 0x00;
std::cout << c.ab << std::endl;
On Intel machines this will output 255 (0xff) because Intel uses "little endian", where the least significant byte is stored first. So a is the low byte of ab and b is the high byte.
If you redefine the union as
union conv
{
struct {
uint8_t b, a;
};
uint16_t ab;
};
The example above would output 65280 (0xff00) on Intel machines because now b represents the least significant 8 bits of ab and a represents the most significant.
Combining unions and bitfields, you can also access each single bit of the 16-bit value:
union bitconv
{
struct {
uint16_t
b0 : 1, b1 : 1, b2 : 1, b3 : 1, b4 : 1, b5 : 1, b6 : 1, b7 : 1,
b8 : 1, b9 : 1, b10 : 1, b11 : 1, b12 : 1, b13 : 1, b14 : 1, b15 : 1;
};
uint16_t word;
};
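As a minimal usage sketch (my addition, not from the answer), assuming the bitconv union above: zero the union, set a couple of individual bits, and read the whole word back. Two caveats apply on top of the union-punning one: anonymous structs are a compiler extension, and the allocation order of bit-fields is implementation-defined, so which bit of word each member maps to can differ between compilers.
// bitconv as defined above
#include <cstdint>
#include <iostream>
int main()
{
    bitconv bc{};  // value-initializes all bit-fields to 0
    bc.b0 = 1;     // set bits 0 and 8
    bc.b8 = 1;
    // With the common allocation order this prints 101 (i.e. 0x0101).
    std::cout << std::hex << bc.word << "\n";
}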
CodePudding user response:
This is because of the byte order (endianness) of your machine.
This code determines the byte order in memory:
uint16_t x = 0x0001;
std::cout << (*((uint8_t*)&x) ? "little" : "big") << "-endian\n";
Try swapping the numbers:
#include <cstdint>
#include <iostream>
int main()
{
uint16_t x = 0x0001;
std::cout << (*((uint8_t*)&x) ? "little" : "big") << "-endian\n";
uint8_t data[] = { 0x00, 0xFF };
uint8_t* a = data;
uint8_t* b = data + sizeof(uint8_t);
uint16_t* ab = reinterpret_cast<uint16_t*>(data);
std::cout << std::hex << (int)*a << "\n";
std::cout << std::hex << (int)*b << "\n";
std::cout << std::hex << (int)*ab << "\n";
}
Result:
little-endian
0
ff
ff00
CodePudding user response:
A portable way with no undefined behaviour to pack two uint8_t into a uint16_t and back:
#include <cstdint>
int main() {
uint8_t a = 0xFF;
uint8_t b = 0x00;
// from a and b to ab
uint16_t ab = a * 0x100 + b;
// from ab to a and b
a = ab / 0x100 & 0xff;
b = ab & 0xff;
}
Note that all methods that rely on casting uint16_t* to uint8_t* only happen to work because uint8_t is a type alias for unsigned char, and the char types are special in that they can alias any other type. Such casts break strict aliasing and result in undefined behaviour for any other combination of types, e.g. when you cast a uint64_t* to a uint32_t* or uint16_t*.
See What is the Strict Aliasing Rule and Why do we care? for more details.
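If you do need to reinterpret the bytes of wider types, a well-defined alternative (my addition, not part of this answer) is to copy the bytes with std::memcpy instead of casting pointers:
#include <cstdint>
#include <cstring>
#include <iostream>
int main()
{
    uint64_t big = 0x1122334455667788ull;
    // uint32_t* p = reinterpret_cast<uint32_t*>(&big); // breaks strict aliasing: UB
    uint32_t part;
    std::memcpy(&part, &big, sizeof(part)); // well-defined: copies the first 4 bytes
    // Which half you get still depends on the byte order:
    // 55667788 on little-endian, 11223344 on big-endian machines.
    std::cout << std::hex << part << "\n";
}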
CodePudding user response:
In your case, a union and a struct are simpler and easier to understand. Here is an example:
#include <cstdint>
#include <iostream>

struct mystruct {
    uint8_t a;
    uint8_t b;
};

union my_union {
    mystruct x;
    uint16_t ab;
} data;

int main() {
    data.x.a = 0x1;
    data.x.b = 0xff;
    std::cout << std::hex << (int)data.x.a << std::endl;
    std::cout << std::hex << data.ab << std::endl;
    return 0;
}
Results:
1
ff01