Say I have a larger type:

```c
uint32_t big = 0x01234567;
```

Then what can I do with `(char*)&big`, the pointer reinterpreted as a pointer to `char` after the cast?
- Is it undefined behavior to shift the address of `(char*)&big` to `(char*&big) + 1`, `(char*&big) + 2`, etc.?
- Is it undefined behavior to both shift and edit `(char*)&big + 1`? Like the example below. I think this example should be undefined behavior, because after casting to `(char*)` we have limited our view to a `char`-type pointer, and we ought not to access, let alone change, anything outside that scope.
```c
uint32_t big = 0x01234567;
*((char*)&big + 1) = 0xff;
printf("%x\n", *((char*)&big + 1));
printf("%x\n", big);
```
(This passes my Visual C++ compiler. By the way, I want to ask a forked question on that: why does the first `printf` in this example give `ffffffff`? Shouldn't it be `ff`?)
- I have seen code like this, and it is what I usually do when I need to achieve a similar task. Is this UB or not? Why or why not? What is the standard way to achieve this?
```c
uint8_t catcher[8] = { 0 };
uint64_t big = 0x1234567812345678;
memcpy(catcher, (uint8_t*)&big, sizeof(uint64_t));
```
CodePudding user response:
> Then what can I do with `(char*)&big`, the pointer reinterpreted as a pointer to `char` after the cast?
If a `char` is eight bits, which it is in most modern C implementations, then there are four bytes in the `uint32_t` `big`, and you can do arithmetic on the address from `(char *) &big + 0` to `(char *) &big + 4`. You can also read and write the bytes from `(char *) &big + 0` to `(char *) &big + 3`, and those will access individual bytes in the representation of `big`. Although arithmetic is defined to work up to `(char *) &big + 4`, that is only an endpoint. There is no defined byte there, and you should not use that address to read or write anything.
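
A minimal sketch of that rule (my example, not from the question): the one-past-the-end address is used only as a loop bound and is never dereferenced, while offsets 0 through 3 are read freely. The byte order in the output is implementation-defined; the values shown assume a little-endian machine.

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t big = 0x01234567;
    unsigned char *p   = (unsigned char *)&big;      /* first byte of big's representation */
    unsigned char *end = (unsigned char *)&big + 4;  /* one past the end: compare only, never dereference */

    for (; p != end; p++)   /* reads offsets 0..3 only */
        printf("%02hhx ", *p);
    putchar('\n');          /* 67 45 23 01 on a little-endian machine */
    return 0;
}
```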
> Is it undefined behavior to shift the address of `(char*)&big` to `(char*&big) + 1`, `(char*&big) + 2`, etc.?

These are additions, not shifts, and the syntax is `(char *) &big + 1`, not `(char*&big) + 1`. Arithmetic is defined for the offsets from 0 to 4.
> Is it undefined behavior to both shift and edit `(char*)&big + 1`?

It is allowed to read and write the bytes in `big` using a pointer to `char`. This is a special rule for the character types. Generally, the bytes of an object should not be accessed using an unrelated type; for example, a `float` object may not be accessed using an `int` type. The character types, however, are special: you may access the bytes of any object using a character type.

However, it is preferable to use `unsigned char` for this, as it avoids complications with signed values.
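
For instance, the write from the question done through `unsigned char` (a sketch; the resulting value of `big` depends on the implementation's byte order):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t big = 0x01234567;

    /* Allowed: the character types may access the bytes of any object. */
    *((unsigned char *)&big + 1) = 0xff;

    printf("%08" PRIx32 "\n", big);  /* 0123ff67 on a little-endian machine */
    return 0;
}
```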
> I have seen code like this.

It is allowed to read or write the bytes of an object using `memcpy`. `memcpy` is defined to work as if by copying characters.
Note that, while accessing the bytes of an object is defined by the C standard, how bytes represent values is partly implementation-defined. Different C implementations may use different orders for the bytes within an object, and there can be other differences.
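
A sketch of the question's pattern with that caveat made visible (the byte order shown assumes a little-endian machine; the question's cast to `uint8_t *` is unnecessary, since `memcpy` takes `void *`):

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    uint8_t catcher[8] = { 0 };
    uint64_t big = 0x1234567812345678;

    memcpy(catcher, &big, sizeof big);  /* well-defined: copies the object's bytes */

    for (size_t i = 0; i < sizeof catcher; i++)
        printf("%02hhx ", catcher[i]);
    putchar('\n');                      /* 78 56 34 12 78 56 34 12 if little-endian */
    return 0;
}
```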
> By the way, I want to ask a forked question on that: why does the first `printf` in this example give `ffffffff`? Shouldn't it be `ff`?
In your C implementation, `char` is signed and can represent values from −128 to 127. In `*((char*)&big + 1) = 0xff;`, `0xff` is 255 and is too big to fit into a `char`. It is converted to a `char` value in an implementation-defined way. Your C implementation converts it to −1. (The eight-bit two’s complement representation of −1, bits 11111111, uses the same bits as the binary representation of 255, again bits 11111111.)
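
You can see that conversion in isolation (a sketch; the −1 assumes an implementation where plain `char` is signed and two's complement, as described above):

```c
#include <stdio.h>

int main(void) {
    char c = 0xff;      /* 255 does not fit; converted in an implementation-defined way */
    printf("%d\n", c);  /* -1 where char is signed and two's complement */
    return 0;
}
```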
Then `printf("%x\n", *((char*)&big + 1));` passes this value, −1, to `printf`. Since it is a `char`, it is promoted to `int` to be passed to `printf`. This produces the same value, −1, but it has 32 bits: 11111111111111111111111111111111. So you are passing an `int`, but `printf` expects an `unsigned int` for `%x`. The behavior of this is not defined by the C standard, but your C implementation reads the 32 bits as if they were an `unsigned int`. As an `unsigned int`, the 32 bits 11111111111111111111111111111111 represent the value 4,294,967,295 or `0xffffffff`, so that is what `printf` prints.
You can print the correct value by using `printf("%hhx\n", *((unsigned char *)&big + 1));`. As an `unsigned char`, the bits 11111111 represent 255 or `0xff`, and converting that to an `int` produces 255 or `0x000000ff`.
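
Putting that fix into the question's example (a sketch; the `ff` line holds wherever `unsigned char` is eight bits, while the last line assumes the little-endian layout discussed above):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t big = 0x01234567;
    *((unsigned char *)&big + 1) = 0xff;

    /* %hhx converts the promoted argument back to unsigned char before printing. */
    printf("%hhx\n", *((unsigned char *)&big + 1));  /* ff */
    printf("%" PRIx32 "\n", big);                    /* 123ff67 if little-endian */
    return 0;
}
```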
CodePudding user response:
For variadic functions (like `printf`), all arguments undergo the default argument promotions, which promote the smaller integer types to `int`.
This conversion includes sign extension if the smaller type is signed, so the value is preserved.
So if `char` is a signed type (which is implementation-defined) with a value of `-1`, then it will be promoted to the `int` value `-1`, which is what you see.
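
A small illustration of that promotion (a sketch; it assumes plain `char` is signed, as on the questioner's implementation):

```c
#include <stdio.h>

int main(void) {
    signed char c = -1;      /* bits 11111111 in eight-bit two's complement */
    int i = c;               /* the same promotion printf applies: sign-extended */
    printf("%d %d\n", c, i); /* -1 -1: the value is preserved */
    return 0;
}
```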
If you want to print a smaller type, you need to first cast to the correct type (`unsigned char`) and then use the proper format (like `%hhx` for printing `unsigned char` values).