char c[4] = { 'A', '\0', '\0', '\0' };
int* pi = (int*)&c[0];
printf("%x %x %x %x\n", c[0], c[1], c[2], c[3]);
printf("%x %x %x %x\n", *((unsigned char *)pi), *((unsigned char *)pi + 1), *((unsigned char *)pi + 2), ((unsigned char *)pi)[3]);
printf("%d %c\n", (int)c[0], c[0]);
printf("%d %c\n", *pi, (char)*pi);
In the code above, I declare a char array and print its contents. I can't understand why the line that prints through the int pointer outputs "65 A".
The memory the array occupies (4 bytes, if I'm not mistaken) should surely read as something different from the integer 65, because the int type requires 4 bytes and the char type requires only one. 'A' was not even the last element.
Am I misunderstanding something?
CodePudding user response:
char c[4] = { 'A', '\0', '\0', '\0' };
arranges for there to be four bytes in memory, of which the first has the value 65 (see footnote 1), and the remaining three have value 0.
int* pi = (int*)&c[0];
says to make pi point to the first of these bytes.
Because pi is a pointer to an int, *pi is an lvalue for an int. An lvalue is an expression that may designate an object in memory. Because it is for an int, using *pi for its value nominally tells the compiler to get four bytes (see footnote 2) from memory.
The behavior of this is not defined by the C standard. Although int* pi = (int*)&c[0]; nominally sets pi to point to the first byte of c, and *pi nominally gets an int from memory, there are rules about how you may use these things in C, and this program violates them: c is not guaranteed to be suitably aligned for an int, and an array declared as char may not be accessed through an lvalue of type int.
However, if the program does behave according to the nominal behavior, *pi gets the four bytes 65, 0, 0, and 0 and interprets them as the bytes that represent an int.
Some C implementations store the bytes of an int in memory with the least significant byte first (at the lowest address), followed by the second least significant, then the third, then the fourth. Some store the most significant byte first, then the second most significant, and so on; this is called big-endian. (Other byte orders are also allowed, but they are rare.) Your C implementation stores the bytes in the first order, which is called little-endian. In this order, the bytes 65, 0, 0, and 0 represent the value 65. So “65” was printed for *pi.
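To see which byte order an implementation uses without relying on undefined behavior, one option is to look at the lowest-addressed byte of a known value through a character type. The helper name is_little_endian below is made up for illustration; this is a minimal sketch, not something from the question.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the lowest-addressed byte of an
   unsigned int holds its least significant bits (little-endian),
   0 otherwise. Copying a single byte with memcpy is always allowed. */
int is_little_endian(void)
{
    unsigned int probe = 1u;
    unsigned char first;

    memcpy(&first, &probe, 1);   /* take the byte at the lowest address */
    return first == 1;
}
```

On an x86 machine this would be expected to return 1, which matches the “65” seen in the question.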
For (char)*pi, four bytes were fetched from memory and interpreted as an int, yielding the value 65. Then (char) converted this to a char, still with the value 65. This char is actually promoted back to an int to be passed to printf. The conversion specification %c requests that the character with this code, 65, be printed, so “A” was printed.
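The conversion and promotion steps can be tried in isolation, without any pointer cast. The sketch below uses a made-up helper, format_code_and_char, that formats an int the same way the question's last printf does; it assumes an ASCII-compatible character set, so the code 65 displays as “A”.

```c
#include <stdio.h>

/* Hypothetical helper: writes "<code> <character>" into out, using the
   same conversions as printf("%d %c\n", *pi, (char)*pi) in the question.
   Returns whatever snprintf returns. */
int format_code_and_char(char *out, size_t n, int value)
{
    /* (char)value narrows the int to a char; passing it through the
       variable argument list promotes it back to int, and %c then
       prints the character with that code. */
    return snprintf(out, n, "%d %c", value, (char)value);
}
```

Calling it with the value 65 and a sufficiently large buffer leaves the text 65 A in the buffer on an ASCII system.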
A proper way to reinterpret bytes as an int is to copy them in using memcpy (or a manual copy using a character type). This method has behavior defined by the C standard:
char c[4] = { 'A', '\0', '\0', '\0' };
int i;
memcpy(&i, c, sizeof i);
printf("i = %d.\n", i);
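The manual copy through a character type mentioned above could look roughly like this. The function name int_from_bytes is invented for this sketch, and it assumes the caller supplies at least sizeof(int) readable bytes (true for the 4-byte array in the question only if int is 4 bytes wide).

```c
#include <stddef.h>

/* Hypothetical helper: builds an int from the first sizeof(int) bytes
   at src by copying one byte at a time through unsigned char, which
   the C standard permits for any object. */
int int_from_bytes(const char *src)
{
    int result;
    unsigned char *dst = (unsigned char *)&result;

    for (size_t k = 0; k < sizeof result; ++k)
        dst[k] = (unsigned char)src[k];

    return result;
}
```

The result depends on the byte order of the implementation in exactly the same way as the memcpy version does.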
Footnotes
1 Your C implementation uses ASCII for the character codes 1-127, and the ASCII code for “A” is 65. The C standard does not require that ASCII be used; C implementations may use other character codes.
2 I assume your C implementation uses four bytes for int. The C standard allows some flexibility in this.
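Both footnote assumptions can be checked at compile time. The fragment below is a sketch that assumes a C11 compiler (for _Static_assert); the messages are only illustrative.

```c
/* Compile-time checks of the two assumptions made in the footnotes.
   Requires C11 or later for _Static_assert. */
_Static_assert('A' == 65,
               "this discussion assumes 'A' has the ASCII code 65");
_Static_assert(sizeof(int) == 4,
               "this discussion assumes a four-byte int");
```

If either assumption does not hold for a given implementation, compilation stops with the corresponding message.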
CodePudding user response:
The array:
char c[4] = { 'A', '\0', '\0', '\0' };
appears in memory, from low address to high address, as:
0x41 0x00 0x00 0x00
When interpreted as a 32-bit number on a little-endian architecture such as x86, the first byte is the least significant byte. So the 32-bit integer value is 0x00000041 (or 65).
To construct 0x41000000 you would need to reverse the byte order:
char c[4] = { '\0', '\0', '\0', 'A' };
The line:
printf("%d %c\n", *pi, (char)*pi);
will then print 1090519040 followed by a non-printing NUL character.
More instructively:
printf("%d 0x%08X\n", *pi, *pi);
will print:
1090519040 0x41000000
and for your original byte-order:
65 0x00000041
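The two interpretations can also be computed explicitly with shifts, which avoids the pointer cast and gives the same answer on any architecture. The names read_le32 and read_be32 are made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: interprets 4 bytes as a little-endian 32-bit
   value (b[0] is the least significant byte). */
uint32_t read_le32(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | (uint32_t)b[1] << 8
         | (uint32_t)b[2] << 16
         | (uint32_t)b[3] << 24;
}

/* Hypothetical helper: interprets the same 4 bytes as big-endian
   (b[0] is the most significant byte). */
uint32_t read_be32(const unsigned char b[4])
{
    return (uint32_t)b[0] << 24
         | (uint32_t)b[1] << 16
         | (uint32_t)b[2] << 8
         | (uint32_t)b[3];
}
```

For the bytes 0x41 0x00 0x00 0x00, read_le32 gives 0x00000041 (65) and read_be32 gives 0x41000000 (1090519040), matching the two outputs shown above.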