I've been trying to understand how data is stored in C but I'm getting confused. I have this code:
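#include <stdio.h>

int main(void) {
    int a = 0;            /* zero every byte first so the untouched ones are defined */
    char *x;
    x = (char *) &a;      /* view the int one byte at a time */
    x[0] = 0;
    x[1] = 3;
    printf("%d\n", a);
    return 0;
}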
I've been messing around with x[0] & x[1], trying to figure out how they work, but I just can't. For example x[1] = 3 outputs 768. Why?
I understand that there are 4 bytes (each holding 8 bits) in an int, and x[1] refers to the 2nd byte. But I don't understand how setting that second byte to 3 makes a equal to 768.
I can visualise this in binary format:
byte 1: 00000000
byte 2: 00000011
byte 3: 00000000
byte 4: 00000000
But where does the 3 come into play? How does setting byte 2 to 3 make it 00000011, and make a equal to 768?
Additional question: If I was asked to store 545 in memory, what would x[0] and x[1] be?
I know the layout in binary is:
byte 1: 00100001
byte 2: 00000010
byte 3: 00000000
byte 4: 00000000
CodePudding user response:
This is not specific to C; it is how your computer stores data.
A multi-byte value can be laid out in memory in one of two byte orders, a property called endianness.
Little-endian: the least significant byte is stored first. Example: 0x11223344 will be stored as 0x44 0x33 0x22 0x11
Big-endian: the least significant byte is stored last. Example: 0x11223344 will be stored as 0x11 0x22 0x33 0x44
Most modern computers use the little-endian system.
Additional question: If I was asked to store 545 in memory
545 in hex is 0x221, so the first byte will be 0x21 and the second one 0x02, as your computer is little-endian.
Why do I use hex numbers? Because every two digits represent exactly one byte in memory.
I've been messing around with x[0] & x[1], trying to figure out how they work, but I just can't. For example x[1] = 3 outputs 768. Why?
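768 in hex is 0x300, so the byte representation is 0x00 0x03 0x00 0x00. The 3 sits in the second byte, and each unit in the second byte is worth 256, so a = 3 * 256 = 768.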
CodePudding user response:
Warning: by casting the address of an int to a char *, you leave the compiler defenseless in trying to maintain order. Casting is the programmer telling the compiler "I know what I am doing." Use it with care.
Another way to refer to the same region of memory in two different modes is to use a union. Here the compiler will allocate the space required, addressable as either an int or an array of char.
This might be a simpler way to experiment with setting/clearing certain bits as you come to understand how the architecture of your computer stores multi-byte datatypes.
See other responses for hints about "endian-ness".
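#include <stdio.h>

int main( void ) {
    union {
        int i;
        char c[4];   /* assumes a 4-byte int */
    } x;

    x.i = 0;
    x.c[1] = 3;
    /* the four individual bytes in hex, then the whole int in hex and in decimal */
    printf( "%02x %02x %02x %02x %08x %d\n", x.c[0], x.c[1], x.c[2], x.c[3], (unsigned)x.i, x.i );

    x.i = 545;
    printf( "%02x %02x %02x %02x %08x %d\n", x.c[0], x.c[1], x.c[2], x.c[3], (unsigned)x.i, x.i );

    return 0;
}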
Output on a little-endian machine:
00 03 00 00 00000300 768
21 02 00 00 00000221 545