Is the number 1 stored in memory as 00000001 00000000 00000000 00000000?
#include <stdio.h>
int main()
{
    unsigned int a[3] = {1, 1, 0x7f7f0501};
    int *p = a;
    printf("%d %p\n", *p, p);
    p = (long long)p + 1;
    printf("%d %p\n", *p, p);
    char *p3 = a;
    int i;
    for (i = 0; i < 12; i++, p3++)
    {
        printf("%x %p\n", *p3, p3);
    }
    return 0;
}
Why is 16777216 printed in the output?
CodePudding user response:
An integer is stored in memory in different ways on different architectures. The two most common byte orderings are called little-endian and big-endian.
See Endianness
             (long long)p + 1
                     |
                     v
Your memory: [0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, ...]
You increment p not as a pointer but as a long long number, so it points to the next byte rather than to the next integer. Dereferencing it therefore reads the bytes 0x00, 0x00, 0x00, 0x01, which a little-endian architecture interprets as 0x01000000 (decimal 16777216).
CodePudding user response:
Something to play with (assuming int is 32 bits wide):
#include <stdio.h>
#include <stdbool.h>
typedef union byte_rec {
    struct bit_rec {
        bool b0 : 1;
        bool b1 : 1;
        bool b2 : 1;
        bool b3 : 1;
        bool b4 : 1;
        bool b5 : 1;
        bool b6 : 1;
        bool b7 : 1;
    } bits;
    unsigned char value;
} byte_t;
typedef union int_rec {
    struct bytes_rec {
        byte_t b0;
        byte_t b1;
        byte_t b2;
        byte_t b3;
    } bytes;
    int value;
} int_t;
void printByte(byte_t *b)
{
    printf(
        "%d %d %d %d %d %d %d %d ",
        b->bits.b0,
        b->bits.b1,
        b->bits.b2,
        b->bits.b3,
        b->bits.b4,
        b->bits.b5,
        b->bits.b6,
        b->bits.b7
    );
}
void printInt(int_t *i)
{
    /* %p expects a void *, so cast explicitly */
    printf("%p: ", (void *)i);
    printByte(&i->bytes.b0);
    printByte(&i->bytes.b1);
    printByte(&i->bytes.b2);
    printByte(&i->bytes.b3);
    putchar('\n');
}
int main()
{
    int_t i1, i2;
    i1.value = 0x00000001;
    i2.value = 0x80000000;
    printInt(&i1);
    printInt(&i2);
    return 0;
}
Possible output:
0x7ffea0e30920: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x7ffea0e30924: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Additional (based on the comment of @chqrlie):
I previously used the unsigned char type for the bit fields, but the C Standard only guarantees three types for bit fields (int, signed int and unsigned int) and, since C99, a fourth one (_Bool). Additional implementation-defined types may be accepted, and gcc seemed fine with unsigned char for the bit fields, but I have nevertheless changed the code to use the allowed type _Bool (since C99).
Noteworthy: The order of bit fields within an allocation unit is implementation-defined (on some platforms bit fields are packed left-to-right, on others right-to-left; see the Notes section in the reference), so the bit order printed above may differ on another compiler or platform.
Reference to bit fields: https://en.cppreference.com/w/c/language/bit_field
CodePudding user response:
p = (long long)p + 1;
is bad code: it is not specified to work in C. The newly formed address assigned to p is not certain to be aligned as int * needs, so using it is undefined behavior (UB), which can show up as, e.g., a bus fault or a rebooted machine.
Don't do that.
To look at the bytes of a[]:
#include <stdio.h>
#include <stdlib.h>
void dump(size_t sz, const void *ptr) {
    const unsigned char *byte_ptr = (const unsigned char *) ptr;
    for (size_t i = 0; i < sz; i++) {
        printf("%p %02X\n", (void*) byte_ptr, *byte_ptr);
        byte_ptr++;
    }
}
int main(void) {
    unsigned int a[3] = {1, 1, 0x7f7f0501u};
    dump(sizeof a, a);
}
As this is wiki, feel open to edit.
CodePudding user response:
There are many instances of undefined behavior in your code:
In printf("%d %p\n", *p, p), you should cast p as (void *)p to ensure printf receives a void * as it expects. This is unlikely to pose a problem on most current targets, but some ancient systems, such as early Cray systems, had different representations for int * and void *.
In p = (long long)p + 1, you have implementation-defined behavior converting a pointer to an integer and implicitly converting the integral result of the addition back to a pointer. More importantly, this may create a pointer with incorrect alignment for accessing an int in memory, resulting in undefined behavior when you dereference p. This would cause a bus error on many systems, e.g. most RISC architectures, but by chance not on Intel processors.
is the number 1 stored in memory as 00000001 00000000 00000000 00000000?
Yes, your system seems to use little endian representation for int types. The least significant 8 bits are stored in the byte at the address of a, then the next least significant 8 bits, and so on. As can be seen in the output, 1 is stored as 01 00 00 00 and 0x7f7f0501 is stored as 01 05 7f 7f.
Why is 16777216 printed in the output?
The second instance of printf("%d %p\n", *p, p) has undefined behavior. On your system, p points to the second byte of the array a, and *p reads 4 bytes from this address, namely 00 00 00 01 (the last 3 bytes of the first element, 1, and the first byte of the next array element, also 1), which is the representation of the int value 16777216.
To dump the contents of the array as bytes, you should access it using a char * as you do in the last loop. Be aware that char may be signed on some systems, causing for example printf("%x\n", *p3); to output ffffff80 if p3 points to the byte with hex value 80. Using unsigned char * is recommended for consistent and portable behavior.