How is an integer stored in C program?

Time:11-18

Is the number 1 stored in memory as 00000001 00000000 00000000 00000000?

#include <stdio.h>
int main()
{
    unsigned int a[3] = {1, 1, 0x7f7f0501};
    int *p = a;
    printf("%d %p\n", *p, p);
    p = (long long)p + 1;
    printf("%d %p\n", *p, p);
    char *p3 = a;
    int i;
    for (i = 0; i < 12; i++, p3++)
    {
        printf("%x %p\n", *p3, p3);
    }
    return 0;
}

Why is 16777216 printed in the output?

CodePudding user response:

An integer is stored in memory in different ways on different architectures. The most common byte orderings are called little-endian and big-endian.

See Endianness

               (long long)p + 1
                     |
                     v
Your memory: [0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, ...]
                  

You increment p as a long long number rather than as a pointer, so it points to the next byte instead of the next integer. Dereferencing it then reads the bytes 0x00, 0x00, 0x00, 0x01, which a little-endian architecture interprets as 0x01000000 (decimal 16777216).

CodePudding user response:

Something to play with (assuming int is 32 bits wide):

#include <stdio.h>
#include <stdbool.h>

typedef union byte_rec {
    struct bit_rec {
        bool b0 : 1;
        bool b1 : 1;
        bool b2 : 1;
        bool b3 : 1;
        bool b4 : 1;
        bool b5 : 1;
        bool b6 : 1;
        bool b7 : 1;
    } bits;
    unsigned char value;
} byte_t;

typedef union int_rec {
    struct bytes_rec {
        byte_t b0;
        byte_t b1;
        byte_t b2;
        byte_t b3;
    } bytes;
    int value;
} int_t;

void printByte(byte_t *b)
{
    printf(
        "%d %d %d %d %d %d %d %d  ", 
        b->bits.b0,
        b->bits.b1,
        b->bits.b2,
        b->bits.b3,
        b->bits.b4,
        b->bits.b5,
        b->bits.b6,
        b->bits.b7
    );
}

void printInt(int_t *i)
{
    printf("%p: ", (void *)i);
    printByte(&i->bytes.b0);
    printByte(&i->bytes.b1);
    printByte(&i->bytes.b2);
    printByte(&i->bytes.b3);
    putchar('\n');
}

int main()
{
    int_t i1, i2;
    
    i1.value = 0x00000001;
    i2.value = 0x80000000;

    printInt(&i1);
    printInt(&i2);

    return 0;
}

Possible output:

0x7ffea0e30920: 1 0 0 0 0 0 0 0  0 0 0 0 0 0 0 0  0 0 0 0 0 0 0 0  0 0 0 0 0 0 0 0  
0x7ffea0e30924: 0 0 0 0 0 0 0 0  0 0 0 0 0 0 0 0  0 0 0 0 0 0 0 0  0 0 0 0 0 0 0 1

Additional (based on the comment of @chqrlie):

I previously used the unsigned char type for the bit fields, but the C Standard only guarantees three bit-field types (int, signed int and unsigned int) and, since C99, a fourth (_Bool). Additional implementation-defined types are permitted by the C Standard, and gcc seemed to accept unsigned char for the bit fields, but I've nevertheless changed it to the allowed type _Bool (since C99).

Noteworthy: The order of bit fields within an allocation unit is implementation-defined (on some platforms, bit fields are packed left-to-right, on others right-to-left; see the Notes section in the reference).

Reference to bit fields: https://en.cppreference.com/w/c/language/bit_field

CodePudding user response:

p = (long long)p + 1; is bad code: it is not specified to work in C and leads to undefined behavior (UB), e.g. a bus fault and a rebooted machine. The newly formed address is not guaranteed to meet the alignment requirements of int *.

Don't do that.


To look at the bytes of a[]:

#include <stdio.h>
#include <stdlib.h>

void dump(size_t sz, const void *ptr) {
  const unsigned char *byte_ptr = (const unsigned char *) ptr;
  for (size_t i = 0; i < sz; i++) {
    printf("%p %02X\n", (void*) byte_ptr, *byte_ptr);
    byte_ptr++;
  }
}

int main(void) {
  unsigned int a[3] = {1, 1, 0x7f7f0501u};
  dump(sizeof a, a);
}

As this is a wiki, feel free to edit.

CodePudding user response:

There are many instances of undefined behavior in your code:

  • in printf("%d %p\n", *p, p) you should cast p as (void *)p to ensure printf receives a void * as it expects. This is unlikely to pose a problem on most current targets, but some older systems, such as early Cray systems, had different representations for int * and void *.

  • in p = (long long)p + 1, you have implementation-defined behavior when converting a pointer to an integer and when implicitly converting the integer result of the addition back to a pointer. More importantly, this may create a pointer with incorrect alignment for accessing int in memory, resulting in undefined behavior when you dereference p. This would cause a bus error on many systems, e.g. most RISC architectures, but by chance not on Intel processors.

is the number 1 stored in memory as 00000001 00000000 00000000 00000000?

Yes, your system seems to use little-endian representation for int types. The least significant 8 bits are stored in the byte at the address of a, then the next least significant 8 bits, and so on. As can be seen in the output, 1 is stored as 01 00 00 00 and 0x7f7f0501 is stored as 01 05 7f 7f.

Why is 16777216 printed in the output?

The second instance of printf("%d %p\n", *p, p) has undefined behavior. On your system, p points to the second byte of the array a and *p reads 4 bytes from this address, namely 00 00 00 01 (the last 3 bytes of 1 and the first byte of the next array element, also 1), which is the representation of the int value 16777216.

To dump the contents of the array as bytes, you should access it using a char * as you do in the last loop. Be aware that char may be signed on some systems, causing for example printf("%x\n", *p3); to output ffffff80 if p3 points to the byte with hex value 80. Using unsigned char * is recommended for consistent and portable behavior.
