confusion regarding alignment of data struct in c

Time:09-09

I have a question regarding the following:

[Image not preserved: a memory-layout diagram of struct myStruct]

For the int32_t data type, doesn't it have to start at an address that is a multiple of 4? For example, it could only start at addresses such as 0x1004, 0x1008, 0x100C, 0x1010, ...

So why can b store the number 0xEF0369BE when EF starts at 0x1007? I understand we need to add 2 bytes of padding after a[2]. But even with 2 bytes of padding, b still starts at 0x1007, so wouldn't that make b violate its alignment requirement?

I understand that char a[2] can start anywhere, since char has an alignment requirement of 1. But int32_t has an alignment requirement of 4, so it can only start at an address that is divisible by 4.

I thought I understood alignment, but now I am not so sure. Could someone explain what is going on here in terms of alignment, and the alignment of the different types inside a struct?

CodePudding user response:

The C standard does not have any rule that int32_t must have any particular alignment. Alignment requirements are implementation-defined. Each C implementation may choose what its alignment requirements are (with some constraints and, to conform to the standard, it must document them).

It is odd the example shows two bytes of padding between the a and b members, as these bytes are apparently not needed for padding if int32_t does not require four-byte alignment. The standard allows implementations to insert padding between members even if it is not needed for alignment, but implementations do not normally do that.

Overall, you should not give this too much concern. It is most likely an example that was not fully thought out.

However, one way this can arise is using GCC’s packed attribute:

  • struct myStruct is declared as shown in the example.
  • The C implementation normally requires int32_t to have four-byte alignment, so it lays out the structure with two bytes of padding between members a and b.
  • A further structure is declared:
struct foo
{
    char x[3];
    struct myStruct s;
} __attribute__((__packed__));

This declares foo to be packed, meaning the normal alignment requirement of its members is suppressed, and the members are packed into consecutive bytes with no padding. While this means the struct myStruct member s is put into consecutive bytes after x, internally it retains its original layout. It has to keep its original layout to be a proper struct myStruct even though its alignment requirement is suppressed. For example, if we executed this code:

struct myStruct m = { some initialization };
struct foo f;
memcpy(&f.s, &m, sizeof m);

then we would want the memcpy to reproduce m in f.s.

Compiling and running this code:

#include <stdint.h>
#include <stdio.h>


int main(void)
{
    struct myStruct { char a[2]; int32_t b; int16_t c; };

    struct foo
    {
        char x[3];
        struct myStruct s;
    } __attribute__((__packed__));

    struct foo f;

    printf("%p\n", (void *) &f.s.b);
}

in a C implementation that normally requires four-byte alignment for int32_t produces output of “0x7ffee7e729f7” in one run, showing that the b member is at an address that is 3 modulo 4.

CodePudding user response:

By default, the compiler aligns the address of b to a multiple of 4. If you don't want this automatic alignment, use __attribute__((__packed__)). See the following example:

#include <stdint.h>
#include <stdio.h>

struct myStruct {
    char a[2];
    int32_t b;
    int16_t c;
};
struct myStruct s;

int main(void)
{
    /* %p expects a void *, so each address is cast explicitly. */
    printf("Address of s: %p\n", (void *) &s);
    printf("Size of s: %zu\n", sizeof(s));
    printf("Address of s.a: %p\n", (void *) &s.a);
    printf("Address of s.b: %p\n", (void *) &s.b);
    printf("Address of s.c: %p\n", (void *) &s.c);
    return 0;
}

Output:

Address of s: 0x601040
Size of s: 12
Address of s.a: 0x601040
Address of s.b: 0x601044
Address of s.c: 0x601048

After adding __attribute__((__packed__)) when defining myStruct:

struct myStruct {
    char a[2];
    int32_t b;
    int16_t c;
} __attribute__((__packed__));

With that change, the same test code outputs:

Address of s: 0x601040
Size of s: 8
Address of s.a: 0x601040
Address of s.b: 0x601042
Address of s.c: 0x601046