char[] size not being counted

I have the following code:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>   /* EXIT_SUCCESS */

typedef struct E_s {
    uint32_t    a;
    uint32_t    b;
    uint32_t    c;
} E_t;

typedef struct S_s {
    uint32_t    data_sz;
    char        data[];
} S_t;

typedef struct F_s {
    E_t     E;
    S_t     S;
    char        data[16];
//} __attribute__((packed)) full_msg_t;
} F_t;


int main(int argc, char* argv[])
{
    F_t out;
    /* sizeof yields a size_t, so the matching conversion is %zu */
    printf("sizeof(out.data) = %zu\n", sizeof(out.data));
    printf("sizeof(out.E) = %zu\n", sizeof(E_t));
    printf("sizeof(out.S) = %zu\n", sizeof(S_t));
    printf("sizeof(out) = %zu\n", sizeof(F_t));

    return EXIT_SUCCESS;
}

When I run the code, I see the following output:

sizeof(out.data) = 16
sizeof(out.E) = 12
sizeof(out.S) = 4
sizeof(out) = 32

Question: Why is the size of S_t 4 (third line of output)? I was expecting it to be 8 (uint32_t + char[]). Why is the size of char[] not included?

Furthermore, both out.data and out.S.data point to the same memory location, which caused me to dive deep and find the above observation. Any clue here will also be very helpful. I was not expecting those 2 variables to overlap.

CodePudding user response：

The standard (§6.7.2.1 ¶18) specifies that the flexible array member (FAM) of a structure is ignored when the size of the structure is calculated:

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.

(Emphasis added.)

Note that the struct F_s (aka F_t) should not be accepted; that violates the constraint in §6.7.2.1 ¶3:

A structure or union shall not contain a member with incomplete or function type (hence, a structure shall not contain an instance of itself, but may contain a pointer to an instance of itself), except that the last member of a structure with more than one named member may have incomplete array type; such a structure (and any union containing, possibly recursively, a member that is such a structure) shall not be a member of a structure or an element of an array.

The compiler should reject that (or, at least, emit a diagnostic) because constraint violations require a diagnostic. Even if the compiler doesn't reject it outright, you can't actually use the FAM of the embedded S_t because the data member of F_t doesn't move — the offsets of the elements of a structure are fixed at compile time. It would, de facto, use the data element of F_t, but that isn't defined behaviour.

CodePudding user response：

In this struct:

typedef struct S_s {
    uint32_t    data_sz;
    char        data[];
} S_t;

The data member is a flexible array member. Such a member does not contribute to the size of a struct as its size is not specified. This is spelled out in section 6.7.2.1p18 of the C standard:

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply.

So the size of S_t does not include the data member, which is why sizeof(S_t) is 4.

Such a member can only be used when memory for the struct is dynamically allocated. For example:

S_t *s = malloc(sizeof(S_t) + 10);

This allows you to access s->data[0] through s->data[9].

This also means that you can't put a struct with a flexible array member inside of another struct or in an array, because there's no way to know exactly where the flexible array member ends.

This is spelled out in section 6.7.2.1p3:

A structure or union shall not contain a member with incomplete or function type (hence, a structure shall not contain an instance of itself, but may contain a pointer to an instance of itself), except that the last member of a structure with more than one named member may have incomplete array type; such a structure (and any union containing, possibly recursively, a member that is such a structure) shall not be a member of a structure or an element of an array

CodePudding user response：

char data[]; is a flexible array member, and it is explicitly guaranteed not to have its size counted, because it is mainly meant to be used as malloc(sizeof(S_t) + n), where n is the size of the data array.

As for S_t S; inside the other struct, that's invalid C: a struct containing a flexible array member must not itself be a member of another struct (or an element of an array), and the flexible array member must be the last member of its own struct. So your code isn't valid standard C, and therefore it isn't possible to assume that out.S.data and out.data are somehow the same memory, because all of that is beyond the scope of the C language. I suppose it might be possible that GNU C offers deterministic behavior in the form of non-standard extensions, but I'm not aware of any such guarantees.

CodePudding user response：

Because the char[] member in struct S_t is a flexible array member, a feature introduced in the C99 standard of the C programming language.

This may be helpful: flexible-array-members-structure-c