Structure with zero length array member, being part of another structure

I was reading about zero-length arrays from this link and am unable to wrap my head around this code sorcery :)

struct s {
  int a;
  int b[0];
};

struct t1 {
  struct s f;
  int c[3];
} g1 = {{1},{1,2}};

Then they print it as

printf("%d %d %d %d\n", g1.f.a, g1.f.b[0], g1.f.b[1], g1.f.b[2]);

With output 1 1 2 0

I thought {1,2} would be assigned to the int c[3] member? And in fact, if I print c, it is.

How come b[0], b[1] and b[2] from g1.f have those values?

I tried to dig further and came across the GNU link, but nothing clicked.

Any insights on this, please?

CodePudding user response：

Zero-length arrays are a GCC extension. They provide a way to have an array at the end of a structure whose size is determined by the actual memory allocated rather than a fixed structure size. This GCC extension predates the C standard’s flexible array member (which is declared with [] instead of [0]). New code should use the standard method rather than the GCC extension.
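As a minimal sketch of the standard flexible array member mentioned above: the type int_vec and the functions int_vec_new and int_vec_demo are hypothetical names made up for illustration, not something from the question. The array contributes no elements to the structure itself; the caller decides how many elements exist by how much memory it allocates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type for illustration; it does not appear in the question. */
struct int_vec {
    size_t len;
    int items[];            /* flexible array member: must be the last member */
};

/* Allocate the header plus room for n ints. Returns NULL on failure. */
struct int_vec *int_vec_new(size_t n)
{
    struct int_vec *v = malloc(sizeof *v + n * sizeof v->items[0]);
    if (v != NULL)
        v->len = n;
    return v;
}

/* Example use: store 1..3 in a three-element vector and return the sum. */
int int_vec_demo(void)
{
    struct int_vec *v = int_vec_new(3);
    assert(v != NULL);      /* sketch only: a real caller would handle failure */

    int sum = 0;
    for (size_t i = 0; i < v->len; i++) {
        v->items[i] = (int)i + 1;
        sum += v->items[i];
    }
    free(v);
    return sum;
}
```

With the GCC extension the only change in the declaration would be int items[0]; instead of int items[];.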

In struct t1, the struct s f is followed by int c[3]. This guarantees that in a struct t1, there is room for 3 int elements after f, so g1.f.b can be used as if it were an array of 3 int.

Contrary to comments, this is not invalid C code. As defined by the C standard, it is conforming C code, meaning it is code that can be compiled by (is accepted by) a conforming compiler. It is not strictly conforming code, since it is not code that uses only behaviors completely defined by the C standard. Technically, to remain conforming, a compiler must issue a diagnostic message about the zero-length array, but it may still accept and translate the program. GCC has modes (selected by command-line switches) in which it will not issue a diagnostic. In such modes, it is not a conforming compiler. However, the program can still be compiled, with diagnostics, in a conforming mode, as with the switches -std=c18 -pedantic.

g1 = {{1},{1,2}};
…
I thought {1,2} be assigned to int c[3] member ?
…
How come b[0], b[1] and b[2] from g1.f have those values

This is bad practice. It relies on the fact that g1.f.b and g1.c start in the same place. They start in the same place because g1.f.b is empty (has zero elements) and because there is no padding after it in the structure struct s. The initialization g1 = {{1},{1,2}} puts 1 and 2 into g1.c[0] and g1.c[1], and then g1.f.b[0] and g1.f.b[1] access the same memory, and GCC supports this because of its zero-length array extension.
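The overlap described above can be checked without reading through b at all. This is a small sketch that assumes a compiler accepting the GNU zero-length array extension (GCC or Clang in their default modes); the helper names b_aliases_c and check_g1 are made up for illustration.

```c
#include <assert.h>

struct s {
    int a;
    int b[0];               /* GNU zero-length array, not ISO C */
};

struct t1 {
    struct s f;
    int c[3];
} g1 = {{1}, {1, 2}};       /* {1} initializes f.a; {1, 2} initializes c[0] and c[1] */

/* Returns 1 when f.b and c start at the same address inside g1. */
int b_aliases_c(void)
{
    return (void *)g1.f.b == (void *)g1.c;
}

/* Checks that the initializer went where the text says it does. */
void check_g1(void)
{
    assert(g1.f.a == 1);
    assert(g1.c[0] == 1 && g1.c[1] == 2 && g1.c[2] == 0);
    assert(b_aliases_c());  /* holds only while struct s has no tail padding */
}
```

The sketch compares addresses only; whether reading g1.f.b[i] is well defined is a separate question that depends on the compiler's extension.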

It is bad practice because relying on there being no padding after the array is fragile. If struct s were:

struct s
{
    double d;
    int a;
    int b[0];
};

where double is eight bytes with eight-byte alignment and int is four bytes with four-byte alignment, then the structure would have four bytes of padding after b to make the total size 16 bytes so that its size is a multiple of the eight bytes required for double. (Padding structures this way means each element in an array of them has the alignment required for its members.) So the initialization g1 = {{1},{1,2}} can break if anybody modifies the definition of struct s. It would be preferable to give g1 initial values by assignment (g1.f.b[0] = 1; g1.f.b[1] = 2;) or to include a _Static_assert that the f.b and c members have the same offset:

_Static_assert(offsetof(struct t1, f.b) == offsetof(struct t1, c), "f.b and c must have the same offset.");
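To see why that assertion matters, here is a sketch of the padded variant. The names s2, t2, gap_between_b_and_c and check_gap are hypothetical; s2 is just the modified struct s from above under a different name, and the code again assumes a compiler that accepts zero-length arrays.

```c
#include <assert.h>
#include <stddef.h>

/* The modified struct s from the text, renamed to s2 for this sketch. */
struct s2 {
    double d;
    int a;
    int b[0];               /* GNU zero-length array */
};

struct t2 {
    struct s2 f;
    int c[3];
};

/* Bytes between the start of f.b and the start of c. */
size_t gap_between_b_and_c(void)
{
    return offsetof(struct t2, c) - offsetof(struct t2, f.b);
}

/* The gap is exactly the tail padding of struct s2. */
void check_gap(void)
{
    assert(gap_between_b_and_c() ==
           sizeof(struct s2) - offsetof(struct s2, b));
}
```

On an ABI where double is eight bytes with eight-byte alignment and int is four bytes, the gap is 4, so f.b[0] would name padding rather than c[0], and the _Static_assert above would fail at compile time instead of letting that slip through.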

CodePudding user response：

It just happens that the address of g1.f.b is the same as that of g1.c, because it points to the end of struct s. To convince yourself, try:

printf("%p %p\n", (void *)g1.f.b, (void *)g1.c);

Then, g1.f.b and g1.c are interchangeable.

I'm unsure whether using g1.f.b to access g1.c is actually OK or just undefined behaviour that happens to work fine for your combination of compiler/architecture/options.