Say I have two structs, object and widget:
struct object {
    int field;
    void *pointer;
};
struct widget {
    int field;
    void *pointer;
};
And a function:
void consume(struct object *obj)
{
    printf("(%i, %p)\n", obj->field, obj->pointer);
}
I'm aware that if I try to do:
struct widget wgt = {3, NULL};
consume(&wgt);
I would violate the strict aliasing rule, and thus invoke undefined behaviour.
As far as I understand, the undefined behaviour results from the fact that the compiler may align the struct fields differently: that is, it may pad fields to align them with address boundaries (but never change the field order, since the standard guarantees that the order is respected).
But what if the two structs are packed? Will they have the same memory layout? Or, in other words, does the above call to consume() still have undefined behaviour (despite the persistent compiler warning)?
Note: I used struct __attribute__((__packed__)) object { ... }; for packing (GCC).
CodePudding user response:
“As far as I understand, the undefined behaviour results from the fact that the compiler may align the struct fields differently…”
No, not solely. Even if two structures have identical member definitions, they are different types. Consider two types:
struct ComplexNumber { double real, imag; };
struct GeometricPoint { double x, y; };
which might be passed to some routine:
double foo(struct ComplexNumber *c, struct GeometricPoint *p)
…
Inside the function, code might assign some value to *p and use the value of *c, or vice versa. Because these are different and incompatible types, the compiler is allowed to assume that they are not aliases for the same memory. That means, when optimizing, it can assume that assigning a value to *p will not change the value of *c, which the compiler might already be holding in registers from a previous use. Therefore, it does not need to reload the registers in case assigning to *p changed *c.
Thus the aliasing rule grants compilers license for this and similar behaviors and means that, if you violate the rule, the behavior is not defined, even if the structures have identical layouts.
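As a rough illustration of the routine described above, here is a minimal sketch. The body of foo is hypothetical (the original elides it); it only shows the pattern of reading *c, writing *p, and reading *c again.

```c
struct ComplexNumber { double real, imag; };
struct GeometricPoint { double x, y; };

/* Hypothetical body: the original answer does not show what foo does. */
double foo(struct ComplexNumber *c, struct GeometricPoint *p)
{
    double before = c->real;  /* c->real may now be cached in a register */
    p->x = 1.0;               /* assumed not to touch *c: different struct type */
    return before + c->real;  /* an optimizer may reuse the cached value here */
}
```

When c and p point to distinct objects, the result is simply twice c->real. If a caller casts a pointer so that both parameters refer to the same memory, the behavior is undefined and the result may differ between optimization levels.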
“Note: I used struct __attribute__((__packed__)) object { ... }; for packing (GCC).”
Packing structures is a GCC extension. Given how GCC specifies the extension, you can expect identically defined packed structures to have identical memory layouts. However, the aliasing rules of the C standard still apply. GCC has a switch that turns off the requirements of the aliasing rule: -fno-strict-aliasing.
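If you want the compiler to confirm the layout expectation rather than take it on trust, one possible approach is a set of compile-time checks. This is only a sketch, assuming GCC or Clang and C11; the helper pointer_follows_field is a hypothetical name added for illustration.

```c
#include <stddef.h>  /* offsetof */

struct __attribute__((__packed__)) object { int field; void *pointer; };
struct __attribute__((__packed__)) widget { int field; void *pointer; };

/* Compilation fails if the two layouts ever diverge. */
_Static_assert(sizeof(struct object) == sizeof(struct widget),
               "object and widget differ in size");
_Static_assert(offsetof(struct object, field) == offsetof(struct widget, field),
               "field is at different offsets");
_Static_assert(offsetof(struct object, pointer) == offsetof(struct widget, pointer),
               "pointer is at different offsets");

/* Hypothetical helper: returns 1 when no padding sits between the members. */
int pointer_follows_field(void)
{
    return offsetof(struct object, pointer) == sizeof(int);
}
```

These checks only establish that the layouts match. They do not make it valid to access a struct widget through a struct object pointer.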
If you know two objects have identical layout and want to use one as the other without violating the aliasing rule, you can do this by:
- Copying the bytes of one into the other, as with memcpy(p, c, sizeof *p);.
- Defining a union containing both types, initializing it with one type, and accessing the member of the other type. (This is defined by the C standard but not by the C++ standard.)
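Both options can be sketched with the structs from the question. The names object_from_widget, object_or_widget and field_via_union are hypothetical, and the sketch assumes the two structs really do have the same layout.

```c
#include <stddef.h>  /* NULL */
#include <string.h>  /* memcpy */

struct object { int field; void *pointer; };
struct widget { int field; void *pointer; };

/* Option 1: copy the bytes into a genuine struct object. */
struct object object_from_widget(const struct widget *w)
{
    struct object o;
    _Static_assert(sizeof o == sizeof *w, "layouts must have the same size");
    memcpy(&o, w, sizeof o);
    return o;
}

/* Option 2: store one member of a union and read the other (valid in C). */
union object_or_widget {
    struct object obj;
    struct widget wgt;
};

int field_via_union(int value)
{
    union object_or_widget u = { .wgt = { value, NULL } };
    return u.obj.field;  /* reads the bytes written through u.wgt */
}
```

With the first option, consume() from the question could be handed the address of the copied struct object instead of a cast pointer to the widget.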
CodePudding user response:
They will most likely have the same layout; that will be part of the compiler's ABI.
The relevant architecture and/or OS may have a standard ABI that may or may not include a specification for packed. But the compiler will have its own ABI to lay them out in a predictable fashion, although the algorithm may not be written down precisely anywhere except the compiler source code.
However, that does not mean your code is safe. The strict aliasing rule applies to pointers to different types, whether or not they have the same layout.
Here is an example that can be compiled with gcc -O2:
#include <stdio.h>

// GCC expects the attribute after the struct keyword; placed before it, the attribute is ignored
struct __attribute__((packed)) object {
    int field;
    void *pointer;
};

struct __attribute__((packed)) widget {
    int field;
    void *pointer;
};

struct widget *some_widget;

__attribute__((noipa)) // prevent inlining which hides the bug
void consume(struct object *obj)
{
    some_widget->field = 42;
    int val = obj->field;
    printf("%i\n", val);
}

int main(void) {
    struct widget wgt = {3, NULL};
    some_widget = &wgt;
    consume((struct object *)&wgt);
}
You are probably expecting this code to print 42, because some_widget and obj both point to wgt and thus val = obj->field should read the same int that was written by some_widget->field = 42. But in fact it prints 3. The compiler is allowed to assume that obj and some_widget do not alias, as they have different types; so the write and the read are considered independent and may be reordered.
On the level of the standard, you are accessing the object wgt, whose effective type is struct widget, through the lvalue *obj, whose type is struct object. These types are not compatible because they have different tags (widget vs object), and so the behavior is undefined.
CodePudding user response:
To make them compatible, you need to typedef them. They will then be compatible in a portable way.
typedef struct {
    int field;
    void *pointer;
} object;
typedef struct {
    int field;
    void *pointer;
} widget;
6.2.7: Moreover, two structure, union, or enumerated types declared in separate translation units are compatible if their tags and members satisfy the following requirements: If one is declared with a tag, the other shall be declared with the same tag. If both are completed anywhere within their respective translation units, then the following additional requirements apply: there shall be a one-to-one correspondence between their members such that each pair of corresponding members are declared with compatible types; if one member of the pair is declared with an alignment specifier, the other is declared with an equivalent alignment specifier; and if one member of the pair is declared with a name, the other is declared with the same name. For two structures, corresponding members shall be declared in the same order. For two structures or unions, corresponding bit-fields shall have the same widths. For two enumerations, corresponding members shall have the same values.
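“I would violate the strict aliasing rule, and thus have an undefined behaviour.”
Not if the two types are compatible under the rule quoted above, which covers declarations in separate translation units. Within a single translation unit, each tagless struct declaration is still a distinct type.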
The gcc, IAR and Keil packing extensions will not make them incompatible.