In C, suppose for a pointer p we do *p = 0
. If p points to an int variable, is this defined behavior?
You can do arithmetic resulting in pointing one past the end of an "array object" per the standard, but I am unable to find a really precise definition of "array object" in the standard. I don't think in this context it means just an object explicitly defined as an array, because p=malloc(sizeof(int)); p;
pretty clearly is intended to be defined behavior.
If a variable does not qualify as an "array object", then as far as I can tell *p = 0
is undefined behavior.
I am using the C23 draft, but an answer citing the C11 standard would probably answer the question too.
CodePudding user response:
Yes it is well-defined. Pointer arithmetic is defined by the additive operators so that's where you need to look.
C17 6.5.6/7
For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.
That is, int x;
is to be regarded as equivalent to int x[1];
for the purpose of determining valid pointer arithmetic.
Given int x; int* p = &x; *p = 0;
then it is fine to point 1 item past it but not to de-reference that item:
C17 6.5.6/8
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
This behavior has not changed in the various revisions of the standard. It's the very same from C90 to C23.
CodePudding user response:
There are two separate questions: 1. What constructs does the Standard specify that correct conforming implementations should process meaningfully, and 2. What constructs do clang and gcc actually process meaningfully. The clear intention of the Standard is to define the behavior of a pointer "one past" an array object and a pointer to the start of another array object that happens to immediately follow it. The actual behavior of clang and gcc tells another story, however.
Given the source code:
#include <stdint.h>
extern int x[],y[];
int test1(int *p)
{
y[0] = 1;
if (p == x 1)
*p = 2;
return y[0];
}
int test2(int *p)
{
y[0] = 1;
uintptr_t p1 = 3*(uintptr_t)(x 1);
uintptr_t p2 = 5*(uintptr_t)p;
if (5*p1 == 3*p2)
*p = 2;
return y[0];
}
both clang and gcc will recognize in both functions that the *p=2
assignment will only run if p
happens to be equal to a one-past pointer to x
, and will conclude as a consequence that it would be impossible for p
to equal y
. Construction of an executable example where clang and gcc would erroneously make this assumption is difficult without the ability to execute a program containing two compilation units, but examination of the generated machine code at https://godbolt.org/z/x78GMqbrv will reveal that every ret
instruction is immediately preceded by mov eax,1
, which loads the return value with 1.
Note that the code in test2 doesn't compare pointers, nor even compare integers that are directly formed from pointers, but the fact that clang and gcc are able to show that the numbers being compared can only be equal if the pointers happened to be equal is sufficient for test2() to, as perceived by clang or gcc, invoke UB if the function is passed a pointer to y
, and y
happens to equal x 1
.