Why is this code involving arrays and pointers behaving as it does?

Time:10-08

int a[5] = { 1, 3, 5, 7, 9 };
int *p = (int *)(&a + 1);
printf("%d, %d", *(a + 1), *(p - 1));

The answer is No. 1.

  1. 3, 9
  2. Error
  3. 3, 1
  4. 2, 1

It is easy to see that *(a + 1) is 3.

But what about int *p = (int *)(&a + 1); and *(p - 1)?

CodePudding user response:

The answer to this could be either "1) 3,9" or "2) Error" (or more specifically undefined behavior) depending on how you read the C standard.

First, let's take this:

&a + 1

The & operator takes the address of the array a giving us an expression of type int(*)[5] i.e. a pointer to an array of int of size 5. Adding 1 to this treats the pointer as pointing to the first element of an array of int [5], with the resulting pointer pointing to just after a.

Also, even though &a points to a singular object (in this case an array of type int [5]) we can still add 1 to this address. This is valid because 1) a pointer to a singular object can be treated as a pointer to the first element of an array of size 1, and 2) a pointer may point to one element past the end of an array.

Section 6.5.6p7 of the C standard states the following regarding treating a pointer to an object as a pointer to the first element of an array of size 1:

For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

And section 6.5.6p8 says the following regarding allowing a pointer to point to just past the end of an array:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.
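The two quoted paragraphs can be illustrated with a single object that is not part of any array. This is only a minimal sketch; the function name one_past_round_trip is made up for the illustration and does not come from the question.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if stepping one past a lone int and back again
   lands on the same object. */
bool one_past_round_trip(void)
{
    int x = 42;           /* not an array element: behaves like int[1] (6.5.6p7) */
    int *end = &x + 1;    /* one past the end: fine to form, not to dereference */
    int *back = end - 1;  /* points at x again (6.5.6p8) */

    assert(back == &x);
    return *back == 42;
}
```

Calling one_past_round_trip() from main should return true. The same reasoning applies to &a + 1, where the "single object" is the whole int [5].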

Now comes the questionable part, which is the cast:

(int *)(&a + 1)

This converts the pointer of type int(*)[5] to type int *. The intent here is to change the pointer which points to the end of the 1-element array of int [5] to the end of the 5-element array of int.

However, the C standard isn't clear on whether this conversion and the subsequent operation on the result are allowed. It does allow conversion from one object pointer type to another and back, assuming the pointer is properly aligned. While alignment shouldn't be an issue here, using the converted pointer is questionable.

So this pointer is assigned to p:

int *p = (int *)(&a + 1)

Which is then used as follows:

*(p - 1)

If we assume that p validly points to one element past the end of the array a, subtracting 1 from it results in a pointer to the last element of the array. The * operator then dereferences this pointer to the last element, yielding the value 9.

So if we assume that (int *)(&a + 1) results in a valid pointer, then the answer is 1) 3, 9; otherwise the answer is 2) Error.
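As a rough sketch of the two readings, the functions below compute the last element once through the cast from the question and once using only int * arithmetic, which avoids the debated conversion. The function names are invented for this example. On mainstream compilers both are expected to yield 9, but that observation does not settle what the standard guarantees for the first one.

```c
#include <assert.h>

/* The quiz expression: step one past the whole array, cast, step back one int. */
int last_via_array_pointer(void)
{
    int a[5] = { 1, 3, 5, 7, 9 };
    int *p = (int *)(&a + 1);   /* the conversion whose status is debated */
    return *(p - 1);
}

/* Same destination, reached with int * arithmetic only. */
int last_via_element_pointer(void)
{
    int a[5] = { 1, 3, 5, 7, 9 };
    int *p = a + 5;             /* one past the last element, allowed by 6.5.6p8 */
    return *(p - 1);
}

/* Compares the two routes; assumes the cast behaves as it does in practice. */
void check_quiz(void)
{
    assert(last_via_array_pointer() == last_via_element_pointer());
}
```

If the goal is simply "the last element", a[4] or *(a + 5 - 1) says the same thing without relying on the cast.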

CodePudding user response:

This:

&a + 1;

takes the address of a, an array, and adds 1, which advances the pointer by the size of one a, i.e. 5 integers. The p - 1 then "backs down" by one integer, ending up at the final element of a.

CodePudding user response:

Normally whenever arrays are used in expressions, they "decay" into a pointer to the first element. There are a few exceptions to this rule and one such exception is the & operator.

&a therefore yields a pointer to the array, of type int (*)[5]. Then &a + 1 is pointer arithmetic on such a type, meaning the pointer address is increased by the size of one int [5]. We end up pointing just beyond the array, but C actually allows us to do that as long as we don't de-reference that location.

Then the pointer is converted to int * by the cast, which we can do too: C allows pretty much any manner of wild pointer conversions as long as we don't de-reference or cause misalignment etc.

p - 1 then does pointer arithmetic on an int *, so it steps back by one int. The data stored at that location really is an int, so we are allowed to de-reference it. We end up at the last item of the array.
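The types named in this answer can be checked at compile time with C11's _Generic. The sketch below assumes a C11 compiler; the TYPE_NAME macro and check_types function are hypothetical helpers written for this illustration.

```c
#include <assert.h>
#include <string.h>

/* Maps the type of an expression to a readable name (C11 _Generic).
   The expression itself is not evaluated. */
#define TYPE_NAME(expr) _Generic((expr), \
    int *:      "int *",                 \
    int (*)[5]: "int (*)[5]",            \
    default:    "other")

void check_types(void)
{
    int a[5] = { 1, 3, 5, 7, 9 };

    /* In arithmetic, a decays to a pointer to its first element. */
    assert(strcmp(TYPE_NAME(a + 0), "int *") == 0);
    /* & is one of the exceptions: no decay, so we get a pointer to the array. */
    assert(strcmp(TYPE_NAME(&a), "int (*)[5]") == 0);
    /* Adding 1 keeps the pointer-to-array type. */
    assert(strcmp(TYPE_NAME(&a + 1), "int (*)[5]") == 0);
    /* The cast changes the type to int *. */
    assert(strcmp(TYPE_NAME((int *)(&a + 1)), "int *") == 0);
}
```

check_types() only inspects types; it never de-references the one-past-the-end pointer.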

CodePudding user response:

In the line

int *p = (int *)(&a + 1);

note that &a is being written, not a. This is important.

If simply a had been written, then the array would have decayed to a pointer to the first element. However, since &a was written instead, the result of the expression has the same value as a, but the type is different: The type is a pointer to an array of 5 int elements, not a pointer to a single int element.

According to the rules on pointer arithmetic, incrementing a pointer by 1 will increase the memory address by the size of the object it is pointing to. Since the pointer is not pointing to a single element, but to an array of 5 elements, the memory address will be incremented by 5 * sizeof(int). Therefore, after incrementing the pointer, the value of (but not type of) the pointer will be equivalent to &a[5], i.e. one past the end of the array.
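The difference in step size can be measured directly by comparing addresses as bytes. This is a small sketch; byte_distance and the other function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes between two addresses inside (or one past) the same object. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}

/* How far &a + 1 moves: one whole int[5]. */
ptrdiff_t array_pointer_step(void)
{
    int a[5] = { 1, 3, 5, 7, 9 };
    return byte_distance(&a, &a + 1);
}

/* How far a + 1 moves: one int. */
ptrdiff_t element_pointer_step(void)
{
    int a[5] = { 1, 3, 5, 7, 9 };
    return byte_distance(a, a + 1);
}

void check_steps(void)
{
    assert(array_pointer_step() == (ptrdiff_t)sizeof(int[5]));
    assert(element_pointer_step() == (ptrdiff_t)sizeof(int));
}
```

On a platform where int is 4 bytes, the two steps would be 20 and 4 bytes respectively. The code itself only relies on the ratio being 5.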

After casting this pointer to int * and assigning the result to p, the expression p is fully equivalent to &a[5] (both in value and in type).

Therefore, the expression *(p - 1) is equivalent to *(&a[5] - 1), which is equivalent to *(&a[4]), or simply a[4].
