C pointers arithmetic for arrays

Time:01-30

I'm reading the section on array arithmetic in K&R and came across something curious. I posted the whole paragraph for context, but I'm mainly focused on the bold part:

If p and q point to members of the same array, then relations like ==, !=, <, >=, etc. work properly. For example, p < q is true if p points to an earlier member of the array than q does. Any pointer can be meaningfully compared for equality or inequality with zero. But the behavior is undefined for arithmetic or comparisons with pointers that do not point to members of the same array. (There is one exception: the address of the first element past the end of an array can be used in pointer arithmetic.)

I got some answers here (C pointer arithmetic for arrays) but I have doubts described below:

I'm not convinced by this, since the following code seems to work, both dereferencing and comparing, without raising any exception or error:

#include <stdio.h>
    
int main() {
    int a[5] = { 1, 2, 3, 4, 5 };
    int b[5] = { 1, 2, 3, 4, 5 };
    int *p = &a[7];
    int *q = &b[3];
    printf("%d\n", p);
    printf("%d\n", q);
    printf("%d\n", q > p); // relational from different arrays
    printf("%d", *p);      // dereferencing also seems to work
}

Can anyone help with this?

I expected this code to raise an error.

CodePudding user response:

Your code has undefined behavior in multiple places, and the C language does not define what happens in that case: anything can happen. No exception or error is required to be raised; the program may crash, produce unexpected results, or seem to work and produce the expected results. Nothing can be relied upon.

There is undefined behaviour in these places:

  • int *p = &a[7]; you compute the address of a nonexistent element beyond the end of the a array, and not the element just past the end (&a[5]), which is the only out-of-range address that may be formed.
  • printf("%d\n", p); you pass a pointer to int where printf expects an int. You should write printf("%p\n", (void *)p);
  • printf("%d\n", q); same as above
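  • printf("%d\n", q > p); using the value of p, which was not initialized from a valid expression; in addition, p and q do not point into the same array, so the relational comparison is undefined anyway.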
  • printf("%d", *p); dereferencing invalid pointer p.
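One way to avoid the first and last problems is to check the index before any address is formed. The following is only a minimal sketch; elem_or_null is a hypothetical helper name, not a standard function, and it assumes the caller knows the array length:

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer to arr[i], or NULL when i is
   out of range. The index is checked first, so an out-of-bounds
   address such as &a[7] is never computed. */
const int *elem_or_null(const int *arr, size_t len, size_t i)
{
    return (i < len) ? &arr[i] : NULL;
}
```

With int a[5], elem_or_null(a, 5, 7) yields a null pointer that the caller can test before dereferencing, instead of an invalid pointer.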

Note that the C Standard is more precise than the K&R book about comparing pointers, both for order and for equality:

6.5.8 Relational operators

5 For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

6 When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.
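As an illustration of the cases this paragraph does define, here is a small sketch. The names sum_ints, struct pair and later_member_is_greater are made up for the example; they are not from K&R or the Standard:

```c
#include <stddef.h>

/* Sums n ints by walking a pointer up to the one-past-the-end address.
   end is only compared against, never dereferenced. */
int sum_ints(const int *arr, size_t n)
{
    const int *end = arr + n;   /* one past the last element: valid to form */
    int total = 0;
    for (const int *p = arr; p < end; ++p)   /* same array, so < is defined */
        total += *p;
    return total;
}

struct pair { int first; int second; };   /* hypothetical type */

/* Both pointers refer to members of the same structure object, so the
   relational comparison is defined; the later member compares greater. */
int later_member_is_greater(const struct pair *s)
{
    return &s->second > &s->first;
}
```

Both functions only ever compare pointers that belong to the same array or the same structure, which is exactly the situation the quoted paragraph covers.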

6.5.9 Equality operators

6 ... If one operand is a pointer and the other is a null pointer constant, the null pointer constant is converted to the type of the pointer. If one operand is a pointer to an object type and the other is a pointer to a qualified or unqualified version of void, the former is converted to the type of the latter.

7 Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.
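The equality rules can be sketched the same way. Again, struct point and the three function names below are assumptions made for the example only:

```c
#include <stddef.h>

struct point { int x; int y; };   /* hypothetical type */

/* A pointer to an object and a pointer to the subobject at its
   beginning compare equal (both converted to const void * here). */
int struct_and_first_member_equal(const struct point *pt)
{
    return (const void *)pt == (const void *)&pt->x;
}

/* Two pointers one past the last element of the same array compare equal.
   &arr[n] is equivalent to arr + n; nothing is dereferenced. */
int one_past_end_equal(const int *arr, size_t n)
{
    const int *e1 = arr + n;
    const int *e2 = &arr[n];
    return e1 == e2;
}

/* Two null pointers compare equal, whatever their original types. */
int both_null_equal(void)
{
    const int *a = NULL;
    const char *b = NULL;
    return (const void *)a == (const void *)b;
}
```

Each function returns 1 for any valid argument, since every comparison falls under one of the cases listed in paragraph 7.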

Here is a modified version with defined behavior:

#include <stdio.h>
    
int main() {
    int a[5] = { 1, 2, 3, 4, 5 };
    int b[5] = { 1, 2, 3, 4, 5 };
    int *p = &a[3];
    int *q = &a[5];
    int *r = &b[0];

    // printing pointer values
    printf("p:      %p\n", (void *)p);
    printf("q:      %p\n", (void *)q);
    printf("r:      %p\n", (void *)r);

    // p, q and r can be compared to 0
    printf("p == 0: %d\n", p == 0);  // outputs 0
    printf("q != 0: %d\n", q != 0);  // outputs 1
    printf("!r:     %d\n", !r);      // outputs 0

    // p and q can be compared for equality and order
    printf("p < q:  %d\n", p < q);  // outputs 1
    printf("p != q: %d\n", p != q); // outputs 1

    // p and r can only be compared for equality
    printf("p != r: %d\n", p != r); // outputs 1

    // q and r can only be compared for equality, but the result is unspecified:
    //   it is compiler dependent because it depends on the memory layout
    printf("q != r: %d\n", q != r); // may output 1 or 0
    return 0;
}

CodePudding user response:

You are working with undefined behavior. This means you can't rely on it: it may work well one day and not work at all on another. No matter what, you can't trust it.

And that's why undefined behavior is so dangerous. Sometimes it may work fine until failing horribly one day with no prior warning.
