What is the effect of `c = (int *) ((char *) c + 1)`?

Time:10-19

I met this question in an OS course. Here is the code from the 6.828 (Operating System) online course. It is meant to let learners practice pointers in the C programming language.

#include <stdio.h>
#include <stdlib.h>

void
f(void)
{
    int a[4];
    int *b = malloc(16);
    int *c;
    int i;

    printf("1: a = %p, b = %p, c = %p\n", a, b, c);

    c = a;
    for (i = 0; i < 4; i++)
        a[i] = 100 + i;
    c[0] = 200;
    printf("2: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    c[1] = 300;
    *(c + 2) = 301;
    3[c] = 302;
    printf("3: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    c = c + 1;
    *c = 400;
    printf("4: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    c = (int *) ((char *) c + 1);
    *c = 500;
    printf("5: a[0] = %d, a[1] = %d, a[2] = %d, a[3] = %d\n",
       a[0], a[1], a[2], a[3]);

    b = (int *) a + 1;
    c = (int *) ((char *) a + 1);
    printf("6: a = %p, b = %p, c = %p\n", a, b, c);
}

int
main(int ac, char **av)
{
    f();
    return 0;
}

I copied it to a file and compiled it with gcc, then got this output:

$ ./pointer 
1: a = 0x7ffd3cd02c90, b = 0x55b745ec72a0, c = 0x7ffd3cd03079
2: a[0] = 200, a[1] = 101, a[2] = 102, a[3] = 103
3: a[0] = 200, a[1] = 300, a[2] = 301, a[3] = 302
4: a[0] = 200, a[1] = 400, a[2] = 301, a[3] = 302
5: a[0] = 200, a[1] = 128144, a[2] = 256, a[3] = 302
6: a = 0x7ffd3cd02c90, b = 0x7ffd3cd02c94, c = 0x7ffd3cd02c91

I can easily understand the output of lines 1, 2, 3 and 4, but it's hard for me to understand the output of line 5. In particular, why are a[1] = 128144 and a[2] = 256?
It seems this output is the result of

c = (int *) ((char *) c + 1);
*c = 500;

I have trouble understanding what the code c = (int *) ((char *) c + 1) does. c is a pointer, declared as int *c. Before the 5th line of output, c points to the second element of array a, because of c = a and c = c + 1. So what is the meaning of (char *) c, of ((char *) c + 1), and then of (int *) ((char *) c + 1)?

CodePudding user response:

This is a result of undefined behavior. After c = (int *) ((char *) c + 1), c holds an address that is not correctly aligned for an int, and *c = 500 then stores an int through it. The C standard leaves both the conversion to a misaligned int * and the store through it undefined. Separately, c is printed in line 1 before it has been assigned, so its value there is indeterminate. Compiling with -Wall -pedantic -std=c99 may produce a warning for the uninitialized c.

To answer your question about (char *) c and ((char *) c + 1):

(char *) c: c has type int * (pointer to int). The cast does not change the address; it only changes the type to char *, so the result points at the first byte of the int that c points to. At this point c points to a[1], the second element of the array a, so (char *) c is the address of the first byte of a[1].

((char *) c + 1): arithmetic on a char * moves one byte at a time, so this is the address one byte past the start of a[1]. That is not the same as c + 1, which is &c[1] and lies sizeof(int) bytes further on.

CodePudding user response:

Although this is undefined behavior per the standard, it has a clear meaning in "ancient C", and it clearly works that way on the machine/compiler you're working with.

First, it casts c to a (char *), which means that pointer arithmetic will work in units of sizeof(char) (i.e. one byte) instead of sizeof(int). Then it adds one byte. Then it converts the result back to (int *). The result is an int pointer that now refers to an address one byte higher than it used to. Since c was pointing at a[1] beforehand, afterwards *c = 500 will write to the last three bytes of a[1] and the first byte of a[2].

On many machines (but not x86) this is an outright illegal thing to do. An unaligned access like that would simply crash your program. The C standard goes further and says that that code is allowed to do anything: when the compiler sees it, it can generate code that crashes, does nothing, writes to a completely unrelated bit of memory, or causes a small gnome to pop out of the side of your monitor and hit you with a mallet. However, sometimes the easiest thing to do in the case of UB is also the straightforward obvious thing, and this is one of those cases.

Your course material is trying to show you something about how numbers are stored in memory, and how the same bytes can be interpreted in different ways depending on what you tell the CPU. You should take it in that spirit, and not as a guide to writing decent C.
