C pointer question for my understanding: decimal + decimal[2]

Hello, I have 3 questions about 3 lines of code that I don't understand. It would be nice if someone could help me; my understanding is still far from perfect. The questions are marked in the comments as // QUESTION:

#include <stdio.h>
#include <stdlib.h>
struct trace {
    char *sign;
    int *values;
    struct trace *pN;
};

int main() {
    int decimal[] = {4,2,1};
    char text[]="Word-2!";
    struct trace *pV;
    pV = (struct trace*) calloc(2, sizeof(struct trace));
    pV->pN = pV;
    pV->values = decimal + decimal[2]; // = 2

    // QUESTION: explanation why decimal + decimal[2] is 2 / what is decimal (not *decimal).
    // My guess is: decimal[0 + decimal[2]] = decimal[0 + 1] = decimal[1] = 2

    (*pV).sign = text + *decimal;  // text + 4  // "-2!"
    *(pV + 1) = pV[0]; // pV[1] = pV[0] = *pV
    ++pV[1].values;         // QUESTION: what does this do? the ++ in front of pV instead of pV[++1].values
    ++*pV[1].values;        // QUESTION: what does this do?
    printf("%d %s\n", *pV->values, pV->sign);
    printf("%d %s\n",*pV->pN[1].values, pV->pN[1].sign);

    return 0;
}

edit: the goal of this is to find out what those 2 printf calls display; both print "2 -2!".

CodePudding user response：

// QUESTION: explanation why decimal + decimal[2] is 2 / what is decimal (not *decimal).
// My guess is: decimal[0 + decimal[2]] = decimal[0 + 1] = decimal[1] = 2

Strictly speaking, decimal + decimal[2] isn't 2. It is a pointer to memory containing 2: decimal on its own points to decimal[0], and adding decimal[2] (which is 1) to that pointer yields a pointer to decimal[1], which holds 2.

For the next two questions it is useful to look at an operator precedence table.

++pV[1].values; // it's basically ++(pV[1].values), i.e. incrementing the pointer 'values'
++*pV[1].values; // it's basically ++(*(pV[1].values)), i.e. incrementing the integer that 'values' points to

CodePudding user response：

There's a good reason you're having trouble understanding that code; it's obnoxious. It's trying to illustrate some of the weirder behaviors of pointers and arrays, but it's done in a way that's excessively "tricky" and difficult to understand. It mixes and matches array and pointer notation, and it's inconsistent in how it accesses members. It's also unsafe as hell. It's a good example of how not to write C code.

Before we start, a little syntax cheat sheet:

a[i] == *(a + i), therefore
a[0] == *(a + 0) == *a

p->m == (*p).m == (*(p + 0)).m == p[0].m

So:

pV->values = decimal + decimal[2];

TL/DR - this is setting pV->values (which is the same as (*pV).values, which is the same as pV[0].values) to point to the second element of the decimal array; graphically, it would look like this:

   +---+        +---+                          +---+
pV:|   | -----> |   | pV[0].sign      decimal: | 4 | decimal[0]
   +---+        +---+                          +---+
                |   | pV[0].values ----------> | 2 | decimal[1]
                +---+                          +---+
                |   | pV[0].pN                 | 1 | decimal[2]
                +---+                          +---+
                |   | pV[1].sign
                +---+
                |   | pV[1].values
                +---+
                |   | pV[1].pN
                +---+

It's equivalent to writing

pV->values = &decimal[1];

In this context, the expression decimal "decays" from type "3-element array of int" to type "pointer to int", and the value of the expression is the address of the first element of the array (we'll get into why this is later). To this pointer value we are adding the value stored in decimal[2], which is 1:

pV->values = decimal + 1;

Adding 1 to a pointer yields a pointer to the next object of the pointed-to type, which is not necessarily the next byte; if the address of decimal[0] is 0x8000 and an int is 4 bytes wide, then the result of the addition above is 0x8004, not 0x8001.

(*pV).sign = text + *decimal;

TL/DR - this sets pV->sign to point to the "-" character of the text string; *decimal is the same as decimal[0], which contains the value 4, so the above is equivalent to

(*pV).sign = &text[4];

By this point in the program, we have the following situation:

   +---+        +---+                            +---+
pV:|   | --+--> |   | pV[0].sign ---+   decimal: | 4 | decimal[0]
   +---+   |    +---+               |            +---+
           |    |   | pV[0].values ------------> | 2 | decimal[1]
           |    +---+               |            +---+
           +--- |   | pV[0].pN      |            | 1 | decimal[2]
                +---+               |            +---+
                |   | pV[1].sign    |
                +---+               |            +---+
                |   | pV[1].values  |      text: |'W'| text[0]
                +---+               |            +---+
                |   | pV[1].pN      |            |'o'| text[1]
                +---+               |            +---+
                                    |            |'r'| text[2]
                                    |            +---+
                                    |            |'d'| text[3]
                                    |            +---+
                                    +----------> |'-'| text[4]
                                                 +---+
                                                 |'2'| text[5]
                                                 +---+
                                                 |'!'| text[6]
                                                 +---+
                                                 | 0 | text[7]
                                                 +---+

++pV[1].values;

TL/DR - this sets pV[1].values to point to decimal[2].

The expression ++pV[1].values is parsed as ++(pV[1].values) - we're adding 1 to pV[1].values. Earlier in the program we copied the contents of pV[0] to pV[1], and we had set pV[0].values to point to decimal[1]. Like I said above, adding 1 to a pointer yields a pointer to the next object of the pointed-to type; hence, pV[1].values now points to decimal[2].

So now our picture looks like this:

   +---+        +---+                            +---+
pV:|   | --+--> |   | pV[0].sign ---+   decimal: | 4 | decimal[0]
   +---+   |    +---+               |            +---+
           |    |   | pV[0].values ------------> | 2 | decimal[1]
           |    +---+               |            +---+
           +--- |   | pV[0].pN      |   +------> | 1 | decimal[2]
           |    +---+               |   |        +---+
           |    |   | pV[1].sign ---+   |
           |    +---+               |   |        +---+
           |    |   | pV[1].values -----+  text: |'W'| text[0]
           |    +---+               |            +---+
           +--- |   | pV[1].pN      |            |'o'| text[1]
                +---+               |            +---+
                                    |            |'r'| text[2]
                                    |            +---+
                                    |            |'d'| text[3]
                                    |            +---+
                                    +----------> |'-'| text[4]
                                                 +---+
                                                 |'2'| text[5]
                                                 +---+
                                                 |'!'| text[6]
                                                 +---+
                                                 | 0 | text[7]
                                                 +---+

++*pV[1].values;

TL/DR - we are incrementing the value of decimal[2].

Similar to the earlier expression, ++*pV[1].values is parsed as ++(*(pV[1].values)). Instead of adding 1 to pV[1].values, we are adding 1 to the thing pV[1].values points to, which is decimal[2]; its value changes from 1 to 2. So finally, after everything is said and done, our picture looks like this:

   +---+        +---+                            +---+
pV:|   | --+--> |   | pV[0].sign ---+   decimal: | 4 | decimal[0]
   +---+   |    +---+               |            +---+
           |    |   | pV[0].values ------------> | 2 | decimal[1]
           |    +---+               |            +---+
           +--- |   | pV[0].pN      |   +------> | 2 | decimal[2]
           |    +---+               |   |        +---+
           |    |   | pV[1].sign ---+   |
           |    +---+               |   |        +---+
           |    |   | pV[1].values -----+  text: |'W'| text[0]
           |    +---+               |            +---+
           +--- |   | pV[1].pN      |            |'o'| text[1]
                +---+               |            +---+
                                    |            |'r'| text[2]
                                    |            +---+
                                    |            |'d'| text[3]
                                    |            +---+
                                    +----------> |'-'| text[4]
                                                 +---+
                                                 |'2'| text[5]
                                                 +---+
                                                 |'!'| text[6]
                                                 +---+
                                                 | 0 | text[7]
                                                 +---+

So why do array expressions "decay" into pointer expressions?

C is derived from an earlier programming language named B - in B, when you declared an array, the compiler would set aside a separate word to store the offset to the first element of the array. Given the declaration

auto a[5];

you'd have something like this in memory:

   +---+
a: |   | ----------+
   +---+           |
    ...            |
   +---+           |
   |   | a[0] <----+
   +---+
   |   | a[1]
   +---+
   |   | a[2]
   +---+
   |   | a[3]
   +---+
   |   | a[4]
   +---+

The array subscript operation a[i] was defined as *(a + i) - given an address stored in a, offset i words from that address and dereference the result.

Ritchie wanted to keep B's array behavior in C (a[i] == *(a + i)), but he didn't want to store the separate pointer that behavior required. Instead, we have this rule:

6.3.2.1 Lvalues, arrays, and function designators
...
3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue. If the array object has register storage class, the behavior is undefined.

(C 2011 Online Draft)

When you declare an array in C like

int a[5];

you get this in memory:

   +---+
a: |   | a[0]
   +---+
   |   | a[1]
   +---+
   |   | a[2]
   +---+
   |   | a[3]
   +---+
   |   | a[4]
   +---+

The array subscript operation a[i] is still defined as *(a + i), but instead of storing a pointer value in a separate object named a, a pointer value is computed as necessary. This is why the expressions decimal and text ultimately evaluate to pointer values.