notation confusion in C when variable is a pointer

Time:06-10

I am confused about a notation in C when I have a pointer variable f pointing to a struct X defined as:

struct Y {
    int d;
    struct X *e;
};

struct X {
    int *a;
    int b[4];
    struct Y c;
};

Then I have this:

f->c.e->c.e[0].a[0]

The thing I don't understand is the part c.e[0].a[0].

I am not sure what c.e[0] is, and then what c.e[0].a[0] is. I am also not sure whether c.e[0] is at offset 20 from the starting address of a struct X. Assume here that a pointer is 4 bytes and an int is 4 bytes, so int *a, int b[4], int d gives an offset of 20? Is that the meaning of f->c.e->c.e[0]? Are there also f->c.e->c.e[3], f->c.e->c.e[4], f->c.e->c.e[5]?

I am confused because for a pointer variable, say k, I usually see k->x, k->y, k->l used to refer to the members of the struct that k points to. In this case, however, I see the notation c.e->c.e[0].a[0]. Is e[0].a[0] valid? I guess e[0] is not a pointer then: if it were a pointer, it would have to use the -> notation to refer to a member of the struct it is pointing to, but e[0].a[0] uses a dot (.) instead of an arrow (->). So e[0] in this case is not a pointer, right?

So I am a little confused about the meaning of c.e[0].a[0], given my struct X, struct Y, and the pointer variable f here.

CodePudding user response:

c.e is a pointer to a struct X, so c.e[0] is the struct X pointed to by c.e.

If c.e is a pointer to the first element of an array of 4 struct X, the 4 elements of this array could be referred to as c.e[0], c.e[1], c.e[2] and c.e[3].

For all pointers p, p[0] is equivalent to *p or *(p + 0) (or even 0[p]).

In this case, f->c.e->c.e[0].a[0] is equivalent to f->c.e->c.e->a[0] and to *f->c.e->c.e->a. Which syntax to use is a question of style and readability: the array syntax using [] is usually more readable when the index is or can be different from zero, while for pointers to single objects the -> syntax is preferred.
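To make the equivalence concrete, here is a minimal sketch that uses the struct definitions from the question. The three objects first, mid and last, the value 42, and the function name read_through_chain are made up for illustration; they are not part of the original question.

```c
#include <assert.h>
#include <stddef.h>

struct X;                 /* forward declaration so struct Y can point to it */

struct Y {
    int d;
    struct X *e;
};

struct X {
    int *a;
    int b[4];
    struct Y c;
};

/* Builds a chain first -> mid -> last and reads last.a[0]
   through the three equivalent spellings. Returns the value read. */
int read_through_chain(void)
{
    int value = 42;
    struct X last  = { &value, {0}, { 0, NULL  } };
    struct X mid   = { NULL,   {0}, { 1, &last } };
    struct X first = { NULL,   {0}, { 1, &mid  } };
    struct X *f = &first;

    int v1 = f->c.e->c.e[0].a[0];   /* e[0] is a struct X, hence the dot */
    int v2 = f->c.e->c.e->a[0];     /* e is a pointer, hence the arrow   */
    int v3 = *f->c.e->c.e->a;       /* a[0] spelled as *a                */

    assert(v1 == v2 && v2 == v3);
    return v1;
}
```

Note that e[0] is indeed not a pointer: indexing a struct X * yields a struct X, which is why the dot is the right operator after it.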

The actual implementation details, such as pointer and integer sizes, are irrelevant here, but bear in mind that the offset of a member in a structure may be affected by alignment constraints. For example, on most current 64-bit systems an int still has 4 bytes but a pointer uses 8 bytes, so e in struct Y must be aligned on a multiple of 8 and its offset is 8, not 4: 4 padding bytes are inserted between d and e. Note also that if d is meant to store the number of elements in the array pointed to by e, it should probably be defined with type size_t.
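Rather than computing offsets by hand, you can ask the compiler with the standard offsetof macro from <stddef.h>. The sketch below is only an illustration: the helper name padding_before_e is hypothetical, and the result depends on the platform's sizes and alignment rules.

```c
#include <assert.h>
#include <stddef.h>

struct X;                 /* forward declaration so struct Y can point to it */

struct Y {
    int d;
    struct X *e;
};

/* Number of padding bytes the compiler placed between d and e.
   The value is implementation-defined: it depends on the size of int
   and on the alignment the platform requires for a pointer. */
size_t padding_before_e(void)
{
    size_t end_of_d   = offsetof(struct Y, d) + sizeof(int);
    size_t start_of_e = offsetof(struct Y, e);

    assert(start_of_e >= end_of_d);   /* members are laid out in order */
    return start_of_e - end_of_d;
}
```

With 4-byte ints and 4-byte pointers, as assumed in the question, there would typically be no padding at all; with 8-byte pointers the gap is the 4 bytes described above.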

CodePudding user response:

The confusion comes from the multiple ways of using pointers in C.

Declaring a simple array goes as follows:

<type> <name>[<size>];
// More concretely
int array_of_ints[5];
// Accessing its elements is straightforward
array_of_ints[0] = 42;

But what if you can't know the size in advance? You'd have to allocate memory with e.g. malloc which gives you a pointer to the beginning of the array:

int * array_of_ints = malloc(sizeof(int) * 5);
// Access its elements the same way as arrays declared on the stack
array_of_ints[0] = 42;

How come the same syntax can be used for both types (int[5] vs. int *)?

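This is because in most expressions an array is converted ("decays") to a pointer to its first element, so indexing works the same way on both. The array itself is not a pointer (sizeof array, for example, still gives the size of the whole array), but passing it to a function passes a pointer to its first element, as we can see here: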

#include <stdio.h>

void foo(int * array)
{
    printf("%d\n", *array); // *array and array[0] are interchangeable
}

void bar(void)
{
    int array[5];
    array[0] = 42;
    foo(array); // array decays to a pointer to its first element
}

So an array decays into a pointer. And the language lets you use the [] operator on pointers because the compiler actually translates it using pointer arithmetic:

array[0] is the same as *array but, more precisely, the same as *(array + 0).
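The same rule holds for any index, not just 0. A small sketch of this (the function names same_element and demo_indexing are made up for the example):

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every spelling of "element i" names the same object. */
int same_element(const int *p, size_t i)
{
    return &p[i] == p + i        /* p[i] is defined as *(p + i)           */
        && &i[p] == p + i        /* addition commutes, so i[p] also works */
        && p[i] == *(p + i);
}

/* Checks the equivalence on a real array, which decays to a pointer
   when it is passed to same_element. */
int demo_indexing(void)
{
    int array[5] = {10, 20, 30, 40, 50};

    assert(array[3] == *(array + 3));
    return same_element(array, 0) && same_element(array, 3);
}
```

The i[p] form is legal but is only shown here as a curiosity; nobody writes it in real code.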

So how do you know if you have a pointer to a single value or a pointer to a value that's the first value of an array? Well, without context, you can't. From an isolated function's perspective, you can't know if taking a char * parameter means it is a string argument or a pointer to a single char variable. It's up to the developers to make it clear, either by passing the size of the array along with it, or naming the variable correctly (char * str vs. char * c for instance), writing documentation, etc.
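As an illustration of the first option, passing the size along with the pointer, here is a minimal sketch. The names sum_ints and demo_sum are hypothetical; the point is only that the count parameter carries the information the pointer alone cannot.

```c
#include <assert.h>
#include <stddef.h>

/* The count parameter tells the function how many ints values points to. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* The same function works for an array (which decays to a pointer)
   and for a pointer to a single object, as long as count is right. */
long demo_sum(void)
{
    int many[4] = {1, 2, 3, 4};
    int one = 5;

    long a = sum_ints(many, sizeof many / sizeof many[0]); /* 4 elements */
    long b = sum_ints(&one, 1);                            /* 1 element  */

    assert(a == 10 && b == 5);
    return a + b;
}
```

Note that sizeof many / sizeof many[0] only works where many is still an array; inside sum_ints, values is a plain pointer and the element count cannot be recovered from it.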
