I am confused about a notation in C when I have a pointer variable f pointing to a struct X defined as:
struct Y {
    int d;
    struct X *e;
};
struct X {
    int *a;
    int b[4];
    struct Y c;
};
Then I have this:
f->c.e->c.e[0].a[0]
The thing I don't understand is the part c.e[0].a[0]. I am not sure what c.e[0] is, or what c.e[0].a[0] is. I am also not sure whether c.e[0] is at an offset of 20 from the starting address of a struct X. Assuming a pointer is 4 bytes and an integer is 4 bytes, do int *a, int b[4] and int d add up to an offset of 20? Is that the meaning of f->c.e->c.e[0]? Is there also f->c.e->c.e[3]? f->c.e->c.e[4]? f->c.e->c.e[5]?
I am confused because for a pointer variable, say k, I usually see k->x, k->y, k->l used to refer to the members of a struct when k points to a struct variable. In this case, however, I see the notation c.e->c.e[0].a[0]. Is e[0].a[0] valid? If e[0] were a pointer, it would have to use the -> notation to refer to a member of the struct it points to. Since e[0].a[0] uses the dot (.) instead of the arrow (->), I guess e[0] is not a pointer in this case, right?
So I am a little confused about what c.e[0].a[0] means, given my struct X, struct Y, and the pointer variable f here.
CodePudding user response:
c.e is a pointer to a struct X, so c.e[0] is the struct X pointed to by c.e.
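If c.e is a pointer to the first element of an array of 4 struct X, the 4 elements of this array can be referred to as c.e[0], c.e[1], c.e[2] and c.e[3]. Indexes 4 and above would be outside such an array, so accessing c.e[4] or c.e[5] would have undefined behavior.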
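As a rough illustration of the array case, the sketch below points c.e at the first of 4 struct X objects and walks them with e[i]. It assumes that d holds the element count, which the question does not state, and the names sum_first_b and demo_array_of_x are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

struct X;                          /* forward declaration for struct Y */

struct Y {
    int d;                         /* assumption: number of elements behind e */
    struct X *e;
};

struct X {
    int *a;
    int b[4];
    struct Y c;
};

/* Hypothetical helper: adds up b[0] of every element that owner->c.e points to. */
int sum_first_b(const struct X *owner)
{
    assert(owner != NULL);
    int total = 0;
    for (int i = 0; i < owner->c.d; i++) {
        /* c.e[i] is a struct X (not a pointer), so its members use '.' */
        total += owner->c.e[i].b[0];
    }
    return total;
}

/* Hypothetical setup: an array of 4 struct X reachable through owner.c.e. */
int demo_array_of_x(void)
{
    struct X items[4] = {
        { .b = {1} }, { .b = {2} }, { .b = {3} }, { .b = {4} }
    };
    struct X owner = { .c = { .d = 4, .e = items } };
    return sum_first_b(&owner);    /* valid indexes are 0..3 only */
}
```

Here c.e[0] through c.e[3] are valid only because items really has 4 elements; the type struct X * alone does not tell you how many objects sit behind the pointer.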
For all pointers p, p[0] is equivalent to *p or *(p + 0) (or even 0[p]).
In this case, f->c.e->c.e[0].a[0] is equivalent to f->c.e->c.e->a[0] and *f->c.e->c.e->a. Which syntax to use is a question of style and readability. The array syntax using [] is usually more readable when the index is or can be different from zero; for pointers to single objects, the -> syntax is preferred.
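The equivalence can be checked with a small sketch. The struct definitions are the ones from the question; the three chained objects and the function names spellings_agree and demo_chain are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>

struct X;                          /* forward declaration for struct Y */

struct Y {
    int d;
    struct X *e;
};

struct X {
    int *a;
    int b[4];
    struct Y c;
};

/* Reads the same int through three spellings; returns 1 if they all agree. */
int spellings_agree(struct X *f)
{
    assert(f != NULL);
    int v1 = f->c.e->c.e[0].a[0];  /* form from the question */
    int v2 = f->c.e->c.e->a[0];    /* e[0].a is the same as e->a */
    int v3 = *f->c.e->c.e->a;      /* a[0] is the same as *a */
    return v1 == v2 && v2 == v3;
}

/* Hypothetical setup: first -> second -> third, linked through c.e. */
int demo_chain(void)
{
    int value = 42;
    struct X third  = { .a = &value };
    struct X second = { .c = { .d = 1, .e = &third } };
    struct X first  = { .c = { .d = 1, .e = &second } };
    struct X *f = &first;
    return spellings_agree(f) ? f->c.e->c.e[0].a[0] : -1;
}
```

Reading the expression left to right: f->c is a struct Y, f->c.e is a pointer to the second struct X, ->c.e is a pointer to the third one, [0] dereferences it, .a is its int pointer, and the final [0] dereferences that.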
The actual implementation details, such as pointer and integer sizes, are irrelevant here, but bear in mind that the offset of a member in a structure may be affected by alignment constraints. For example, on most current 64-bit systems an int still has 4 bytes but a pointer uses 8 bytes, so the offset of e in struct Y must be aligned on a multiple of 8, hence it is 8, not 4: 4 padding bytes are inserted between d and e. Note also that if d is meant to store the number of elements in the array pointed to by e, it should probably be defined with type size_t.
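Rather than computing offsets by hand, you can ask the compiler with the standard offsetof macro from <stddef.h>. The sketch below is only an illustration: the function names are made up for this example, and the numbers it reports depend on the platform and ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct X;                          /* only a pointer to it is needed here */

struct Y {
    int d;
    struct X *e;
};

/* Offset of e inside struct Y, as chosen by the compiler for this target. */
size_t offset_of_e(void)
{
    return offsetof(struct Y, e);
}

/* Padding bytes inserted between d and e (may be 0 on some targets). */
size_t padding_before_e(void)
{
    return offsetof(struct Y, e) - sizeof(int);
}

/* Prints the layout; the values are platform dependent. */
void print_layout(void)
{
    /* e must fit entirely inside the struct, wherever it was placed */
    assert(offset_of_e() + sizeof(struct X *) <= sizeof(struct Y));
    printf("sizeof(int)          = %zu\n", sizeof(int));
    printf("sizeof(struct X *)   = %zu\n", sizeof(struct X *));
    printf("offsetof(Y, e)       = %zu\n", offset_of_e());
    printf("padding before e     = %zu\n", padding_before_e());
    printf("sizeof(struct Y)     = %zu\n", sizeof(struct Y));
}
```

Calling print_layout() from main on your own machine shows whether padding was inserted there.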
CodePudding user response:
The confusion comes from the multiple ways of using pointers in C.
Declaring a simple array goes as follows:
<type> <name>[<size>];
// More concretely
int array_of_ints[5];
// Accessing its elements is straightforward
array_of_ints[0] = 42;
But what if you can't know the size in advance? You'd have to allocate memory with e.g. malloc, which gives you a pointer to the beginning of the array:
int * array_of_ints = malloc(sizeof(int) * 5);
// Access its elements the same way as arrays declared on the stack
array_of_ints[0] = 42;
How come the same syntax can be used for both types (int[5] vs. int *)?
An array is not itself a pointer, but in most expressions its name is converted ("decays") to a pointer to its first element, so the compiler handles indexing the same way in both cases, as we can see here:
#include <stdio.h>

void foo(int * array)
{
    printf("%d\n", *array); // *array and array[0] are interchangeable
}
void bar(void)
{
    int array[5];
    array[0] = 42;
    foo(array); // array decays to a pointer to its first element
}
So an array can decay into a pointer. And the language lets you use the [] operator because the compiler actually translates it using pointer arithmetic: array[0] is the same as *array but, more precisely, the same as *(array + 0).
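The same rule holds for any index, not just zero. A minimal sketch, with made-up function names, that compares the spellings for every element of a small array:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if array[i], *(array + i) and i[array] agree for every i < n. */
int index_forms_agree(const int *array, size_t n)
{
    assert(array != NULL);
    for (size_t i = 0; i < n; i++) {
        /* i[array] is legal because a[b] is defined as *(a + b) */
        if (array[i] != *(array + i) || array[i] != i[array])
            return 0;
    }
    return 1;
}

/* Hypothetical usage: the local array decays to a pointer when passed. */
int demo_index_forms(void)
{
    int values[5] = {42, 7, 13, 99, 0};
    return index_forms_agree(values, 5);
}
```

The i[array] form is shown only to make the point about pointer arithmetic; it is rarely a good idea in real code.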
So how do you know if you have a pointer to a single value or a pointer to a value that's the first value of an array? Well, without context, you can't. From an isolated function's perspective, you can't know whether a char * parameter is a string argument or a pointer to a single char variable. It's up to the developers to make it clear, either by passing the size of the array along with it, by naming the variable appropriately (char * str vs. char * c, for instance), by writing documentation, etc.
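One common way to make the intent explicit is to pass the element count next to the pointer. The sketch below is just one possible convention; set_to_zero, sum_ints and demo_sum are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Pointer to a single int: the signature has no count, so exactly one
   object is expected. */
void set_to_zero(int *value)
{
    assert(value != NULL);
    *value = 0;
}

/* Pointer to the first of `count` ints: the count travels with the pointer. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Hypothetical usage of both conventions. */
long demo_sum(void)
{
    int single = 7;
    int many[3] = {1, 2, 3};
    set_to_zero(&single);
    /* sizeof works here because `many` is still an array in this scope */
    return sum_ints(many, sizeof many / sizeof many[0]) + single;
}
```

Both parameters have the same type, int *, so only the extra count argument and the names tell the reader which one is an array.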