On my machine, int is 4 bytes, long is 8 bytes, etc.
Hey, so I've run into a pretty interesting thing in C and started wondering how structures lay out their data internally. I thought a struct works like an array, but oh boy, I was wrong. Basically, I assumed the size of a struct is just the sum of its members' sizes, but I found out on Stack Overflow that compilers may insert padding because of the processor architecture's requirements. That's where alignment comes in. I found two links about alignment and tried to calculate my struct's size by hand. I experimented a bit, and I think I understand some of it but not all. That's why I'm creating this topic: I couldn't fully grasp some of the examples given by the people answering those questions. For example:
#include <stdio.h>
struct test {
    char a;
    char b;
    int c;
    long d;
    int e;
};
int main(void){
    printf("test = %d\n", sizeof(struct test));
    return 0;
}
Output:
test = 24
I was expecting the compiler to do an optimization like this:
char a is 1 byte and char b is 1 byte, thus we don't need to align.
char b is 1 byte and int c is 4 bytes, thus we need to align 3 bytes.
int c is 4 bytes and long d is 8 bytes, thus we need to align 4 bytes.
long d is 8 bytes and int e is 4 bytes, thus we need to align 4 bytes.
Up to this point the total size is 29. Rounding that up to the nearest even number gives 30. Why is it 24 then?
I've also found out that char a and char b together leave a padding of 2 bytes, so we only need to align 2 more bytes; maybe that's where I'm making a mistake. Also, if I add more variables:
#include <stdio.h>
struct test {
    char a;
    char b;
    int c;
    long d;
    int e;
    char f;
    char g;
    char h;
    char i;
};
int main(void){
    printf("test = %d\n", sizeof(struct test));
    return 0;
}
Output:
test = 24
The total size is still 24 bytes. But if I add one more variable:
#include <stdio.h>
struct test {
    char a;
    char b;
    int c;
    long d;
    int e;
    char f;
    char g;
    char h;
    char i;
    char j;
};
int main(void){
    printf("test = %d\n", sizeof(struct test));
    return 0;
}
Output:
test = 32
The size changes to a total of 32 bytes. Why? What exactly happens? Sorry if the answer to this question is obvious to you, but I truly don't understand. Also, I don't know whether this differs between compilers, so if I've left out some information, just tell me and I'll add it.
CodePudding user response:
It all comes down to alignment. The compiler wants to keep each element aligned to an address that's a multiple of that item's size, because the hardware can access it most efficiently that way. (And on some architectures, the hardware can only access it that way; unaligned accesses are disallowed.)
You've got one element in your structure that's a long int of size 8, so its alignment is going to drive everything else. Here's how your first structure would be laid out:
     0   1   2   3   4   5   6   7
    --- --- --- --- --- --- --- ---
 0 | a | b |  pad  |       c       |
    --- --- --- --- --- --- --- ---
 8 |               d               |
    --- --- --- --- --- --- --- ---
16 |       e       |    padding    |
    --- --- --- --- --- --- --- ---
So, as you can see, the size is 24, including two invisible, unnamed "padding" fields of 2 and 4 bytes, respectively.
Structure padding and alignment can be confusing. (It took me an embarrassingly large number of tries to get this answer right.) Fortunately, you usually don't have to worry about any of this, because it's the compiler's problem, not yours.
You can get the compiler to tell you how it's laying a structure out by using the offsetof macro (from <stddef.h>):
#include <stddef.h>
#include <stdio.h>
/* struct test is defined as in the question */
int main(void){
    printf("a @ %zu\n", offsetof(struct test, a));
    printf("b @ %zu\n", offsetof(struct test, b));
    printf("c @ %zu\n", offsetof(struct test, c));
    printf("d @ %zu\n", offsetof(struct test, d));
    printf("e @ %zu\n", offsetof(struct test, e));
    printf("size = %zu\n", sizeof(struct test));
    return 0;
}
On my machine (which seems to be behaving the same as yours) this prints:
a @ 0
b @ 1
c @ 4
d @ 8
e @ 16
size = 24
Notice that I have used %zu instead of %d, since sizeof and offsetof give their answers as type size_t, not int.
When you added char fields f, g, h, and i, they could fit into the second padding space, without making the overall structure any bigger. It was only when you added j that it pushed things over into another 8-byte chunk:
     0   1   2   3   4   5   6   7
    --- --- --- --- --- --- --- ---
 0 | a | b |  pad  |       c       |
    --- --- --- --- --- --- --- ---
 8 |               d               |
    --- --- --- --- --- --- --- ---
16 |       e       | f | g | h | i |
    --- --- --- --- --- --- --- ---
24 | j |          padding          |
    --- --- --- --- --- --- --- ---