Considering the below struct, what would be the most efficient/correct way for declaring/using a dynamic array of it:
typedef struct {
int a;
char b;
} Test;
Method a:
Test* tst = malloc(2 * sizeof *tst);
tst[0].a = 0;
tst[0].b = 1;
Method b:
Test **tst = malloc(2 * sizeof(Test*));
tst[0] = malloc(sizeof(Test));
tst[0]->a = 0;
tst[0]->b = 1;
CodePudding user response:
The first one, if the structs aren't very big. Modern CPUs hate dynamic allocation and you should try to avoid "chasing pointers" as much as possible.
Say the array's address is 123456. When you access an element in the first version of the code, the CPU requests the memory chip to tell it the data at address 123456 and then twiddles its thumbs for an hour (metaphorically speaking) until the data comes back. It will get 64 bytes (one cache line) at a time, so most likely 8 elements at a time, and it might even ask for the next 64 bytes (at address 123520) in advance, as well (this is called prefetching). So once you access the first element, you can reasonably guess that accessing the next several elements is fast because they're in the CPU cache.
In the second version of the code, the CPU requests 64 bytes from address 123456 (and maybe the next 64, too), but it consists of 16 pointers. Then it has to look at those pointers and request the data at that address too. If the pointer says the data is at address 654321 then it has to request address 654321 and twiddle its thumbs for another metaphorical hour. Waste of time! Worse, each item is a separate request. The next item won't be at address 654329 in the same cache line - it could be some totally different address, number 555555. So you don't get the other items into the cache for free.
It is possible to use prefetch instructions to tell the CPU to fetch the other items while it works on the first one, but it'll never be as fast.
Now, if the items are very big, then it can make some sense to use an array of pointers. If your items are bigger than say 256 bytes, then they won't fit in the same cache line or (probably) the prefetched cache lines, so you don't get a speed boost by having several items pre-loaded into the cache at once. Still you're not losing anything by putting them directly in the array, so you may as well. If they're really big like 100kB then the amount of wasted memory could be a problem and then you have a good reason to use pointers. Also if you have items with different sizes, then you can't put them directly in an array (not without a lot of work) and that's also a good reason to use pointers.
Notice your example struct is really small, and on some systems (64-bit Windows) it's the same size as a pointer anyway. So in the space where the pointer goes, you might as well just put the struct! Not all structs are that small of course.