Looking for a tool that takes a C structure as input and outputs a structure with minimal size.
For example, given an initial structure with only 3 members
struct Book {
char title[50];
char author[25];
int book_id;
};
there are 6 permutations
struct Book1 {
char title[50];
char author[25];
int book_id;
};
struct Book2 {
char title[50];
int book_id;
char author[25];
};
struct Book3 {
char author[25];
char title[50];
int book_id;
};
struct Book4 {
char author[25];
int book_id;
char title[50];
};
struct Book5 {
int book_id;
char author[25];
char title[50];
};
struct Book6 {
int book_id;
char title[50];
char author[25];
};
The output shows that 80 bytes is the minimal size.
Book1 = 80
Book2 = 84
Book3 = 80
Book4 = 84
Book5 = 80
Book6 = 80
Several projects I work on contain structures with 10 members (3628800 permutations) and are continually appended with new members by coders not familiar with structure packing complexities.
Question
Is it possible to have a tool to refactor the structure into the best minimal size?
CodePudding user response:
Assuming that the size of any member is multiplicity of its alignment requirements which are powers of 2 then the optimal layout can be found by placing the members with the strictest alignment first. There will be no internal padding between members. The total size of the struct would be a sum of its members rounded to alignment of the first member with the strictest alignment, which is a lower bound anyway.
CodePudding user response:
As long as your structure contain native types and there are no composite structures, then there is a very good heuristic (certainly optimal) to solve this problem: sort the fields by decreasing order based on their alignment constraint. The alignment constraint of a native type should be its size while for arrays it is the size of the item type (eg. 1 for a char array). I think this heuristic is also certainly optimal for composite structure having a power of two size or alternatively if all your sub-structures have a size that is a multiple of 8 on 64-bit platforms (except the last one that does not matter). For example, assuming int
are aligned on 4 bytes, then it will be put first, then the two arrays whatvever the number of items (both results in the same overall size).
For composite structures, I think the problem is a bit harder to solve. It is similar to what allocator algorithm uses to pack data in the heap (so to minimize the space overhead). Allocators have the same constraints: the allocated type must follow alignment constraints while minimizing the overall space though they often also have an additional constraint: be fast. A good example is the aligned_alloc
function. Many algorithm use a bucket strategy to solve this problem efficiently though the solution may not be optimal.
Note that compilers like GCC have extensions to pack data structures but they are not compliant to the C standard.