Home > Software engineering >  compress or size optimization for large sparse array in C
compress or size optimization for large sparse array in C

Time:04-20

/* The following codes are compiled into library(test.a) */
typedef struct {
    short x[5];
} weight_t;

typedef struct {
    weight_t wgt;
    char *ptr;
} tbl_t;

/* huge static array and 70% of entries are empty/zero */
static tbl_t tbl[] ={
{ { 0x0102, 0x0102, 0x0102, 0x0102, 0x0102, }, 0 }, /* 0000 */
{ { 0x0103, 0x0103, 0x0103, 0x0103, 0x0103, }, 0 }, /* 0001 */
{ { 0x0104, 0x0104, 0x0104, 0x0104, 0x0104, }, 0 }, /* 0002 */
{ { 0x0105, 0x0105, 0x0105, 0x0105, 0x0105, }, 0 }, /* 0003 */

{ { 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, }, 0 }, /* 3134c */
{ { 0x0000, 0x0000, 0x0000, 0x0000, 0x0000, }, 0 }, /* 3134d */
......
......
.....

}; /* more than one million entries */

Since array1 contains more than one million entries(even though 70% of them are empty/zero entries), the size of library file(test.a) is quite big.

I know using hash table to only include non-empty entries can reduce the size of array1 but is there any way to compress or do size optimization for library file(test.a) without changing array1?


There are tons of places in other libraries accessing the value of x in struct weight_t, there could be huge effort involved to change current static array tbl implementation.

    /* tbl_t obj ==> tbl_t_obj */
    val = tbl_t_obj.wgt.x[1];

CodePudding user response:

any way to compress or do size optimization of binary file without changing current implementation

To reduce the size of the executable, follow @user3386109 idea: Load the table via an external file. Although the executable file is smaller, the total size of executable plus external file is not reduced.

Code could compress that external file and then uncompressed it as part of the tbl[] assignment early in the run of the executable. Depending on compression time, that may make the external file 10x smaller.

CodePudding user response:

Since array tbl[] contains 75% of empty/zero entries

First, introduce getter and setter. Store the index and the value in a dynamically allocated array. Allocate only when really you need to allocate. For the rest, just return zeros.

// stores index and the table
// sorted on indexes, so we can use bsearch
struct maybetbl_s {
   uint32_t index;
   tbl_t tbl;
};

// dynamic memory
static struct maybetbl_s *tblidx = NULL;
static uint32_t tblidxcnt = 0;

// all zeros
const tbl_t zeros = {0};

// check if we have index idx
// if we have, return a pointer to it
tbl_t *_tbl_has(size_t idx) {
   return bsearch(&idx, tblidx, ...);
}

// memmove   realloc to remove an element     
void _tbl_remove(tbl_t *ret) {
    tbl_t *end = tblidx   tblidxcnt;
    if (ret   1 != end) {
        memmove(ret, ret   1, end - ret - 1); // or smth like that
    }
    tblidx = realloc(tblidx, tblidxcnt * sizeof(*tblidx));
    tblidxcnt -= 1;
}

int _tbl_allocate(size_t idx) {
   void *pnt = realloc(tblidx, tblidxcnt * sizeof(*tblidx));
   if (!pnt) return -ENOMEM;
   tblidx = pnt;
   tblidx[tblidxcnt].index = idx;
   tblidxcnt  = 1;
   qsort(tblidx, ...); // so we can use bsearch
   return 0;
}

// public api --------------------------------

// if idx exists, return the value
// if it doesn't, return zeros
tbl_t tbl_get(size_t idx) {
   tbl_t *ret = _tbl_has(idx);
   return ret ? ret : zeros;
}    
 
// set specific idx to the value
// note - can allocate memory, returns 0 on success
int tbl_set(size_t idx, tbl_t value) {
   tbl_t *ret = _tbl_has(idx);
   // if we are storing zeros, just clear the element
   if (memcpy(value, zeros, sizeof(value)) {
       if (ret) {
          _tbl_remove(ret);
       }
       return 0;
   }
   // allocate the element for index if not exists
   if (!ret) {
       int err = _tbl_allocate(idx);
       if (err) err;
       ret = _tbl_has(idx);
       assert(ret);
   }
   // store the value and return success
   *ret = value;
   return 0;
}

CodePudding user response:

TL:DR: store the non-zero entries plus basically run-length-encoding of the zeros, in a zstd / lz4 / gzip format, and initialize a zeroed array from that.

As you mentioned here and in How to compress or do size optimization for large static sparse array in C (basically a duplicate of this where you forgot to link this existing question), you seem to be aiming primarily at reducing file size of the executable, not RAM usage after your code is loaded. (The latter is impossible without an ABI change that means recompiling all code that touches this array. Which is certainly something you should consider, e.g. to use Lundin's answer on your other question: reduce padding in your arrays if char* is wider than short).

Your only option to reduce that is to not statically-initialize this array with non-zero values. The compiler can't optimize this for you. (Or at least, not that I'm aware; I guess in theory an embedded implementation that has to copy nonzero .data initializers from flash to RAM anyway could literally compress that data with lz4 or zstd. IDK if any toolchains do that; it would be a neat feature.)


To do this yourself, zero-init the actual array (static tbl_t tbl[];) so it goes in the BSS and doesn't take space in your .a and call an init function at startup to loop through a more compact representation of your data. Either from a file or from a static const array.

You'd omit the char* member that's initially always(?) zero, but with a with a uint8_t or unsigned short distance_to_next_nonzero member. This is a similar idea to run-length encoding. If you have a run of zeros too long for that member, just have a record with all-zero weights and a new length.

Or it could even be a separate binary file, so it's easier to avoid having that initializer memory allocated to this for the lifetime of your application1.

Perhaps generate the file from compiling source like this in a program that uses fopen / fwrite to dump it. That would make it easy to maintain even if you have that initializer data compressed using zstd2 as part of your build process. Although you can always just hexdump any binary file back into a C const char initddata_zstd[] = {...}; array initializer.

I'm assuming the char* member is always NULL, otherwise getting it initialized to point at constant strings is more of a challenge.

typedef struct {
    short x[5];
    unsigned short num_zeros;
    // or  uint_least8_t num_zeros;  if you use __attribute__((packed)) for the init stuff
} weight_initializer;

Footnote 1:
Freeing init data: kernels like Linux put their init data in a custom linker section so it's contiguous, and at some point at the end of boot put that region of memory on the free list to be reused. In user-space under Linux you could do something like that by calling munmap on pages that consist solely of initializer array. (Use alignas(4096) static const unsigned char initdata[] = {...}; so it starts at the beginning of a page.)

Footnote 2:
The non-zero initializer data in your example looks very compressible. zstd or lz4 are (AFAIK) the go-to choices for quick decompression, with zstd allowing good compression ratios if you spend enough time compressing, which I think doesn't hurt decompression speed. (You can limit max memory required for decompression, at the expense of compression efficiency. Although I think that only gets relevant for inputs larger than memory size.)

LZ4 is faster I think, but maybe not by much. Apparently there's also a Lizard (next-gen LZ5). Again, spending lots of time compressing can make a smaller compressed size, potentially leading to even faster decompression.

zstd is about an order of magnitude faster than gzip to compress or decompress, IIRC, and gets equal or better compression ratios at that speed. (With a wide range of speed vs. compression presets.) It makes zlib basically obsolete if you don't need format compatibility.

  • Related