I googled and searched here quite a bit without finding a fitting solution. The title may be a bit odd or not fully accurate, so let me explain: my IoT device collects a batch of data every second that I can represent as a list of integers. Here is an example of one row of sensor readings (the zeros are not always 0, by the way):
230982 0 4294753011 -9 4294198951 -1 4294225518 0 0 0 524789 0 934585 0 4 0 0 0 0
On a trigger I want to send the whole table (all rows collected so far) to my computer. I could just stringify it and concatenate everything, but I wonder whether there is a more efficient encoding/compression to reduce the byte count, both for storage in RAM/flash and for a smaller transfer volume. Ideally this could be achieved with built-in functions, i.e. no external compression libraries. I am not that strong with encoding/compression, so I hope you can give me a hint.
CodePudding user response:
The simplest solution is to dump the data out in binary form. It may be smaller or bigger than the string form depending on your data, but you don't have to do any data processing on the device.
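To make the binary dump concrete, here is a minimal sketch. It assumes every value fits in a signed 64-bit integer (the sample row mixes values above 2^31 with negative ones, so 32 bits signed would not be enough). The names put_i64_le, get_i64_le and dump_row are made up for this illustration, and the byte order is fixed to little-endian so the receiver does not depend on the device's endianness.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write one value as 8 bytes, least significant first. */
void put_i64_le(uint8_t *out, int64_t v)
{
    uint64_t u = (uint64_t)v;              /* two's-complement bit pattern */
    for (int i = 0; i < 8; i++)
        out[i] = (uint8_t)(u >> (8 * i));
}

/* Hypothetical helper: read back a value written by put_i64_le().
   The final cast assumes a two's-complement platform. */
int64_t get_i64_le(const uint8_t *in)
{
    uint64_t u = 0;
    for (int i = 0; i < 8; i++)
        u |= (uint64_t)in[i] << (8 * i);
    return (int64_t)u;
}

/* Serialize a row of n values; out must hold 8 * n bytes.
   Returns the number of bytes written. */
size_t dump_row(const int64_t *row, size_t n, uint8_t *out)
{
    assert(row != NULL && out != NULL);
    for (size_t i = 0; i < n; i++)
        put_i64_le(out + 8 * i, row[i]);
    return 8 * n;
}
```

At 8 bytes per value a 19-value row takes 152 bytes, which is more than the roughly 80 characters of the text form. That is the "may be bigger" case; a narrower integer type would shrink it if your value range allows one.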
If most of your values are small, you can use a variable-length encoding for serialization. There are several, but CBOR is fairly simple.
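As a rough illustration of how CBOR stores integers, the sketch below encodes signed 64-bit values using CBOR major type 0 (unsigned) and major type 1 (negative, stored as -1 - n). It only emits bare integers, with no array header or other framing, so treat it as a starting point rather than a complete CBOR writer. The function names cbor_put_head, cbor_put_int and cbor_put_row are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: emit a CBOR head, i.e. the major type in the top
   3 bits plus an unsigned argument. Returns bytes written (1, 2, 3, 5 or 9). */
size_t cbor_put_head(uint8_t *out, uint8_t major, uint64_t arg)
{
    uint8_t mt = (uint8_t)(major << 5);
    size_t extra;

    if (arg < 24) {                       /* argument fits in the head byte */
        out[0] = (uint8_t)(mt | arg);
        return 1;
    } else if (arg <= 0xFF) {
        out[0] = (uint8_t)(mt | 24); extra = 1;
    } else if (arg <= 0xFFFF) {
        out[0] = (uint8_t)(mt | 25); extra = 2;
    } else if (arg <= 0xFFFFFFFF) {
        out[0] = (uint8_t)(mt | 26); extra = 4;
    } else {
        out[0] = (uint8_t)(mt | 27); extra = 8;
    }
    /* The argument follows in big-endian (network) byte order. */
    for (size_t i = 0; i < extra; i++)
        out[1 + i] = (uint8_t)(arg >> (8 * (extra - 1 - i)));
    return 1 + extra;
}

/* Encode one signed integer; out must hold up to 9 bytes. */
size_t cbor_put_int(uint8_t *out, int64_t v)
{
    assert(out != NULL);
    if (v >= 0)
        return cbor_put_head(out, 0, (uint64_t)v);
    return cbor_put_head(out, 1, (uint64_t)(-1 - v));
}

/* Encode a row of n integers back to back; out must hold 9 * n bytes. */
size_t cbor_put_row(const int64_t *row, size_t n, uint8_t *out)
{
    size_t len = 0;
    for (size_t i = 0; i < n; i++)
        len += cbor_put_int(out + len, row[i]);
    return len;
}
```

With this scheme the sample row from the question comes to 43 bytes: each zero or small value takes one byte, and each of the larger values takes five.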
If your data changes only very little, you could send only the first row as absolute values, and the remaining rows as deltas from the previous row. This would result in many small numbers, which the previously mentioned encodings typically store more efficiently.
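The delta idea could look something like the sketch below. It assumes the table is stored row-major as signed 64-bit integers and that the difference between consecutive rows never overflows. The names delta_encode and delta_decode are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace rows 1..nrows-1 with their difference from the previous row.
   table is row-major, nrows x ncols; row 0 keeps its absolute values. */
void delta_encode(int64_t *table, size_t nrows, size_t ncols)
{
    assert(table != NULL);
    /* Walk backwards so the previous row is still absolute when subtracted. */
    for (size_t r = nrows; r-- > 1; )
        for (size_t c = 0; c < ncols; c++)
            table[r * ncols + c] -= table[(r - 1) * ncols + c];
}

/* Undo delta_encode(): rebuild absolute values from the top row down. */
void delta_decode(int64_t *table, size_t nrows, size_t ncols)
{
    assert(table != NULL);
    for (size_t r = 1; r < nrows; r++)
        for (size_t c = 0; c < ncols; c++)
            table[r * ncols + c] += table[(r - 1) * ncols + c];
}
```

On the device you would run the encode step just before serializing, and the receiver runs the decode step after parsing. Both work in place, so no second copy of the table is needed in RAM.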
I wouldn't try to implement a general-purpose compression algorithm without prior experience or external libraries unless you absolutely need it. Finding a suitable algorithm that compresses your data well enough with reasonable resource usage can be time-consuming.
CodePudding user response:
Libraries such as zlib or Zstd are better suited for general-purpose compression. Assuming you don't want to use any third-party libraries, here is a hand-coded version of a naive compression method that saves half of the bytes of the input string.
The basic idea is very simple. Your strings will contain at most 16 different characters, which can be mapped to 4 bits each rather than the typical 8 bits. SEE THE ASSUMPTIONS BELOW. You can try base16, base64, or base128 encodings too, but this is the simplest.
Assumptions:
- First you'll convert all your numbers into a string in decimal format.
- The string won't contain any characters other than the digits 0 to 9, '-', '.', a space, and a comma.
============================================================================
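#include <stdio.h>
#include <string.h>
#include <stdlib.h>

/* Map an allowed character to a 4-bit code (1..15); 0 is kept for '\0'.
   ',' '-' '.' and '0'..'9' map to (c - '*'); space takes the slot of '/'. */
static inline char map(char c)
{
    switch (c) {
    case ' ' : return ('/' - '*');
    case '\0': return 0;
    default  : return c - '*';
    }
}

/* Inverse of map(): turn a 4-bit code back into its character. */
static inline char revmap(char c)
{
    switch (c) {
    case '\0'     : return 0;
    case '/' - '*': return ' ';
    default       : return c + '*';
    }
}

/* Pack two characters into each output byte. The caller frees the result. */
char *compress(const char *s, int len)
{
    int i, j;
    char *compr = malloc((len + 1) / 2 + 1);
    if (compr == NULL)
        return NULL;
    j = 0;
    for (i = 1; i < len; i += 2)
        compr[j++] = map(s[i-1]) << 4 | map(s[i]);
    if (i - 1 < len)                    /* odd length: last char, high nibble */
        compr[j++] = map(s[i-1]) << 4;
    compr[j] = '\0';
    return compr;
}

/* Unpack each byte into two characters. The caller frees the result. */
char *decompress(const char *s, int len)
{
    int i, j;
    char *decompr = malloc(2 * len + 1);
    if (decompr == NULL)
        return NULL;
    for (i = j = 0; i < len; i++) {
        decompr[j++] = revmap((s[i] & 0xf0) >> 4);
        decompr[j++] = revmap(s[i] & 0xf);
    }
    decompr[j] = '\0';
    return decompr;
}

int main(void)
{
    const char *input = "230982 0 4294753011 -9 4294198951 -1 4294225518 0 0 0 524789 0 934585 0 4 0 0 0 0 ";
    int plen = (int)strlen(input);
    printf("plain(len=%d): %s\n", plen, input);
    char *compr = compress(input, plen);
    if (compr == NULL)
        return 1;
    /* strlen() works here because no packed byte is 0 for allowed input. */
    int clen = (int)strlen(compr);
    char *decompr = decompress(compr, clen);
    if (decompr == NULL) {
        free(compr);
        return 1;
    }
    int dlen = (int)strlen(decompr);
    printf("decompressed(len=%d): %s\n", dlen, decompr);
    free(compr);
    free(decompr);
    return 0;
}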