How can I convert hex encoded string to string in C efficiently

Time:03-22

I need to convert a hex-encoded string like this:

char hstr[9] = "61626364"; // characters abcd\0

Into

"abcd" // characters as hex: 0x61 0x62 0x63 0x64
       // hex "digits" a-f are always lowercase

So far I have written this function:

#include <stdlib.h>
#include <string.h>

void htostr(char* hexstr, char* str) {
    int len = strlen(hexstr);

    for (int i = 0; i < len/2; i++) // edit: fixed bounds
    {
        char input[3] = { hexstr[2 * i], hexstr[2 * i + 1], 0 };
        *(str + i) = (char)strtol(input, NULL, 16);
    }
}

I'm using the strtol function to do the job.

I feel I'm wasting 3 bytes of memory on the input array, plus some processor time copying two bytes and appending the terminating 0, because strtol has no "length" parameter.

The code is supposed to run on a pretty busy microcontroller, and the strings are quite long (it would be a good idea to free the memory used by hexstr as soon as possible).

The question is: is there a more efficient way to do this without writing my own converter from scratch?

By "from scratch" I mean a low-level conversion that does not use standard library functions.

CodePudding user response:

If you are allowed to temporarily modify the input string:

void htostr_1(char* hexstr, char* str) {
    int len = strlen(hexstr);

    for (int i = 0; 2 * i + 2 <= len; i++)
    {
        char tmp = hexstr[2 * i + 2];
        hexstr[2 * i + 2] = 0;
        str[i] = (char)strtol(hexstr + 2 * i, NULL, 16);
        hexstr[2 * i + 2] = tmp;
    }
}

This saves the byte that follows the current pair, terminates the string there for strtol, and restores the byte afterwards: https://godbolt.org/z/zdMdKrY7n

As a side note: the end condition of the for loop is wrong, so you access out of bounds: https://godbolt.org/z/ra87cWocY

If you also want to drop the int len and the unnecessary strlen call:

void htostr_2(char* hexstr, char* str) {
    while (*hexstr)
    {
        char tmp = hexstr[2];
        hexstr[2] = 0;
        *str++ = (char)strtol(hexstr, NULL, 16);
        hexstr[2] = tmp;
        hexstr += 2;
    }
}

CodePudding user response:

If you really want to trim it down:

void htostr(char* hexstr, char* str) {
    int i = 0;

    while (hexstr[2*i]) {
        str[i] = 0;
        for (int j=0; j<2; j++) {
            str[i] <<= 4;
            char c = hexstr[2*i+j];
            if (c >= '0' && c <= '9')  {
                str[i] |= c - '0';
            } else if (c >= 'A' && c <= 'F')  {
                str[i] |= c - 'A' + 10;
            } else if (c >= 'a' && c <= 'f')  {
                str[i] |= c - 'a' + 10;
            }
        }
        i++;
    }
}

CodePudding user response:

Instead of copying two characters and using strtol, you could create a function that converts the characters 0 .. 9 and A .. F to an int (0x0 to 0xF).

#include <ctype.h>

int toval(char ch) {
    if (isdigit((unsigned char)ch)) return ch - '0';
    // 'A' .. 'F' (or 'a' .. 'f') map to 0xA .. 0xF
    return toupper((unsigned char)ch) - 'A' + 0xA;
}

Then looping over the string and adding up the result is pretty straightforward:

void htostr(char *wr, const char *rd) {
    for (; rd[0] != '\0' && rd[1] != '\0'; rd += 2, ++wr) {
        // multiply the first with 0x10 and add the value of the second
        *wr = toval(rd[0]) * 0x10 + toval(rd[1]);
    }
    *wr = '\0'; // null terminate
}

Example usage:

#include <stdio.h>

int main() {
    char hstr[] = "61626364";
    char res[1 + sizeof hstr / 2];

    htostr(res, hstr);

    printf(">%s<\n", res);
}
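
The question mentions wanting to release the memory held by hexstr as soon as possible. One option, offered here as a sketch rather than as part of the answer above, is to decode in place: each output byte consumes two input bytes, so the write position never overtakes the read position and no second buffer is needed. The names hex_nibble and hex_decode_inplace are hypothetical, and the letter arithmetic assumes an ASCII-compatible character set.

```c
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
   Assumes 'a'..'f' and 'A'..'F' are contiguous, as in ASCII. */
static int hex_nibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical in-place decoder: overwrites buf with the decoded bytes,
   null-terminates it, and returns the number of bytes decoded.
   Decoding stops at the first incomplete or invalid digit pair. */
size_t hex_decode_inplace(char *buf) {
    size_t n = 0;
    for (;;) {
        int hi = hex_nibble(buf[2 * n]);
        if (hi < 0) break;              /* end of string or invalid digit */
        int lo = hex_nibble(buf[2 * n + 1]);
        if (lo < 0) break;              /* odd length or invalid digit */
        buf[n] = (char)(hi * 16 + lo);  /* index n <= 2n: already consumed */
        n++;
    }
    buf[n] = '\0';
    return n;
}
```

After the call, only the first n + 1 bytes of the buffer are still in use, so the remainder can be reused or the allocation shrunk, depending on how the buffer was obtained.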

CodePudding user response:

There are many ways to do this, and efficiency depends on typical string length, frequency of use, allowable memory footprint, etc.

Below is one that does the job fairly quickly.

Loop through pairs of hex digits and compute the character code via table look-up.

#include <ctype.h>

static const unsigned char val[] = { //
    ['0'] = 0, ['1'] = 1, ['2'] = 2, ['3'] = 3, ['4'] = 4, //
    ['5'] = 5, ['6'] = 6, ['7'] = 7, ['8'] = 8, ['9'] = 9, //
    ['A'] = 10, ['B'] = 11, ['C'] = 12, ['D'] = 13, ['E'] = 14, ['F'] = 15, //
    ['a'] = 10, ['b'] = 11, ['c'] = 12, ['d'] = 13, ['e'] = 14, ['f'] = 15, //
};

void htostr_alt(const char* hexstr, char* str) {
  // Best to use is...() functions with unsigned char data
  const unsigned char *uhexstr = (const unsigned char *) hexstr;

  while (isxdigit(uhexstr[0]) && isxdigit(uhexstr[1])) {
    *str++ = (char) (val[uhexstr[0]]*16u + val[uhexstr[1]]);
    uhexstr += 2;
  }
  *str = '\0';

  // Consider returning something useful, like where did input stop.
  // return (char *) uhexstr;
}

To avoid implementation defined behavior when assigning the character:

void htostr_alt2(const char* hexstr, char* str) {
  const unsigned char *uhexstr = (const unsigned char *) hexstr;
  unsigned char *ustr = (unsigned char *) str;

  while (isxdigit(uhexstr[0]) && isxdigit(uhexstr[1])) {
    *ustr++ = (unsigned char) (val[uhexstr[0]]*16u + val[uhexstr[1]]);
    uhexstr += 2;
  }
  *ustr = '\0';
}

The code works even when the string is longer than INT_MAX, accepts a const input string, stops at the first pair that is not two hex digits, and makes only one pass through the source string.
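
The commented-out return in htostr_alt hints at reporting where decoding stopped. A minimal sketch of that idea follows. The names htostr_end and hexdigit_value are hypothetical, and the digit arithmetic (used here in place of the table so the block compiles on its own) assumes an ASCII-compatible character set.

```c
#include <ctype.h>

/* Hypothetical helper: value of a character already known to be a hex digit.
   Assumes 'a'..'f' are contiguous, as in ASCII. */
static unsigned hexdigit_value(unsigned char c) {
    return isdigit(c) ? (unsigned)(c - '0')
                      : (unsigned)(tolower(c) - 'a' + 10);
}

/* Hypothetical variant: decodes pairs of hex digits into str and returns a
   pointer to the first input character that was not consumed. */
const char *htostr_end(const char *hexstr, char *str) {
    const unsigned char *u = (const unsigned char *)hexstr;
    unsigned char *out = (unsigned char *)str;

    while (isxdigit(u[0]) && isxdigit(u[1])) {
        *out++ = (unsigned char)(hexdigit_value(u[0]) * 16u + hexdigit_value(u[1]));
        u += 2;
    }
    *out = '\0';
    return (const char *)u;
}
```

A caller can then check whether the returned pointer refers to '\0': if it does, the whole input was decoded; otherwise it marks the position of the first bad or unpaired character.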

If you do not like the function isxdigit(), it is easy enough to code a lookup table such as unsigned char my_isxdigit[256].
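
One way such a table might look, using the same designated-initializer style as val[] (C99 or later). This is only a sketch: the helper is_hex_pair is a hypothetical name, and the table size assumes 8-bit bytes so that every unsigned char value is a valid index.

```c
/* Hypothetical replacement for isxdigit(): 1 for '0'-'9', 'A'-'F', 'a'-'f',
   0 for everything else (unlisted entries are zero-initialized). */
static const unsigned char my_isxdigit[256] = {
    ['0'] = 1, ['1'] = 1, ['2'] = 1, ['3'] = 1, ['4'] = 1,
    ['5'] = 1, ['6'] = 1, ['7'] = 1, ['8'] = 1, ['9'] = 1,
    ['A'] = 1, ['B'] = 1, ['C'] = 1, ['D'] = 1, ['E'] = 1, ['F'] = 1,
    ['a'] = 1, ['b'] = 1, ['c'] = 1, ['d'] = 1, ['e'] = 1, ['f'] = 1,
};

/* Hypothetical helper: nonzero if p points at two hex digits.
   The && short-circuits, so p[1] is not read when p[0] is the terminator. */
static int is_hex_pair(const unsigned char *p) {
    return my_isxdigit[p[0]] && my_isxdigit[p[1]];
}
```

The loop condition in htostr_alt could then call is_hex_pair(uhexstr) instead of the two isxdigit() calls, which also removes any dependence on the current locale.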
