Home > Blockchain >  What's a good alternative to a hashmap in C?
What's a good alternative to a hashmap in C?

Time:09-02

I'm trying to parse a file for certain characters (limited to readable ASCII). I want to then print out the characters and their count, in order of the count and then by order of the tokens. An obvious data structure for this would be a hashmap, like the one offered in C 's STL, and then dump the values into a vector to sort. But neither of those are readily available in C.

I was thinking of doing an array of Pairs (structure), indexed by the ASCII value, and then sorting the array. Is there a better, more optimal choice? Is there something easier to implement than that? I'm very new to C, so I'm not used to the bare bones of it.

CodePudding user response:

If you are talking about an occurrence of characters? you can use a basic array. (this is pure c ,ansi only but you can adjust as needed)

void countchar(const char * str, int &charcount[256])
{
    for (x = 0; x < strlen(str); c  )
    {
        if (isalpha(str[x])) // optional
          charcount[(int)str[x])]  ;
    }
}

Assumes passed array is initialized!

Edit 1: // your basic sort functions

void swap(int* f, int* l)
{
    int temp = *f;
    *f = *l;
    *l = temp;
}

void sortcount(int &charcount[256])
{
    int i, j, min_idx;
    for (i = 0; i < 256; i  ) {
        min_idx = i;
        for (j = i   1; j < 256; j  )
            if (charcount[j] < charcount[min_idx])
                min_idx = j;
        swap(&charcount[min_idx], &charcount[i]);
    }
}

CodePudding user response:

As @JB-MSI already answered, you just need an array for the counting. For the second part (sorting), you can use the C standard library function qsort() from stdlib.h.

In order to avoid global variables (the predicate function passed to qsort lacks a context parameter, as so many old functions in that old language), you need to pack the character and its count into a struct, then sort by count. So it looks a bit more complicated than it actually is, grace to the enlightenment of the 1965 C-standard library designers.

#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>

#define NCHARS 256

bool character_count(const char* str,
             size_t* frequencies,
             size_t nfrequencies) {
  if (NULL == frequencies)
    return false;
  if (nfrequencies < 256)
    return false;
  for (;*str != '\0'; str  ) {
    frequencies[*str]  ;
  }
  return true;
}

typedef struct CharCount_tag {
  char c;
  size_t count;
} CharCount_t;

int charcount_compare(const void* cc1, const void* cc2) {
  CharCount_t* c1 = (CharCount_t*)cc1;
  CharCount_t* c2 = (CharCount_t*)cc2;
  if (c1->count == c2->count)
    return 0;
  else if (c1->count > c2->count)
    return -1;
  else
    return 1;
}

int main (int argc, const char* argv[]) {
  if (argc > 1) {
    size_t frequencies[NCHARS];
    for (size_t i = 0; i < NCHARS; i  ) {
      frequencies[i] = 0;
    }
    if (character_count(argv[1], frequencies, NCHARS)) {
      CharCount_t cc[NCHARS];
      for (size_t i = 0; i < NCHARS; i  ) {
        cc[i].c = (char) i;
        cc[i].count = frequencies[i];
      }
      qsort(cc, NCHARS, sizeof(cc[0]), charcount_compare);
      for (size_t i = 0; i < NCHARS; i  ) {
        printf("%d (%c) %zu\n", cc[i].c, cc[i].c, cc[i].count);
      }
    }
  }
  return 0;
}
  • Related