I'm trying to parse a file for certain characters (limited to readable ASCII). I want to then print out the characters and their count, in order of the count and then by order of the tokens. An obvious data structure for this would be a hashmap, like the one offered in C 's STL, and then dump the values into a vector to sort. But neither of those are readily available in C.
I was thinking of doing an array of Pairs (structure), indexed by the ASCII value, and then sorting the array. Is there a better, more optimal choice? Is there something easier to implement than that? I'm very new to C, so I'm not used to the bare bones of it.
CodePudding user response:
If you are talking about an occurrence of characters? you can use a basic array. (this is pure c ,ansi only but you can adjust as needed)
void countchar(const char * str, int &charcount[256])
{
for (x = 0; x < strlen(str); c )
{
if (isalpha(str[x])) // optional
charcount[(int)str[x])] ;
}
}
Assumes passed array is initialized!
Edit 1: // your basic sort functions
void swap(int* f, int* l)
{
int temp = *f;
*f = *l;
*l = temp;
}
void sortcount(int &charcount[256])
{
int i, j, min_idx;
for (i = 0; i < 256; i ) {
min_idx = i;
for (j = i 1; j < 256; j )
if (charcount[j] < charcount[min_idx])
min_idx = j;
swap(&charcount[min_idx], &charcount[i]);
}
}
CodePudding user response:
As @JB-MSI already answered, you just need an array for the counting.
For the second part (sorting), you can use the C standard library function qsort()
from stdlib.h
.
In order to avoid global variables (the predicate function passed to qsort
lacks a context parameter, as so many old functions in that old language),
you need to pack the character and its count into a struct, then sort by count. So it looks a bit more complicated than it actually is, grace to the enlightenment of the 1965 C-standard library designers.
#include <stdlib.h>
#include <stdio.h>
#include <stdbool.h>
#define NCHARS 256
bool character_count(const char* str,
size_t* frequencies,
size_t nfrequencies) {
if (NULL == frequencies)
return false;
if (nfrequencies < 256)
return false;
for (;*str != '\0'; str ) {
frequencies[*str] ;
}
return true;
}
typedef struct CharCount_tag {
char c;
size_t count;
} CharCount_t;
int charcount_compare(const void* cc1, const void* cc2) {
CharCount_t* c1 = (CharCount_t*)cc1;
CharCount_t* c2 = (CharCount_t*)cc2;
if (c1->count == c2->count)
return 0;
else if (c1->count > c2->count)
return -1;
else
return 1;
}
int main (int argc, const char* argv[]) {
if (argc > 1) {
size_t frequencies[NCHARS];
for (size_t i = 0; i < NCHARS; i ) {
frequencies[i] = 0;
}
if (character_count(argv[1], frequencies, NCHARS)) {
CharCount_t cc[NCHARS];
for (size_t i = 0; i < NCHARS; i ) {
cc[i].c = (char) i;
cc[i].count = frequencies[i];
}
qsort(cc, NCHARS, sizeof(cc[0]), charcount_compare);
for (size_t i = 0; i < NCHARS; i ) {
printf("%d (%c) %zu\n", cc[i].c, cc[i].c, cc[i].count);
}
}
}
return 0;
}