Creating a frequency table for a text file with alphabets?

Time:11-20

I'm looking to create a frequency table, as a C array, for a text file containing repeated letters. For example, if I had a small file named "text.txt" with the contents "AAAABBBB\nCCCCEE" and wanted to parse this file into an unsigned integer array of 256 entries, where the index into the array is the integer value of the source alphabet symbol, how would I go about doing it? I'm not sure how to represent the alphabet in the array as well as its frequency in the same array.

CodePudding user response:

"I'm not sure how to represent the alphabet in the array as well as its frequency in the same array."

You don't have to represent the alphabet in the array. You only have to represent the frequency counters themselves. This is because the alphabet itself is implied by the position of the frequency counters in the array. For example, if you define the array like this

unsigned char counters[256] = {0};

then you could make counters['A'] represent the frequency of the upper-case letter 'A', and counters['Z'] represent the frequency of the upper-case letter 'Z'. Assuming that you are using ASCII, counters['A'] is equivalent to counters[65] and counters['Z'] is equivalent to counters[90].

You could then just go through the input file character by character by calling fgetc in a loop, and for every character that you encounter, you increment the corresponding counter.

Note that if you only assign an unsigned char (a single byte) to each counter, then the counter can only represent numbers up to 255 (assuming 8-bit bytes), and incrementing it beyond that wraps it back around to 0. Therefore, you may want to consider using an unsigned int instead.
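To illustrate that limit, here is a small sketch that increments a one-byte counter and a wider counter the same number of times. The helper names bump_narrow and bump_wide are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: increments a one-byte counter n times. */
unsigned char bump_narrow(unsigned int n)
{
    unsigned char counter = 0;

    for (unsigned int i = 0; i < n; i++)
        counter++;  /* wraps back to 0 after reaching UCHAR_MAX */

    return counter;
}

/* Hypothetical helper: the same loop with an unsigned int counter. */
unsigned int bump_wide(unsigned int n)
{
    unsigned int counter = 0;

    for (unsigned int i = 0; i < n; i++)
        counter++;  /* has far more headroom before it would wrap */

    return counter;
}
```

On a platform with 8-bit bytes, UCHAR_MAX is 255, so a symbol that occurs 256 times would leave the narrow counter at 0 again, whereas the wide counter would hold 256.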

In accordance with the community guidelines on homework questions, I will not provide a full code solution at this time. However, I can add such a solution later, if necessary.
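For reference, the counting loop described above could be sketched roughly as follows. This is only a minimal sketch, not a complete program: the names NUM_SYMBOLS, count_bytes and print_counts are my own hypothetical choices, and I am assuming 8-bit bytes, so that 256 counters cover every possible value.

```c
#include <assert.h>
#include <stdio.h>

/* Assumption: 8-bit bytes, so one counter per possible byte value. */
#define NUM_SYMBOLS 256

/* Hypothetical helper: tallies every byte read from fp into counters.
 * The caller is expected to have zero-initialized the array. */
void count_bytes(FILE *fp, unsigned int counters[NUM_SYMBOLS])
{
    int c;  /* int rather than char, so that EOF stays distinguishable */

    assert(fp != NULL && counters != NULL);

    /* fgetc returns each byte as an unsigned char converted to int,
     * so c is a valid non-negative index whenever it is not EOF. */
    while ((c = fgetc(fp)) != EOF)
        counters[c]++;
}

/* Hypothetical helper: prints the code and count of every symbol seen. */
void print_counts(const unsigned int counters[NUM_SYMBOLS])
{
    for (int i = 0; i < NUM_SYMBOLS; i++) {
        if (counters[i] > 0)
            printf("%3d: %u\n", i, counters[i]);
    }
}
```

To use it, you would open "text.txt" with fopen, declare the array as unsigned int counters[NUM_SYMBOLS] = {0}; pass both to count_bytes, and then close the file with fclose. For the contents "AAAABBBB\nCCCCEE", counters['A'], counters['B'] and counters['C'] should each end up as 4, counters['E'] as 2, and counters['\n'] as 1.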
