File reading and words counting

Time:12-10

I want to read a file, then read a string of words or a sentence, and count how many times each of those words occurs in the file. I also need to list separately the words that do not occur.
Example Input:
filename.txt
Powerfull moon forest sky
Example Output:
Powerfull: 2
moon: 3
forest: 4
Not used: sky
I am kinda stuck here, and this is what I have so far:

string filename = Console.ReadLine();
        StreamReader stream = File.OpenText(filename);
       
        string input = Console.ReadLine();
        string[] source = filename.Split(new char[] { '.', '?', '!', ' ', ';', ':', ',' }, StringSplitOptions.RemoveEmptyEntries);
        var matchQuery = from word in source
                         where word.ToLowerInvariant() == input.ToLowerInvariant()
                         select word;
        int wordCount = matchQuery.Count();
        Console.WriteLine("{0} occurrences(s) of the search term \"{1}\" were found.", wordCount, input);

CodePudding user response:

There are several ways to do it. One of them is a LINQ group-by query, as Arshad commented. You can also use a dictionary to accumulate the counts while you read the file line by line. Here is an example, but you will have to adjust it to your requirements: https://stackoverflow.com/a/11967649/7226070
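
The dictionary approach could look roughly like the sketch below. It is only an illustration, not the code from the linked answer: the `WordFrequency` class and its `Count` method are hypothetical names, and the separator list is an assumption borrowed from the question (plus tab).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class WordFrequency
{
    // Hypothetical helper: counts words from any sequence of lines,
    // so it works with File.ReadLines(path) as well as an in-memory array.
    public static Dictionary<string, int> Count(IEnumerable<string> lines)
    {
        // Assumed separators: the ones used in the question, plus tab.
        char[] separators = { '.', '?', '!', ' ', ';', ':', ',', '\t' };

        // Case-insensitive keys, so "Moon" and "moon" share one counter.
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);

        foreach (string line in lines)
        {
            foreach (string word in line.Split(separators, StringSplitOptions.RemoveEmptyEntries))
            {
                // current stays 0 when the word has not been seen yet.
                counts.TryGetValue(word, out int current);
                counts[word] = current + 1;
            }
        }

        return counts;
    }
}
```

With this in place, `WordFrequency.Count(File.ReadLines(filename))` would build the counts without loading the whole file into memory, and each queried word can then be looked up with `TryGetValue`.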

CodePudding user response:

Instead of splitting on whitespace and punctuation (note that there are a lot of such characters), I suggest matching. If we define a word as

A word is a non-empty sequence of letters

we can use a simple regular expression pattern:

 \p{L}+

and then preprocess the file:

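 using System;
 using System.IO;
 using System.Linq;
 using System.Text.RegularExpressions;

 ...

 // A word is one or more Unicode letters
 Regex regex = new Regex(@"\p{L}+");

 // File.ReadLines streams the file line by line; words are grouped
 // case-insensitively and stored as a word -> count dictionary
 var freqs = File
   .ReadLines(filename)
   .SelectMany(line => regex
      .Matches(line)
      .Cast<Match>()
      .Select(match => match.Value))
   .GroupBy(word => word, StringComparer.OrdinalIgnoreCase)
   .ToDictionary(group => group.Key, group => group.Count(), StringComparer.OrdinalIgnoreCase);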

Now for the user query. Again, we match words and then look up their occurrences with the help of freqs:

  var result = regex
    .Matches(Console.ReadLine())
    .Cast<Match>()
    .Select(match => match.Value)
    .Distinct(StringComparer.OrdinalIgnoreCase)
    .Select(word => $"{(freqs.TryGetValue(word, out int count) ? count : 0)} occurrence(s) of {word} found");

  Console.Write(string.Join(Environment.NewLine, result));
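
If the output should follow the format from the question (one `word: count` line per word found, then a single `Not used:` line), the lookup step might be adapted as in the sketch below. `QueryReport` and `Build` are hypothetical names, and `freqs` is assumed to be a dictionary keyed with a case-insensitive comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

static class QueryReport
{
    // Same word definition as above: one or more Unicode letters.
    static readonly Regex WordPattern = new Regex(@"\p{L}+");

    // Hypothetical helper: returns the report lines for a query string.
    // freqs is assumed to use a case-insensitive key comparer.
    public static List<string> Build(string query, IReadOnlyDictionary<string, int> freqs)
    {
        var lines = new List<string>();
        var unused = new List<string>();

        var words = WordPattern.Matches(query)
            .Cast<Match>()
            .Select(match => match.Value)
            .Distinct(StringComparer.OrdinalIgnoreCase);

        foreach (string word in words)
        {
            if (freqs.TryGetValue(word, out int count) && count > 0)
                lines.Add($"{word}: {count}");   // word occurs in the file
            else
                unused.Add(word);                // collected for the last line
        }

        if (unused.Count > 0)
            lines.Add("Not used: " + string.Join(" ", unused));

        return lines;
    }
}
```

A call such as `Console.WriteLine(string.Join(Environment.NewLine, QueryReport.Build(Console.ReadLine(), freqs)));` would then print the report.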