How to get the second most frequent word from the text?

Time:11-10

Here is my code:

string StringFromTheInput = TextBox1.Text;

string source = StringFromTheInput.ToString();

var frequencies = new Dictionary<string, int>();
frequencies.Add("item", 0);
string highestWord = null;

var message = string.Join(" ", source);
var splichar = new char[] { ' ', '.' };
var single = message.Split(splichar);
           
int highestFreq = 0;

foreach (var item in single)
{
    if (item.Length > 4)
    {
        int freq;
        frequencies.TryGetValue(item, out freq);
        freq += 1;

        if (freq> highestFreq)
        {
            highestFreq = freq;
            highestWord = item.Trim();
        }
                        
        frequencies[item] = freq;
        Label1.Text = highestWord.ToString();
    }
                
}

This successfully gets me the most frequent word from the text. I tried changing the assignment to highestFreq = freq + 1 to get the second most frequent word, but it doesn't work!

CodePudding user response:

Can you use LINQ, or is this homework?

using System;
using System.Linq;

string StringFromTheInput = "Her life in the confines of the house became her new normal. He wondered if she would appreciate his toenail collection. My secretary is the only person who truly understands my stamp-collecting obsession. This is the last random sentence I will be writing and I am going to stop mid-sent. She tilted her head back and let whip cream stream into her mouth while taking a bath.";

string[] words = StringFromTheInput.Split(" ");
var setsByFrequency = words
    .Where(x => x.Length > 4)   // For words with more than 4 characters
    .GroupBy(x => x.ToLower()) // ToLower so 'House' and 'house' both get placed into the same group
    .Select(g => new { Freq = g.Count(), Word = g.Key})  
    .OrderByDescending(g => g.Freq)
    .ToList();

var mostFrequent = setsByFrequency[0];
var secondMostFrequent = setsByFrequency[1];

Console.WriteLine(mostFrequent);
Console.WriteLine(secondMostFrequent);

CodePudding user response:

You can order your Dictionary by value in descending order, then take the first, second, ... element found.

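int indexYouNeed = 2;   // 1 = most frequent, 2 = second most frequent, ...
int indexRead = 1;
string textFound = "";
// Sort by descending count so the most frequent word comes first.
foreach (KeyValuePair<string, int> entry in frequencies.Where(x => x.Key.Length > 4).OrderByDescending(x => x.Value))
{
    if (indexRead == indexYouNeed)
    {
        textFound = entry.Key;
        break;
    }
    indexRead++;
}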

CodePudding user response:

I would do something like this (simple and easy to understand). First, load the dictionary with your words and their frequencies (only words with more than 4 characters, as in your question); in this case the dictionary must ignore case. Second, order the dictionary by frequency and take the second entry (check that it has at least two elements before trying to access it):

string StringFromTheInput = "";
var wordsFreq = new Dictionary<string,int>(StringComparer.OrdinalIgnoreCase);
foreach(var s in StringFromTheInput.Split(' ')){
    if(s.Length <=4) continue;
    if(wordsFreq.ContainsKey(s)){
        wordsFreq[s]++;
    }else{
        wordsFreq.Add(s,1);
    }
}
if(wordsFreq.Count < 2) return; // need at least two distinct words to have a second one
var secondFreqWord = wordsFreq.OrderByDescending(x => x.Value).ToList()[1]; 
