Home > front end >  What's the best way to get top nth values from Dictionary?
What's the best way to get top nth values from Dictionary?

Time:10-15

I am trying to get the top n-th values from a dictionary. The part that is making it not easy for me is that if there are multiple values of the same rank, I need to keep all of them. For example, if a dictionary looks like:

Dictionary<string, int> dict = new Dictionary<string, int>();
dict.Add("AAA", 91);
dict.Add("BBB", 97);
dict.Add("CCC", 98);
dict.Add("DDD", 92);
dict.Add("EEE", 97);
dict.Add("FFF", 100);

and if I want Top 3, I need to get

dict.Add("BBB", 97);
dict.Add("CCC", 98);
dict.Add("EEE", 97);
dict.Add("FFF", 100);

because BBB and EEE have the same rank. I first sorted the dictionary by the rank, and tried Take() but it only took one of the two.

/* Does not work */
var dictSorted = dict.OrderByDescending(x => x.Value).ToDictionary(x => x.Key, x => x.Value).Take(3);

/* The below only prints 
    FFF = 100
    CCC = 98
    BBB = 97

    but not  
    EEE = 97
*/
foreach(KeyValuePair kvp in dictSorted){
    Console.WriteLine(kvp.Key   " = "   kvp.Value);
}

Is there a good way to achieve this?

CodePudding user response:

You can use grouping with GroupBy method to include items with different Key but same Values into result:

var kvpGroups = dict.OrderByDescending(x => x.Value).GroupBy(x => x.Value).Take(3);

Here you taking NOT first 3 KeyValuePairs, you taking first 3 groups of KeyValuePairs. Each group may contain one KeyValuePair if its Value was unique in source collection or may contain several KeyValuePairs if some of them has same Values.

You can iterate groups with foreach in next way (kvp means KeyValuePair):

foreach (var kvpGroup in kvpGroups)
{
    foreach (var kvp in kvpGroup)
    {
        Console.WriteLine(kvp.Key   " = "   kvp.Value);
    }              
}

// Output:
// FFF = 100
// CCC = 98
// BBB = 97
// EEE = 97

Or LINQ version:

foreach (var kvp in from kvpGroup in (from kvp in dict
                                      orderby kvp.Value descending
                                      group kvp by kvp.Value into kvpGroup
                                      select kvpGroup).Take(3)
                    from kvp in kvpGroup
                    select kvp)
{
    Console.WriteLine(kvp.Key   " = "   kvp.Value);
}

CodePudding user response:

You haven't been entirely clear about the requirements, but I'm assuming you intend the common pattern where scoring competitors with no tie-breakers, where the top N qualify, and if there's a tie for Nth place then all those tied also qualify. So for example if you had [10, 10, 9, 9, 8, 8] then the top 3 would be [10, 10, 9, 9], because you have two 1st places, two 3rd places and two 5th places. Everyone at 3rd place and above qualifies. But if you have [10, 10, 10, 9, 9, 9] then only the 10s go through, because you have three 1st places and three 4th places - 4th place doesn't make the cut.

There may well be a more elegant way to do it, but the straightforward way is to take the first n, and then also add on all those tied with the last one you took.

using System;
using System.Collections.Generic;
using System.Linq;

public static class Extension
{
    public static IEnumerable<TSource> TopNWithTies<TSource, TKey>(this IEnumerable<TSource> sequence, Func<TSource, TKey> keySelector, int n) {
        var sequenceDescending = sequence.OrderByDescending(keySelector);
        var topN = sequenceDescending.Take(n);
        TKey cutoff = keySelector(topN.Last());
        var ties = sequenceDescending.Skip(n).TakeWhile(item => keySelector(item).Equals(cutoff));
        return topN.Concat(ties);
    }
}

public class Program
{
    public static void Main()
    {
        Dictionary<string, int> dict = new Dictionary<string, int>();
        dict.Add("AAA", 91);
        dict.Add("BBB", 97);
        dict.Add("CCC", 98);
        dict.Add("DDD", 92);
        dict.Add("EEE", 97);
        dict.Add("FFF", 100);
        
        var top3 = dict.TopNWithTies(kvp=>kvp.Value, 3);
        
        foreach(var kvp in top3){
            Console.WriteLine(kvp.Key   " = "   kvp.Value);
        }
    }
}

If instead you want to figure out the rank of every item, you might do something like this, which applies 0-indexed ranks to all the scores. Basically every score gets a rank that is equal to the count of other scores that exceed it.

using System;
using System.Collections.Generic;
using System.Linq;

public static class Extension
{
    public static IEnumerable<(TSource item, int rank)> RankDescending<TSource, TKey>(this IEnumerable<TSource> sequence, Func<TSource, TKey> keySelector) {
        int rank = 0;
        foreach (var group in sequence.OrderByDescending(keySelector).GroupBy(keySelector))
        {
            foreach (var item in group)
            {
                yield return (item, rank);
            }
            rank  = group.Count();
        }
    }
}

public class Program
{
    public static void Main()
    {
        Dictionary<string, int> dict = new Dictionary<string, int>();
        dict.Add("AAA", 91);
        dict.Add("BBB", 97);
        dict.Add("CCC", 98);
        dict.Add("DDD", 92);
        dict.Add("EEE", 97);
        dict.Add("FFF", 100);
        
        var top3 = dict.Rank(kvp=>kvp.Value).TakeWhile(item => item.rank<3);
        
        foreach(var (item, rank) in top3){
            Console.WriteLine(item.Key   " = "   item.Value   " at rank #"   rank);
        }
    }
}
  • Related