I want to get most frequent values using LINQ

Time:10-14

I am trying to get the most frequent values in an array using LINQ in C#.

For example,

int[] input = {1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8};

output = {1, 6}

int[] input = {1, 2, 2, 3, 3, 3, 5};

output = {3}

Please let me know how to build the LINQ query.

Please read carefully. This is a different problem from "Select most frequent value using LINQ".

I have to choose only the most frequent values. The code below is similar, but I can't use Take(5) because I don't know the number of results.

 int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
 IEnumerable<int> top5 = nums
            .GroupBy(i => i)
            .OrderByDescending(g => g.Count())
            .Take(5)
            .Select(g => g.Key);

This outputs {1, 2, 3, 4, 5}, but my expected output is {1, 2}.

Please read the question carefully before answering.

Thanks and regards.

CodePudding user response:

Just to add to the plethora of answers:

int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };

var result = input
   .GroupBy(i => i)
   .GroupBy(g => g.Count())
   .OrderByDescending(g => g.Key)
   .First()
   .Select(g => g.Key)
   .ToArray();

Console.WriteLine(string.Join(", ", result)); // Prints "1, 6" 

[EDIT]

In case anyone finds this interesting, I compared the performance of the above between .NET Framework 4.8 and .NET 5.0 as follows:

(1) Added a Comparer class to instrument the number of comparisons made:

class Comparer : IComparer<int>
{
    public int Compare(int x, int y)
    {
        Console.WriteLine($"Comparing {x} with {y}");
        return x.CompareTo(y);
    }
}

(2) Modified the call to OrderByDescending() to pass a Comparer:

.OrderByDescending(g => g.Key, new Comparer())

(3) Multi-targeted my test console app to "net48" and "net5.0".

After making those changes the output was as follows:

For .NET Framework 4.8:

Comparing 1 with 3
Comparing 1 with 1
Comparing 1 with 2
Comparing 3 with 3
Comparing 3 with 2
Comparing 3 with 3
1, 6

For .NET 5.0:

Comparing 3 with 1
Comparing 3 with 2
1, 6

As you can see, .NET 5.0 is better optimised: it makes 2 comparisons for this input where .NET Framework 4.8 makes 6.
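For anyone who wants to repeat the measurement, here is a minimal sketch that puts steps (1) and (2) together but counts the comparisons instead of printing each one. The names CountingComparer and Calls are made up for this sketch and are not part of the test described above. The count it prints depends on the runtime, so no expected number is shown for it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: counts how often the sort asks for a comparison.
class CountingComparer : IComparer<int>
{
    public int Calls { get; private set; }

    public int Compare(int x, int y)
    {
        Calls++;
        return x.CompareTo(y);
    }
}

static class Program
{
    static void Main()
    {
        int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };
        var comparer = new CountingComparer();

        var result = input
            .GroupBy(i => i)                          // one group per distinct value
            .GroupBy(g => g.Count())                  // bucket those groups by frequency
            .OrderByDescending(g => g.Key, comparer)  // highest frequency first
            .First()
            .Select(g => g.Key)
            .ToArray();

        Console.WriteLine(string.Join(", ", result)); // Prints "1, 6"
        Console.WriteLine($"Comparisons: {comparer.Calls}"); // varies by runtime
    }
}
```

A classic Main method is used rather than top-level statements so the same file should build for both target frameworks.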

CodePudding user response:

If you want to do it in pure LINQ in one query you can group groups by count and select the max one:

int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var tops = nums
     .GroupBy(i => i)
     .GroupBy(grouping => grouping.Count())
     .OrderByDescending(gr => gr.Key)
     .Take(1)
     .SelectMany(gr => gr.Select(g => g.Key))
     .ToList();

Note that this is neither the most efficient nor the clearest solution.

UPD

A slightly more efficient version that uses Aggregate to emulate MaxBy. Note that, unlike the previous one, it throws for empty collections:

var tops = nums
     .GroupBy(i => i)
     .GroupBy(grouping => grouping.Count())
     .Aggregate((max, curr) => curr.Key > max.Key ? curr : max)
     .Select(gr => gr.Key);

You can also use MaxBy from MoreLINQ or the one introduced in .NET 6.
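As a rough sketch of the .NET 6 variant (not tested against MoreLINQ, whose MaxBy may return a different type): Enumerable.MaxBy returns the bucket with the largest key, and for a reference type it returns null on an empty sequence, so the null-conditional operator covers the empty case. The fallback to an empty list is a choice made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] nums = { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };

var tops = nums
    .GroupBy(i => i)                     // one group per distinct value
    .GroupBy(g => g.Count())             // bucket those groups by frequency
    .MaxBy(byCount => byCount.Key)       // bucket with the highest frequency, or null if nums is empty
    ?.Select(g => g.Key)
    .ToList() ?? new List<int>();        // assumption: empty input yields an empty list

Console.WriteLine(string.Join(", ", tops)); // Prints "1, 2"
```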

CodePudding user response:

You can store the intermediate result in an IEnumerable of tuples, with the first item being the number and the second item being how often that number occurs in your input array. Then you find the highest count and take all the tuples whose second item equals that maximum.

int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
var intermediate = nums
            .GroupBy(i => i)
            .Select(g => (g.Key,g.Count()));
int amount = intermediate.Max(x => x.Item2);
IEnumerable<int> mostFrequent = intermediate
            .Where(x => x.Item2 == amount)
            .Select(x => x.Item1);

Online demo: https://dotnetfiddle.net/YCVGam
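One thing to be aware of: intermediate is a deferred query, so Max and Where each re-run the GroupBy. A possible variation, sketched below, materializes the counts once with ToList. The tuple element names Value and Count and the DefaultIfEmpty(0) guard for empty input are additions made for this sketch, not part of the answer above.

```csharp
using System;
using System.Linq;

int[] nums = { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };

// ToList() runs the grouping once and caches the (value, count) pairs.
var counts = nums
    .GroupBy(i => i)
    .Select(g => (Value: g.Key, Count: g.Count()))
    .ToList();

// DefaultIfEmpty(0) keeps Max from throwing when nums is empty.
int amount = counts.Select(x => x.Count).DefaultIfEmpty(0).Max();

var mostFrequent = counts
    .Where(x => x.Count == amount)
    .Select(x => x.Value);

Console.WriteLine(string.Join(", ", mostFrequent)); // Prints "1, 2"
```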

CodePudding user response:

Use a variable to capture the number of items in the first group, then use TakeWhile to get all the groups with that number of items.

void Main()
{
    var input = new[] { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };

    var output = input
        .GroupBy(i => i)
        .OrderByDescending(group => group.Count());
        
    var maxNumberOfItems = output.FirstOrDefault()?.Count() ?? 0;
        
    var finalOutput = output.TakeWhile(group => group.Count() == maxNumberOfItems).ToList();

    foreach (var item in finalOutput)
    {
        Console.WriteLine($"Value {item.Key} has {item.Count()} members");
    }
}

You can do this as a single query as well:

int? numberOfItems = null;
var finalOutput = input
    .GroupBy(i => i)
    .OrderByDescending(group => group.Count())
    .TakeWhile(i =>
    {
        var count = i.Count();
        numberOfItems ??= count;
        return count == numberOfItems;
    })
    .ToList();

CodePudding user response:

You could consider adding an extension-method. Something like

public static IEnumerable<T> TakeWhileEqual<T, T2>(this IEnumerable<T> collection, Func<T, T2> predicate)
    where T2 : IEquatable<T2>
{
    using var iter = collection.GetEnumerator();
    if (iter.MoveNext())
    {
        var first = predicate(iter.Current);
        yield return iter.Current;
        while (iter.MoveNext() && predicate(iter.Current).Equals(first))
        {
            yield return iter.Current;
        }
    }
}

This has the advantage of being efficient, not needing to iterate over the collection more than once. But it does require some more code, even if this can be hidden in an extension method.
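For illustration, here is a sketch of how the method might be called on the grouped and sorted input. The containing class name EnumerableExtensions is an assumption, since the snippet above does not show its static class; the method body is copied unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };

var mostFrequent = input
    .GroupBy(i => i)
    .OrderByDescending(g => g.Count())   // stable sort: ties keep their original order
    .TakeWhileEqual(g => g.Count())      // stop at the first group with a smaller count
    .Select(g => g.Key);

Console.WriteLine(string.Join(", ", mostFrequent)); // Prints "1, 6"

// Hypothetical container class; extension methods must live in a static class.
public static class EnumerableExtensions
{
    public static IEnumerable<T> TakeWhileEqual<T, T2>(this IEnumerable<T> collection, Func<T, T2> predicate)
        where T2 : IEquatable<T2>
    {
        using var iter = collection.GetEnumerator();
        if (iter.MoveNext())
        {
            // Remember the key of the first element and keep yielding while it matches.
            var first = predicate(iter.Current);
            yield return iter.Current;
            while (iter.MoveNext() && predicate(iter.Current).Equals(first))
            {
                yield return iter.Current;
            }
        }
    }
}
```

The sequence still has to be sorted by count first, so the single pass applies to the TakeWhileEqual step rather than to the whole query.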

CodePudding user response:

You can first group the input like this:

 int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };

 var tmpResult = from i in input
     group i by i into k
     select new
     {
          k.Key,
          count = k.Count()
     };

Then you can find the maximum count across the groups like this:

var max = tmpResult.Max(s => s.count);

After that, a simple filter is enough:

 int[] result = tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();

You can also create an extension method for this.

public static class Extension
{
    public static int[] GetMostFrequent(this int[] input)
    {
        var tmpResult = from i in input
                        group i by i into k
                        select new
                        {
                            k.Key,
                            count = k.Count()
                        };

        var max = tmpResult.Max(s => s.count);

        return tmpResult.Where(f => f.count == max).Select(s => s.Key).ToArray();
    }
}

CodePudding user response:

I think you probably want to use TakeWhile rather than Take:

    int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };
    var n = nums
            .GroupBy(i => i)
            .OrderByDescending(g => g.Count());

    var c = n.First().Count();

    var r = n.TakeWhile(g => g.Count() == c)
            .Select(g => g.Key);

If you want to do this in a single pass, without LINQ, you can use a pair of dictionaries and a counter to track a) how many times you have seen each value, b) which values you have seen that many times, and c) the highest count seen so far:

        int[] nums = new[] { 1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7 };

        int maxSeen = int.MinValue;
        var seenCounts = new Dictionary<int, int>();
        var sawThatMany = new Dictionary<int, List<int>>();

        foreach (var n in nums)
        {
            if (!seenCounts.TryGetValue(n, out var seenCount))
                seenCounts[n] = seenCount = 1;
            else
                seenCounts[n] = ++seenCount;

            if (!sawThatMany.TryGetValue(seenCount, out var tracker))
                sawThatMany[seenCount] = tracker = new();
            tracker.Add(n);

            maxSeen = seenCount > maxSeen ? seenCount : maxSeen;
        }

You'll end up with a List<int> in sawThatMany[maxSeen] that contains the numbers that appear most often.

CodePudding user response:

You were very close. Just add one more line to your code.

int[] input = { 1, 1, 1, 3, 5, 5, 6, 6, 6, 7, 8, 8 };

var counts = input
    .GroupBy(i => i)
    .Select(i => new { Number = i.Key, Count = i.Count() })
    .OrderByDescending(i => i.Count);
            
var maxCount = counts.First().Count;                
var result = counts
    .Where(i => i.Count == maxCount)
    .Select(i => i.Number);

Result:

{1, 6}