C# LINQ group collection by attribute then sort each group by explicit order defined by a list

Time:09-24

I have a collection of C# objects that I want to:

  1. group on a given attribute,
  2. sort each group by another attribute, in an order explicitly defined by a list, and
  3. take only the top item from each group.

For instance, consider:

data = [
  {"name": "Alice", "country": "UK", "age": 30},
  {"name": "Bob", "country": "KE", "age": 20},
  {"name": "Charlie", "country": "UK", "age": 30},
  {"name": "Alice", "country": "KE", "age": 40},
  {"name": "Bob", "country": "AU", "age": 50},
  {"name": "David", "country": "USA", "age": 25},
]

I want to group by name then sort each group by country to follow an explicit order specified in another collection such as:

var countryOrder = new List<string> { "UK", "KE", "AU", "USA"};

Then I want to select only the top member of each group based on this order, so that I end up with a list of objects unique on name (e.g. one Alice from the UK, KE, AU or USA) that follows the explicitly defined order of precedence.

Below is the expected output of the given example:

output = [
  {"name": "Alice", "country": "UK", "age": 30},
  {"name": "Bob", "country": "KE", "age": 20},
  {"name": "Charlie", "country": "UK", "age": 30},
  {"name": "David", "country": "USA", "age": 25},
]

I have tried using the code below:

using System;
using System.Collections.Generic;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        var countryOrder =new List<string> { "UK", "KE", "AU", "USA"};
        
        var data = new List<ExampleDataClass> {
            new ExampleDataClass { Name = "Alice", Country = "UK", Age = 30 },
            new ExampleDataClass { Name = "Bob", Country = "KE", Age = 20 },
            new ExampleDataClass { Name = "Charlie", Country = "UK", Age = 30 },
            new ExampleDataClass { Name = "Alice", Country = "KE", Age = 40 },
            new ExampleDataClass { Name = "Bob", Country = "AU", Age = 50 },
            new ExampleDataClass { Name = "David", Country = "USA", Age = 25 }
        };
        
        var reducedData = data
            .GroupBy(x => x.Name)
            .Select(g => g.OrderBy(item => countryOrder.IndexOf(item.Country)).Min())
            .ToList();
        
        reducedData.ForEach(item => Console.WriteLine(reducedData));
    }
}

class ExampleDataClass
{
    public string Name { get; set; }
    public string Country { get; set; }
    public int Age { get; set; }
}

.Select(g => g.OrderBy(item => countryOrder.IndexOf(item.Country)).Min()) gives a run-time exception:

Run-time exception (line 50): At least one object must implement IComparable.

Stack Trace:

[System.ArgumentException: At least one object must implement IComparable.]
   at System.Collections.Comparer.Compare(Object a, Object b)
   at System.Collections.Generic.ObjectComparer`1.Compare(T x, T y)
   at System.Linq.Enumerable.Min[TSource](IEnumerable`1 source)
   at Program.<>c__DisplayClassd.<Main>b__9(IGrouping`2 g) :line 50
   at System.Linq.Enumerable.WhereSelectEnumerableIterator`2.MoveNext()
   at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
   at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
   at Program.Main() :line 49

Answer:

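Here is a solution I have found. The only change is replacing Min() with First(): Min() compares the ExampleDataClass objects themselves, which fails because the class does not implement IComparable, whereas First() simply takes the first item of the already sorted group.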


using System;
using System.Collections.Generic;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        var countryOrder =new List<string> { "UK", "KE", "AU", "USA"};
        
        var data = new List<ExampleDataClass> {
            new ExampleDataClass {
                Name = "Alice",
                Country = "UK",
                Age = 30,
            },
            
            new ExampleDataClass {
                Name = "Bob",
                Country = "KE",
                Age = 20,
            },
            
            new ExampleDataClass {
                Name = "Charlie",
                Country = "UK",
                Age = 30,
            },
            
            new ExampleDataClass {
                Name = "Alice",
                Country = "KE",
                Age = 40,
            },
            
            new ExampleDataClass {
                Name = "Bob",
                Country = "AU",
                Age = 50,
            },
            
            new ExampleDataClass {
                Name = "David",
                Country = "USA",
                Age = 25,
            }
        };
        
        var reducedData = data.GroupBy(x => x.Name)
                             .Select(g => g.OrderBy(item => countryOrder.IndexOf(item.Country)).First())
                             .ToList();
        
        foreach (ExampleDataClass item in reducedData) 
        {
            Console.WriteLine(string.Format("Name: \"{0}\", Country: \"{1}\", Age: {2}", item.Name, item.Country, item.Age));
        }
    }
}

class ExampleDataClass
{
    public string Name { get; set; }
    
    public string Country { get; set; }
    
    public int Age { get; set; }
}

Output

Name: "Alice", Country: "UK", Age: 30
Name: "Bob", Country: "KE", Age: 20
Name: "Charlie", Country: "UK", Age: 30
Name: "David", Country: "USA", Age: 25
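
As a possible alternative, on .NET 6 or later the MinBy operator can pick the lowest-ranked item of each group directly, without sorting the whole group first. The following is only a sketch under that assumption; the Person record and the PickPreferred helper are hypothetical names introduced for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for ExampleDataClass.
public record Person(string Name, string Country, int Age);

public static class MinByExample
{
    // Hypothetical helper: keeps one person per name, preferring the country
    // that appears earliest in countryOrder. Requires .NET 6+ for MinBy.
    public static List<Person> PickPreferred(IEnumerable<Person> people, IList<string> countryOrder)
    {
        return people
            .GroupBy(p => p.Name)
            // MinBy compares the int keys, not the Person objects,
            // so no IComparable implementation is needed.
            // A group is never empty, hence the null-forgiving operator.
            .Select(g => g.MinBy(p => countryOrder.IndexOf(p.Country))!)
            .ToList();
    }

    public static void Main()
    {
        var countryOrder = new List<string> { "UK", "KE", "AU", "USA" };
        var data = new List<Person>
        {
            new Person("Alice", "UK", 30),
            new Person("Bob", "KE", 20),
            new Person("Charlie", "UK", 30),
            new Person("Alice", "KE", 40),
            new Person("Bob", "AU", 50),
            new Person("David", "USA", 25),
        };

        foreach (var p in PickPreferred(data, countryOrder))
        {
            Console.WriteLine($"Name: \"{p.Name}\", Country: \"{p.Country}\", Age: {p.Age}");
        }
    }
}
```

With the sample data from the question this should select the same four items as the OrderBy(...).First() version.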

Link to a dotnetfiddle: https://dotnetfiddle.net/6geHtk
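
One caveat with the IndexOf approach: List<T>.IndexOf returns -1 for a value that is not in the list, so a country missing from countryOrder would sort ahead of "UK". If that is not what you want, one option is to look up the rank in a dictionary and push unknown countries to the end. This is a minimal sketch; the Rank helper and the "Eve" rows are hypothetical and only there to illustrate the case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UnlistedCountryExample
{
    // Hypothetical helper: position of the country in the explicit order,
    // or int.MaxValue when the country is not listed (so it sorts last).
    public static int Rank(IReadOnlyDictionary<string, int> rankByCountry, string country)
    {
        return rankByCountry.TryGetValue(country, out var rank) ? rank : int.MaxValue;
    }

    public static void Main()
    {
        var countryOrder = new List<string> { "UK", "KE", "AU", "USA" };

        // Build the lookup once: "UK" -> 0, "KE" -> 1, "AU" -> 2, "USA" -> 3.
        var rankByCountry = countryOrder
            .Select((country, index) => (country, index))
            .ToDictionary(x => x.country, x => x.index);

        // Illustrative data: "FR" is deliberately absent from countryOrder.
        var data = new[]
        {
            (Name: "Eve", Country: "FR", Age: 33),
            (Name: "Eve", Country: "AU", Age: 35),
        };

        var reduced = data
            .GroupBy(x => x.Name)
            .Select(g => g.OrderBy(x => Rank(rankByCountry, x.Country)).First())
            .ToList();

        foreach (var item in reduced)
        {
            Console.WriteLine($"{item.Name} {item.Country} {item.Age}"); // prints "Eve AU 35"
        }
    }
}
```

With plain countryOrder.IndexOf the "FR" row would have been chosen instead, because -1 is smaller than every valid index. The dictionary also avoids a linear search of the list for every item.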
