Home > Enterprise >  LINQ sum with grouping, but also get line item value before grouping
LINQ sum with grouping, but also get line item value before grouping

Time:09-17

I have a list of type Cars like below List<Car> cars:

ID Colour Make Model Km
1 Red BMW M3 1000
2 Red BMW 318i 1000
3 Black Mercedes C200 1000
4 Black Mercedes E200 1000
5 White Mercedes CLA200 1000
6 White Mercedes E200 1000
7 White Mercedes E200 2000
8 White Mercedes E200 3000

Now what I want to do is, using LINQ (or maybe another way if not possible with LINQ) I want to group them by Colour, Make, Model and then add another column and assign to list of another type (<CarDetail>) which will give the weighted average of the line in the group by.

As an example, on last 3 Cars, the new column will show (Total Km for that group = 1000 2000 3000 = 6000).

6 => 1000/6000
7 => 2000/6000
8 => 3000/6000

What I have tried so far is :

List<CarDetail> carDetails =
                cars.
                GroupBy(x => new
                {
                    x.Colour,
                    x.Make,
                    x.Model
                }).
            Select(h => new CarDetail
            {
                Colour= h.Key.Colour,
                Make= h.Key.Make,
                Model= h.Key.Model,                    
                WeightedKm = x.Km  / h.Sum(x => x.Km )
            }).ToList();

h.Sum(x => x.Km ) will bring the sum of Km.s but trying to use "x.Km" surely doesn't work.

Can anyone help on how to get this list please?

CodePudding user response:

A grouping is conceptually a list of lists. It seems you want to unpack this to a simple list, using an aggregation of the sub list along the way. The process for turning a list of lists into a list is a SelectMany

var carDetails = cars
        .GroupBy(c => new
        {
            c.Colour,
            c.Make,
            c.Model
        })
        .SelectMany(g => g, (subList, aCarInTheSublist) => new CarDetail
        {
            Colour= aCarInTheSublist.Colour,
            Make= aCarInTheSublist.Make,
            Model= aCarInTheSublist.Model,                    
            WeightedKm = (double)aCarInTheSublist.Km  / subList.Sum(aCar => aCar.Km ) //cast to double because int math 1000/3000 = 0
        }).ToList();

Attacking it with a Select will run into trouble because what you're selecting is the group i.e. the outer list. Taking this simpler example:

A, 1
A, 2
A, 3
B, 1

If you group by the letter, your output only has 2 rows:

A -> {A,1} {A,2} {A,3}
B -> {B,1}

so you'll struggle to get back to the 4 rows you want, because the Select is only run twice.

SelectMany will visit each entry in the group (i.e. the two groups A and B, the A group having 3 sub-entries) and visit each one of the sub entries, so for the A group (a single group with 3 members, A1, A2 and A3) it will result in projecting 3 elements out. Crucially because, whilst it visits each of A1, A2 and A3, access to the group A as a whole is still available so you can sum it all the KMs in it

It's perhaps not very efficient, because we're summing up the group repeatedly but LINQ doesn't always win efficiency contests :)

You could consider trading CPU time in summing for memory in remembering the sum by putting the sums in a dictionary:

    var d = cars
        .GroupBy(c => new
        {
            c.Colour,
            c.Make,
            c.Model
        })
        .ToDictionary(g=>g.Key, g=>g.Sum(c=>c.Km));

    var result = cars
        .Select(c => new CarDetail
        {
            Colour= c.Colour,
            Make= c.Make,
            Model= c.Model,                    
            WeightedKm = (double)c.Km  / d[new{c.Colour,c.Make,c.Model}]
        }).ToList();

Side note, when using LINQ try and keep aliases representative of what the list entry is. cars is a list of car, so Select(c => to help keep straight that it's a car.

A GroupBy(c=>...).Select(g=> reminds you that you're grouping up cars and then Select is operating on a grouping. Each item within the grouping is a Car, so you might like Select(gc=> for "group of cars", and Select(gc=> gc.Select(c=> the second select is operating on an enumerable of Car, so reverting to c helps clarify that

  • Related