LINQ Group By and merge properties

private static void Main()
{
    var accounts = new List<Account>()
    {
    new Account { PrimaryId = 1, SecondaryId = 10 },
    new Account { PrimaryId = 1, SecondaryId = 12 }
    };
}

public class Account
{
    public int PrimaryId { get; set; }
    public IList<int> SecondaryIds { get; set; }
    public int SecondaryId { get; set; }
}

Given the code above, I would like to group by PrimaryId and merge each SecondaryId into SecondaryIds.

For example:

// PrimaryId = 1
// SecondaryIds = 10, 12

How can I achieve it?

This is what I have so far:

var rs = accounts.GroupBy(g => g.PrimaryId == 1)
.Select(s => new Account 
{ PrimaryId = 1, SecondaryIds = new List<int>() { /*???*/  } });

CodePudding user response：

Here's what I'd do.

In its simplest form, LINQ's GroupBy takes a lambda that returns the key to group by, for example .GroupBy(a => a.PrimaryId). It produces something like a "list of lists", where everything in an inner list has the same key.

If you did that grouping you could explore it like this:

var listOfLists = accounts.GroupBy(a => a.PrimaryId);

foreach(var innerList in listOfLists){
  Console.WriteLine("Key is: " + innerList.Key);

  foreach(Account account in innerList)
    Console.WriteLine("Sid is: " + account.SecondaryId);
}

So it has effectively put your Account objects into something like a List<List<Account>>, except that each inner list has a Key property telling you what common value (the PrimaryId) its accounts all share.

There's an extended form of GroupBy that takes a second argument of "what item, possibly from the original objects, do you want to put into the inner list?", and we could use that like this:

.GroupBy(a => a.PrimaryId, a => a.SecondaryId)

This would create something like a List<List<int>>: instead of the whole Account object, it pulls out just the SecondaryId and puts that in the inner list. That's fine, because you can still get the PrimaryId from the Key and there isn't anything else in an Account.
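As a rough sketch of the two-argument form in action, the snippet below groups a few accounts and prints each group. The trimmed-down Account class, the ElementSelectorDemo class, the GroupSecondaryIds helper, and the extra PrimaryId = 2 row are all assumptions added for illustration; they aren't in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down Account for this sketch (assumption: only the two ids matter here)
public class Account
{
    public int PrimaryId { get; set; }
    public int SecondaryId { get; set; }
}

public static class ElementSelectorDemo
{
    // Hypothetical helper: groups by PrimaryId, keeping only the SecondaryId of each account
    public static List<IGrouping<int, int>> GroupSecondaryIds(IEnumerable<Account> accounts)
    {
        return accounts
            .GroupBy(a => a.PrimaryId, a => a.SecondaryId)
            .ToList();
    }

    public static void Main()
    {
        var accounts = new List<Account>
        {
            new Account { PrimaryId = 1, SecondaryId = 10 },
            new Account { PrimaryId = 1, SecondaryId = 12 },
            new Account { PrimaryId = 2, SecondaryId = 20 }  // extra row, not in the question
        };

        foreach (var group in GroupSecondaryIds(accounts))
        {
            // Each group is a sequence of int, not of Account
            Console.WriteLine("Key " + group.Key + ": " + string.Join(", ", group));
        }
    }
}
```

Each group here is an IGrouping<int, int>, so the inner items are plain ints and the PrimaryId is only available through Key.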

There's yet another form of GroupBy, with a third argument that takes each item in the resulting "list of lists" and lets you do something with it:

.GroupBy(
  a => a.PrimaryId,     //derive the key
  a => a.SecondaryId,   //derive the innerList
  (key, innerList) => new Account { PrimaryId = key, SecondaryIds = innerList.ToList() }
)

You can think of this third argument as being like the .Select() you've put on the end of the groupby. It's slightly different, and perhaps a bit easier to use because of that clear separation between "the key" and "the list of things with that key"

So, in that third lambda, the first argument (key) is the grouping key (the PrimaryId), and the second argument (innerList) is the list of all the SecondaryIds with that grouping key. We can use these two bits of info to make a new Account object with the PrimaryId and SecondaryIds set.
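Put together with the classes from the question, the three-argument form might look like the sketch below. The MergeDemo class and its Merge method are names made up for this example; the Account class is copied from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Account
{
    public int PrimaryId { get; set; }
    public IList<int> SecondaryIds { get; set; }
    public int SecondaryId { get; set; }
}

public static class MergeDemo
{
    // Hypothetical helper: one merged Account per distinct PrimaryId
    public static List<Account> Merge(IEnumerable<Account> accounts)
    {
        return accounts
            .GroupBy(
                a => a.PrimaryId,     // derive the key
                a => a.SecondaryId,   // derive the inner list items
                (key, innerList) => new Account { PrimaryId = key, SecondaryIds = innerList.ToList() })
            .ToList();
    }

    public static void Main()
    {
        var accounts = new List<Account>
        {
            new Account { PrimaryId = 1, SecondaryId = 10 },
            new Account { PrimaryId = 1, SecondaryId = 12 }
        };

        foreach (var merged in Merge(accounts))
        {
            Console.WriteLine("PrimaryId = " + merged.PrimaryId
                + ", SecondaryIds = " + string.Join(", ", merged.SecondaryIds));
        }
        // prints: PrimaryId = 1, SecondaryIds = 10, 12
    }
}
```

Note that the singular SecondaryId on the merged objects is simply left at its default of 0, since nothing sets it.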

I mentioned in the comments that it's weird to have a single SecondaryId and also a list of them in one object. I know why you've done it, but if you wanted to drop the singular SecondaryId and just keep the list, the GroupBy would change slightly:

.GroupBy(
  a => a.PrimaryId, 
  a => a.SecondaryIds.First(),   //take the only item
  (key, innerList) => new Account { PrimaryId = key, SecondaryIds = innerList.ToList() }
)

This would be capable of processing a single item list of secondary IDs, for example something like this:

var accounts = new List<Account>()
{
  new Account { PrimaryId = 1, SecondaryIds = new List<int> { 10 } },
  new Account { PrimaryId = 1, SecondaryIds = new List<int> { 12 } }
};

Taking a look at your attempt:

var rs = accounts.GroupBy(g => g.PrimaryId == 1)
.Select(s => new Account 
{ PrimaryId = 1, SecondaryIds = new List<int>() { /*???*/  } });

Doing .GroupBy(g => g.PrimaryId == 1) groups by an assessment of whether the PrimaryId is 1 or not. It won't filter anything down to "just ID 1": anything with PrimaryId 1 goes in one inner list and everything else goes in another, and the Key of each inner list is a bool. If your accounts list only contained Account objects with PrimaryId 1, all would be fine. If it didn't, every other account would land in a single group keyed false, and your Select would still stamp PrimaryId = 1 on it.
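To see the bool key in action, here is a small sketch. The PredicateKeyDemo class, the Describe helper, the trimmed-down Account class, and the accounts with PrimaryId 2 and 3 are all invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down Account for this sketch (assumption: only the two ids matter here)
public class Account
{
    public int PrimaryId { get; set; }
    public int SecondaryId { get; set; }
}

public static class PredicateKeyDemo
{
    // Hypothetical helper: describes each group produced by grouping on PrimaryId == 1
    public static List<string> Describe(IEnumerable<Account> accounts)
    {
        return accounts
            .GroupBy(a => a.PrimaryId == 1)   // the key is a bool, not an int
            .Select(g => g.Key.ToString() + ": " + string.Join(", ", g.Select(a => a.PrimaryId)))
            .ToList();
    }

    public static void Main()
    {
        var accounts = new List<Account>
        {
            new Account { PrimaryId = 1, SecondaryId = 10 },
            new Account { PrimaryId = 1, SecondaryId = 12 },
            new Account { PrimaryId = 2, SecondaryId = 20 },  // made-up extra rows
            new Account { PrimaryId = 3, SecondaryId = 30 }
        };

        foreach (var line in Describe(accounts))
            Console.WriteLine(line);
        // prints:
        // True: 1, 1
        // False: 2, 3
    }
}
```

If the goal really is "only PrimaryId 1", a .Where(a => a.PrimaryId == 1) before the grouping is the usual way to filter; grouping on the int key handles every PrimaryId at once.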

.Select(s => new Account 
{ PrimaryId = 1, SecondaryIds = new List<int>() { /*???*/  } });

Your GroupBy() with one argument has produced something like a List<List<Account>>, so s is the inner list. Strictly speaking it's an IGrouping<bool, Account>, which is like a list of Account objects but with that extra Key property that tells you what all the items in the grouping have in common.

So this .Select gives you s, and s is a list of Account. Because you want just the SecondaryId out of each account in that list, you need another Select, this time on s, to pull the SecondaryId out of each account within it:

.Select(s => new Account 
{ 
  PrimaryId = 1, 
  SecondaryIds = s.Select(acc => acc.SecondaryId)
});

That doesn't compile yet, because Select returns an IEnumerable<int> and SecondaryIds is an IList<int>, so you then want to call ToList() on that Select:

.Select(s => new Account 
{ 
  PrimaryId = 1, 
  SecondaryIds = s.Select(acc => acc.SecondaryId).ToList()
});

If I were doing it that way, I'd write:

.GroupBy(a => a.PrimaryId)
.Select(g => new Account 
{ 
  PrimaryId = g.Key, 
  SecondaryIds = g.Select(acc => acc.SecondaryId).ToList()
});

But the three-argument GroupBy is a bit cleaner.

CodePudding user response：

When using .GroupBy() followed by .Select(), you need to drill down one level inside your grouped items:

var rs = accounts
    .GroupBy(a => a.PrimaryId)
    .Select(gr => new Account {
        PrimaryId = gr.Key,
        SecondaryIds = gr.Select(account => account.SecondaryId).ToList()
    });