C# select duplicates from sub-group

Time:10-28

I would like to filter out entries with an empty ("") Name, then, for each unique location, select the entries that have duplicate IDs, regardless of name:

Data Setup

var list = new[]
{
    new { id = 3, Name = "", Location = "LocationA" },
    new { id = 2, Name = "", Location = "LocationA" },
    new { id = 1, Name = "T", Location = "LocationB" },
    new { id = 2, Name = "H", Location = "LocationB" },
    new { id = 3, Name = "E", Location = "LocationB" },
    new { id = 3, Name = "R", Location = "LocationB" },
    new { id = 5, Name = "U", Location = "LocationC" },
    new { id = 5, Name = "S", Location = "LocationC" },
    new { id = 5, Name = "S", Location = "LocationD" },
    new { id = 4, Name = "O", Location = "LocationD" },
    new { id = 4, Name = "Z", Location = "LocationE" },
};

Query

var query1 = list
    .Where(s => s.Name != "")
    .GroupBy(g => g.Location)
    .Where(w => w.Select(s => s.Location).Count() > 1)
    .SelectMany(s => s)
    .GroupBy(g => g.id)
    .Where(w => w.Select(s => s.id).Count() > 1)
    .SelectMany(s => s)
    .ToList();
Console.WriteLine("output\n" + string.Join("\n", query1));

Returns

{ id = 3, Name = E, Location = LocationB }
{ id = 3, Name = R, Location = LocationB }
{ id = 5, Name = U, Location = LocationC }
{ id = 5, Name = S, Location = LocationC }
{ id = 5, Name = S, Location = LocationD }

What I actually wanted

{ id = 3, Name = E, Location = LocationB }
{ id = 3, Name = R, Location = LocationB }
{ id = 5, Name = U, Location = LocationC }
{ id = 5, Name = S, Location = LocationC }

LocationD has IDs 4 and 5, with no ID repeated inside it, so its rows should have been filtered out, but I wasn't able to do that. What am I doing wrong, and how do I correct it?

CodePudding user response:

Given

var list = new[]
{
   new { id = 3, Name = "", Location = "LocationA" },
   new { id = 2, Name = "", Location = "LocationA" },
   new { id = 1, Name = "T", Location = "LocationB" },
   new { id = 2, Name = "H", Location = "LocationB" },
   new { id = 3, Name = "E", Location = "LocationB" },
   new { id = 3, Name = "R", Location = "LocationB" },
   new { id = 5, Name = "U", Location = "LocationC" },
   new { id = 5, Name = "S", Location = "LocationC" },
   new { id = 5, Name = "S", Location = "LocationD" },
   new { id = 4, Name = "O", Location = "LocationD" },
   new { id = 4, Name = "Z", Location = "LocationE" },
};

Example

var results = list
   .Where(s => s.Name != "")
   .GroupBy(x => new {x.id, x.Location})
   .Where(g => g.Count() > 1)
   .SelectMany(y => y);

foreach (var result in results)
   Console.WriteLine($"{result.id}, {result.Name}, {result.Location}");

Output

3, E, LocationB
3, R, LocationB
5, U, LocationC
5, S, LocationC
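
As for what went wrong in the question's query, a likely explanation is that SelectMany flattens the Location groups before the second GroupBy runs, so the id grouping sees rows from every surviving location at once. The sketch below reruns the question's pipeline up to the id grouping and prints which locations end up in each id group. The variable name idGroups is only illustrative, and the first filter is written as g.Count() > 1, which should behave the same as the question's w.Select(s => s.Location).Count() > 1.

```csharp
using System;
using System.Linq;

// Same sample data as in the question.
var list = new[]
{
    new { id = 3, Name = "", Location = "LocationA" },
    new { id = 2, Name = "", Location = "LocationA" },
    new { id = 1, Name = "T", Location = "LocationB" },
    new { id = 2, Name = "H", Location = "LocationB" },
    new { id = 3, Name = "E", Location = "LocationB" },
    new { id = 3, Name = "R", Location = "LocationB" },
    new { id = 5, Name = "U", Location = "LocationC" },
    new { id = 5, Name = "S", Location = "LocationC" },
    new { id = 5, Name = "S", Location = "LocationD" },
    new { id = 4, Name = "O", Location = "LocationD" },
    new { id = 4, Name = "Z", Location = "LocationE" },
};

// The question's pipeline, stopped just before the final SelectMany.
var idGroups = list
    .Where(s => s.Name != "")
    .GroupBy(x => x.Location)
    .Where(g => g.Count() > 1)   // only checks that a location has more than one row
    .SelectMany(g => g)          // flattening discards the location boundaries
    .GroupBy(x => x.id)          // so this groups ids across all remaining locations
    .Where(g => g.Count() > 1);

foreach (var g in idGroups)
    Console.WriteLine($"id {g.Key}: {string.Join(", ", g.Select(x => x.Location))}");

// Prints:
// id 3: LocationB, LocationB
// id 5: LocationC, LocationC, LocationD
```

The id 5 group spans LocationC and LocationD, which is why the LocationD row survives. Grouping on the composite key { id, Location } keeps the two locations in separate groups.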

CodePudding user response:

Group by both id and Location, then keep only the groups whose .Count() is greater than 1.

var query1 = list
    .Where(s => s.Name != "")
    .GroupBy(g => new { g.Location, g.id })
    .Where(g => g.Count() > 1)
    .SelectMany(g => g)
    .ToList();
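
If you would rather keep the two-step shape of the original query, one alternative (not the composite-key approach above, but a nested grouping) is to group by Location first and then look for duplicate ids inside each location group. This is only a sketch against the question's sample data. The names duplicates and locations are illustrative, and the Distinct() step assumes that "each unique location" means you may also want the location names on their own.

```csharp
using System;
using System.Linq;

// Same sample data as in the question.
var list = new[]
{
    new { id = 3, Name = "", Location = "LocationA" },
    new { id = 2, Name = "", Location = "LocationA" },
    new { id = 1, Name = "T", Location = "LocationB" },
    new { id = 2, Name = "H", Location = "LocationB" },
    new { id = 3, Name = "E", Location = "LocationB" },
    new { id = 3, Name = "R", Location = "LocationB" },
    new { id = 5, Name = "U", Location = "LocationC" },
    new { id = 5, Name = "S", Location = "LocationC" },
    new { id = 5, Name = "S", Location = "LocationD" },
    new { id = 4, Name = "O", Location = "LocationD" },
    new { id = 4, Name = "Z", Location = "LocationE" },
};

// Group by Location, then search for repeated ids inside each location only.
var duplicates = list
    .Where(s => s.Name != "")
    .GroupBy(x => x.Location)
    .SelectMany(loc => loc
        .GroupBy(x => x.id)
        .Where(ids => ids.Count() > 1)
        .SelectMany(ids => ids))
    .ToList();

// Distinct location names that contain at least one duplicated id.
var locations = duplicates.Select(x => x.Location).Distinct().ToList();

Console.WriteLine(string.Join("\n", duplicates));
Console.WriteLine(string.Join(", ", locations));

// Prints:
// { id = 3, Name = E, Location = LocationB }
// { id = 3, Name = R, Location = LocationB }
// { id = 5, Name = U, Location = LocationC }
// { id = 5, Name = S, Location = LocationC }
// LocationB, LocationC
```

For this data the result matches the composite-key query. The nested form is longer, but it leaves room for per-location logic if you need it later.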
