C# LINQ find duplicates in List and remove one

Time:11-12

How can I find the duplicate items in the list below and keep only one entry per item?

var mylist = new List<string>(){
   "itemA.config",
   "itemA.en-us.config",
   "itemB.config",
   "itemC.config", 
   "itemC.en-us.config",
   "itemC.fa-ir.config"
};

If an item has a "*.fa-ir.config" entry, keep that one. Otherwise, keep the plain "*.config" entry. Expected result:

var mylist = new List<string>(){
   "itemA.config",
   "itemB.config",
   "itemC.fa-ir.config"
};

CodePudding user response:

You can GroupBy the name prefix (itemA, itemB, etc.) and then pick one item from each group:

mylist = mylist
  .GroupBy(item => item.Substring(0, item.IndexOf('.')))
  // Prefer the fa-ir variant, then the culture-neutral "<key>.config", then any item
  .Select(group => group.FirstOrDefault(item => item.EndsWith(".fa-ir.config"))
                ?? group.FirstOrDefault(item => item == group.Key + ".config")
                ?? group.First())
  .ToList();
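
As a rough, self-contained sketch of the GroupBy approach, run against the sample list from the question: it assumes every name has the form `<name>.config` or `<name>.<culture>.config`. The `KeyOf` helper is a hypothetical name introduced here (not part of the answer) so that a name without a dot does not throw.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var mylist = new List<string>
{
    "itemA.config",
    "itemA.en-us.config",
    "itemB.config",
    "itemC.config",
    "itemC.en-us.config",
    "itemC.fa-ir.config"
};

// Hypothetical helper: the text before the first dot,
// or the whole name if it contains no dot.
string KeyOf(string item)
{
    int dot = item.IndexOf('.');
    return dot < 0 ? item : item.Substring(0, dot);
}

var result = mylist
    .GroupBy(item => KeyOf(item))
    .Select(group =>
        // 1. Prefer the fa-ir variant if the group has one.
        group.FirstOrDefault(item => item.EndsWith(".fa-ir.config", StringComparison.OrdinalIgnoreCase))
        // 2. Otherwise take the culture-neutral "<key>.config".
        ?? group.FirstOrDefault(item => item.Equals(group.Key + ".config", StringComparison.OrdinalIgnoreCase))
        // 3. Otherwise fall back to the first item in the group.
        ?? group.First())
    .ToList();

Console.WriteLine(string.Join(", ", result));
// prints: itemA.config, itemB.config, itemC.fa-ir.config
```

Matching the neutral entry against `group.Key + ".config"` rather than `EndsWith(".config")` matters because every culture-specific name also ends with ".config".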

CodePudding user response:

I think this is the logic you want (explanation below):

mylist = mylist 
    .Select(s => (Name:s, Culture:GetConfigCulture(s, out string[] tokens), Tokens:tokens))
    .GroupBy(x => x.Tokens.First(), StringComparer.InvariantCultureIgnoreCase)
    .Select(g => g.OrderBy(x => GetOrder(x.Culture)).First().Name)
    .ToList();

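// Splits the name on '.' and returns the culture token,
// or null when the name has no culture part (e.g. "itemA.config").
string GetConfigCulture(string name, out string[] tokens)
{
    tokens = name.Split('.');
    if(tokens.Length < 3) return null;
    return tokens[1];
}

// Sort key: "fa-ir" first, then no culture, then any other culture.
int GetOrder(string culture)
{
    if(StringComparer.InvariantCultureIgnoreCase.Equals(culture, "fa-ir")) return 0;
    if(StringComparer.InvariantCultureIgnoreCase.Equals(culture, null)) return 1;
    return 2;
}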

So you want to remove duplicates according to the first token of the config name/path. If a group contains both a culture-specific entry and a "global" one, you want to take the one without a culture. If a group contains the "fa-ir" culture, that entry has priority and should be taken instead.

So you need to group by the first token, order the items in each group by this priority, and finally take the first item of each group, which discards the other duplicates.
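
The group, order, take-first steps above can be sketched in a compact form as follows. This is only an illustration under the same assumptions (names shaped like `<name>.config` or `<name>.<culture>.config`); `Priority` and the sample list `names` are hypothetical and not part of the answer. The input is deliberately shuffled to show that the result does not depend on where the neutral entry sits in the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var names = new List<string>
{
    "itemC.config",
    "itemC.en-us.config",
    "itemC.fa-ir.config",
    "itemA.en-us.config",
    "itemA.config"
};

// Hypothetical helper: a lower number means a higher priority.
int Priority(string name)
{
    string[] tokens = name.Split('.');
    if (tokens.Length < 3) return 1;   // no culture token: the "global" config
    return tokens[1].Equals("fa-ir", StringComparison.OrdinalIgnoreCase) ? 0 : 2;
}

var kept = names
    // Group by the first token, ignoring case.
    .GroupBy(n => n.Split('.')[0], StringComparer.OrdinalIgnoreCase)
    // Within each group, sort by priority and keep only the best entry.
    .Select(g => g.OrderBy(n => Priority(n)).First())
    .ToList();

Console.WriteLine(string.Join(", ", kept));
// prints: itemC.fa-ir.config, itemA.config
```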
