Group a list of strings by a specific substring

Time:10-31

I have a list of emails and I want to group the list by domain name, that is, by the substring the items have in common (e.g. @gmail.com or @yahoo.com).

Please note that I don't want to handle just Gmail and Yahoo, because there are many other domains, such as @yahoo.fr or @hotmail.com.

After that, I want to add the items of each group to their own separate list of strings.

e.g.:

List<string> emails = new List<string>
{
"[email protected]",
"[email protected]",
"[email protected]",
"[email protected]"
};

I tried GroupBy with a regex as the parameter, but it didn't work.

CodePudding user response:

You could use this to get a Dictionary of domain to email addresses:

Dictionary<string, List<string>> emailsByDomain = emails
    .GroupBy(e => e.Split('@', 2)[1].ToLowerInvariant())
    .ToDictionary(g => g.Key, g => g.ToList());

Basically we split the email address at the "@", take the second half, convert it to lowercase, and then use that as the dictionary key.

If there's a possibility that some items in your list are not email addresses (no @), you can filter those out with a .Where:

Dictionary<string, List<string>> emailsByDomain = emails
    .Where(e => e.IndexOf('@', StringComparison.Ordinal) >= 0)
    .GroupBy(e => e.Split('@', 2)[1].ToLowerInvariant())
    .ToDictionary(g => g.Key, g => g.ToList());

Note that, depending on your version of .NET, the Split(char, int) overload might not be available (for example on .NET Framework). In that case, change the .GroupBy line to this: .GroupBy(e => e.Split(new string[] { "@" }, 2, StringSplitOptions.None)[1].ToLowerInvariant())
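
The question also asks for each group in its own separate list. A minimal sketch of how that could be read off the dictionary, assuming a recent .NET with top-level statements; the sample addresses and the names separateLists and yahooFr are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up sample addresses, for illustration only.
var emails = new List<string>
{
    "a@gmail.com",
    "b@yahoo.fr",
    "c@YAHOO.fr",
    "d@hotmail.com"
};

// Same grouping as above: the key is the lower-cased text after the first '@'.
Dictionary<string, List<string>> emailsByDomain = emails
    .Where(e => e.Contains('@'))
    .GroupBy(e => e.Split('@', 2)[1].ToLowerInvariant())
    .ToDictionary(g => g.Key, g => g.ToList());

// One separate list per domain, in no guaranteed order.
List<List<string>> separateLists = emailsByDomain.Values.ToList();

// A single domain's list; TryGetValue avoids an exception for unknown domains.
if (emailsByDomain.TryGetValue("yahoo.fr", out var yahooFr))
{
    Console.WriteLine(string.Join(", ", yahooFr)); // prints "b@yahoo.fr, c@YAHOO.fr"
}
```

Because the key is lower-cased, "c@YAHOO.fr" lands in the same list as "b@yahoo.fr", while the addresses themselves keep their original casing.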


CodePudding user response:

Fun one. @ProgrammingLlama has a good idea with the Dictionary.

using System.Text.Json;

List<string> emails = new()
{
    "[email protected]",
    "[email protected]",
    "[email protected]",
    "notanemail",
    "[email protected]",
    "[email protected]"
};

var groupedEmails = from email in emails
                    where email.Contains("@")
                    let domain = email.Substring(email.IndexOf("@") + 1)
                    group email by domain into g
                    select new { Domain = g.Key, Emails = g };

var dictionaryEmails = groupedEmails.ToDictionary(x => x.Domain, x => x.Emails);

System.Console.WriteLine(JsonSerializer.Serialize(dictionaryEmails, new JsonSerializerOptions { WriteIndented = true }));

Result

{
  "gmail.com": [
    "[email protected]"
  ],
  "yahoo.fr": [
    "[email protected]",
    "[email protected]"
  ],
  "tafmail.com": [
    "[email protected]"
  ],
  "mail.ru": [
    "[email protected]"
  ]
}

CodePudding user response:

You could also try .ToLookup(), passing a key selector that extracts the domain either with a regex or with the substring after the index of @.
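
A rough sketch of the substring variant, assuming a recent .NET with top-level statements; the sample addresses and the names byDomain and hotmailCount are made up for illustration:

```csharp
using System;
using System.Linq;

// Made-up sample data, for illustration only.
string[] emails = { "a@gmail.com", "b@yahoo.fr", "notanemail", "c@yahoo.fr" };

// Key selector: everything after the first '@', lower-cased.
ILookup<string, string> byDomain = emails
    .Where(e => e.Contains('@'))
    .ToLookup(e => e.Substring(e.IndexOf('@') + 1).ToLowerInvariant());

// A lookup enumerates as a sequence of groups, one per key.
foreach (IGrouping<string, string> group in byDomain)
{
    Console.WriteLine($"{group.Key}: {string.Join(", ", group)}");
}

// Indexing a key that is absent yields an empty sequence instead of throwing.
int hotmailCount = byDomain["hotmail.com"].Count();
Console.WriteLine(hotmailCount); // prints 0
```

Unlike a Dictionary, a lookup returns an empty sequence for a missing key, which can save a TryGetValue check when probing for domains that may not be present.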
