C# How to know which elements of a list are substrings of a string?

Time:05-24

If I have a list of strings like

var MyList = new List<string>
{
    "substring1", "substring2", "substring3", "substring4", "substring5"
};

is there an efficient way to determine which elements of that list are contained in the following string?

"substring1 is the substring2 document that was processed electronically"

In this case the result should be

var MySubList = new List<string>
{
    "substring1", "substring2"
};

CodePudding user response:

We can use LINQ's Where to check, for each element of the list, whether the large string Contains it:

var MyList = new List<string>
{
    "substring1", "substring2", "substring3", "substring4", "substring5"
};

var Text = "substring1 is the substring2 document that was processed electronically";

var output = MyList.Where(x => Text.Contains(x)).ToList();
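
Contains with a single string argument performs an ordinal, case-sensitive comparison. If casing should be ignored, a minimal sketch could look like the following. It assumes .NET Core 2.1 or later, where the Contains(string, StringComparison) overload is available, and the mixed-case sample text is made up for illustration rather than taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var myList = new List<string>
{
    "substring1", "substring2", "substring3", "substring4", "substring5"
};

// Hypothetical input with different casing, not from the original question.
var text = "SubString1 is the SUBSTRING2 document that was processed electronically";

// Keep each list element that occurs somewhere in text, ignoring case.
var output = myList
    .Where(x => text.Contains(x, StringComparison.OrdinalIgnoreCase))
    .ToList();

Console.WriteLine(string.Join(", ", output)); // substring1, substring2
```

On older frameworks without that overload, text.IndexOf(x, StringComparison.OrdinalIgnoreCase) >= 0 should give the same result.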

CodePudding user response:

  1. Split the Text on whitespace
  2. Sort the words alphabetically
  3. Create a unique list from that
var words = Text.Split(" ").OrderBy(word => word).Distinct().ToList();
  4. Create an accumulator collection for the matches
  5. Create two index variables (one for the words, one for the patterns)
List<string> matches = new();
int patternIdx = 0, wordIdx = 0;
  6. Iterate through the lists until you reach the end of one of them
while (patternIdx < patterns.Count && wordIdx < words.Count)
{
    // steps 7 and 8 go here
}
  7. Perform a string comparison
  8. Advance the index variable(s) based on the comparison result
int comparison = string.Compare(patterns[patternIdx], words[wordIdx]);
switch (comparison)
{
    case > 0: wordIdx++; break;
    case < 0: patternIdx++; break;
    default:
    {
        matches.Add(patterns[patternIdx]);
        wordIdx++;
        patternIdx++;
        break;
    }
}

Here I've used relational patterns in a switch statement, a feature introduced in C# 9.
If you can't use C# 9, an if ... else if ... else block works just as well.
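
For reference, the if ... else if ... else alternative might be sketched as below. The Program wrapper and the FindMatches helper name are assumptions added for illustration. The sketch also avoids other C# 9 features (target-typed new, top-level statements) so it should compile with older language versions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    // Hypothetical helper: walks two sorted lists in lockstep and collects equal entries.
    public static List<string> FindMatches(List<string> patterns, List<string> words)
    {
        var matches = new List<string>();
        int patternIdx = 0, wordIdx = 0;
        while (patternIdx < patterns.Count && wordIdx < words.Count)
        {
            int comparison = string.Compare(patterns[patternIdx], words[wordIdx]);
            if (comparison > 0)
            {
                wordIdx++;      // pattern sorts after the word: try the next word
            }
            else if (comparison < 0)
            {
                patternIdx++;   // pattern sorts before the word: try the next pattern
            }
            else
            {
                matches.Add(patterns[patternIdx]);  // equal: record the match
                wordIdx++;
                patternIdx++;
            }
        }
        return matches;
    }

    public static void Main()
    {
        var text = "substring1 is the substring2 document that was processed electronically";
        var words = text.Split(' ').OrderBy(w => w).Distinct().ToList();
        var patterns = new List<string> { "substring1", "substring2", "substring3", "substring4", "substring5" };

        Console.WriteLine(string.Join(", ", FindMatches(patterns, words))); // substring1, substring2
    }
}
```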


For the sake of completeness, here is the whole code:

var Text = "substring1 is the substring2 document that was processed electronically";
var words = Text.Split(" ").OrderBy(x => x).Distinct().ToList();
var patterns = new List<string> {  "substring1", "substring2", "substring3", "substring4", "substring5" };

List<string> matches = new();
int patternIdx = 0, wordIdx = 0;
while(patternIdx < patterns.Count && wordIdx < words.Count)
{
    int comparison = string.Compare(patterns[patternIdx], words[wordIdx]);
    switch(comparison)
    {
        case > 0: wordIdx++; break;
        case < 0: patternIdx++; break;
        default: 
        {
            matches.Add(patterns[patternIdx]); 
            wordIdx++;
            patternIdx++;
            break;
        }
    }
}
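
One thing to keep in mind: the loop only works if patterns is sorted in the same order as words, which happens to be true for the sample list. A hedged variant that sorts both sides itself, using an ordinal comparer so that sorting and comparing agree, might look like this. The shuffled pattern order and the sortedWords / sortedPatterns names are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var text = "substring1 is the substring2 document that was processed electronically";

// Deliberately unsorted (hypothetical ordering) to show why the sort is needed.
var patterns = new List<string> { "substring4", "substring2", "substring5", "substring1", "substring3" };

// Sort both sides with the same ordinal comparer so the merge walk is consistent.
var sortedWords = text.Split(' ').Distinct().OrderBy(w => w, StringComparer.Ordinal).ToList();
var sortedPatterns = patterns.Distinct().OrderBy(p => p, StringComparer.Ordinal).ToList();

List<string> matches = new();
int patternIdx = 0, wordIdx = 0;
while (patternIdx < sortedPatterns.Count && wordIdx < sortedWords.Count)
{
    // CompareOrdinal matches the ordering used by StringComparer.Ordinal above.
    int comparison = string.CompareOrdinal(sortedPatterns[patternIdx], sortedWords[wordIdx]);
    switch (comparison)
    {
        case > 0: wordIdx++; break;
        case < 0: patternIdx++; break;
        default:
            matches.Add(sortedPatterns[patternIdx]);
            wordIdx++;
            patternIdx++;
            break;
    }
}

Console.WriteLine(string.Join(", ", matches)); // substring1, substring2
```

Like the original loop, this only reports patterns that equal a whole whitespace-separated word. A pattern embedded inside a longer word is not found, which differs from the Contains approach.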

