I tried this code:
string Input = TextBox1.Text;
string[] splitX = Regex.Split(Input, @"(?<=[|if|and|but|so|when|])");
The regular expression @"(?<=[.?!])" is often used to split a text into sentences, but I need to use words as delimiters to split the text.
CodePudding user response:
It looks like you're trying to use a character set where you should be using a group with alternation. The [] characters define a character set, which matches any single one of the enclosed characters. For example, in the other regex you provided, [.?!] matches either ., ?, or ! (the period does not need escaping there, because inside a character set . is a literal period rather than "any character"). Thus, your regex is trying to match the single characters |, i, f, and so on. Duplicate characters in a set, like your two n characters and multiple | characters, are simply redundant, but the point is that this is the wrong regex construct to use.
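To see the character-set behavior directly, here is a minimal sketch that tests the bracketed part of the question's pattern on its own. The class name and the sample inputs are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

class CharacterSetDemo
{
    static void Main()
    {
        // The bracketed pattern from the question, without the lookbehind.
        var set = new Regex(@"[|if|and|but|so|when]");

        // IsMatch reports whether the input contains at least one character from the set.
        Console.WriteLine(set.IsMatch("|"));   // True: '|' is a literal inside [...]
        Console.WriteLine(set.IsMatch("w"));   // True: 'w' is one of the letters of "when"
        Console.WriteLine(set.IsMatch("xyz")); // False: none of x, y, z is in the set
    }
}
```

A single letter such as w is enough to match, which is why the original pattern splits the text after individual characters instead of after whole words.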
The solution is simple: replace your square brackets with parentheses. This turns that section of the regex into a group, which can list several alternatives separated by |. You should also only put the | between alternatives, so remove the first and last one. Because Regex.Split also returns any text captured by a capturing group, a non-capturing group (?:...) keeps the delimiter words from showing up a second time in the result. The corrected regex would be:
(?<=(?:if|and|but|so|when))
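As a rough sketch of how this could look in a complete program: the sample sentence is hypothetical (the question reads from TextBox1.Text), and the \b word boundaries and the non-capturing group are additions that are not part of the regex above. The boundaries are there so that words such as "sofa" or "band" are not treated as delimiters.

```csharp
using System;
using System.Text.RegularExpressions;

class SplitAfterWords
{
    static void Main()
    {
        // Hypothetical sample input; the question uses TextBox1.Text instead.
        string input = "call me if you can and stay home when it rains";

        // Lookbehind: split at the position right after each whole delimiter word.
        // (?:...) is non-capturing, so Regex.Split does not add the words as extra entries.
        string[] parts = Regex.Split(input, @"(?<=\b(?:if|and|but|so|when)\b)");

        foreach (string part in parts)
            Console.WriteLine($"[{part}]");
        // Prints:
        // [call me if]
        // [ you can and]
        // [ stay home when]
        // [ it rains]
    }
}
```

Since the lookbehind is zero-width, each delimiter word stays at the end of the piece before the split point. Call Trim() on each part if the leading spaces are unwanted.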
CodePudding user response:
The question isn't tagged as a regex question, and you don't say that the split has to happen inside a regex operation, so a plain string split may be enough. You wrote:
"But I need to use words as a delimiter to split the text."
Multiple words can be used as delimiters to identify where you want to split up your string like so:
string[] delimiters = {"if", "and", "but", "so", "when" };
var parts = srcString.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
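So perhaps this approach gets you where you need to be, or perhaps a combination of approaches works better (regex first, then this string split technique). Keep in mind that String.Split removes the delimiters from the result and also matches them inside longer words.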
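For reference, a small self-contained version of the snippet above. The sample sentence is hypothetical and stands in for srcString, and the Trim() call is an addition for cleaner output.

```csharp
using System;

class SplitOnWords
{
    static void Main()
    {
        // Hypothetical sample input standing in for srcString.
        string srcString = "call me if you can and stay home when it rains";

        string[] delimiters = { "if", "and", "but", "so", "when" };
        string[] parts = srcString.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);

        // Trim the spaces left around each piece before printing.
        foreach (string part in parts)
            Console.WriteLine($"[{part.Trim()}]");
        // Prints:
        // [call me]
        // [you can]
        // [stay home]
        // [it rains]
    }
}
```

Unlike the lookbehind regex, this drops the delimiter words from the output. It would also split "sofa" into "fa", so padding the delimiters with spaces (for example " so ") may be worth considering if the text contains such words.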