I have this regex: 1(0+)1 and I'm testing it against this value: 1010010001. I'm expecting three matches: 101, 1001, 10001, but the matches I get are only 101 and 10001. The second 1 should be part of two matches: 101 and 1001.
Is there a way to make a character part of two matches?
https://regex101.com/r/oNfdFU/1
CodePudding user response:
Here is a way to do this in C#. We can find all matches of the regex pattern 10+(?=1). This uses a lookahead at the end of the pattern to assert a trailing 1, but note that the trailing 1 is not consumed by the match, so it remains available as the start of the following match. Then, to build the actual output, we append a 1 to the end of each match.
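using System;
using System.Text.RegularExpressions;

string input = "1010010001";
// 10+ consumes the leading 1 and the zeros; (?=1) only asserts the closing 1.
Regex regex = new Regex(@"10+(?=1)");
MatchCollection matches = regex.Matches(input);
foreach (Match match in matches)
{
    // Append the closing 1 that the lookahead asserted but did not consume.
    Console.WriteLine("Found a match: {0}1", match.Value);
}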
This prints:
Found a match: 101
Found a match: 1001
Found a match: 10001
CodePudding user response:
You can use a capturing group inside a lookahead to capture overlapping matches.
In this case:
(?=(10+1))
Here the lookahead looks ahead and captures in one step, without consuming any characters. Because the match itself is empty, the engine then advances by one character and tries again, so it moves through the string and succeeds at each left-hand 1.
The same regex works in C#:
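using System;
using System.Text.RegularExpressions;

string input = "1010010001";
// The overall match is empty; group 1 holds the text seen by the lookahead.
Regex regex = new Regex(@"(?=(10+1))");
MatchCollection matches = regex.Matches(input);
foreach (Match match in matches)
{
    Console.WriteLine("Found a match: {0}", match.Groups[1].Value);
}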
Prints:
Found a match: 101
Found a match: 1001
Found a match: 10001
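To see why the same 1 can show up in two results, it may help to print where each match starts and how long it is. This is only a minimal sketch using the same input; the variable names (found, m) are illustrative and not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "1010010001";
var found = new List<string>();

// Each overall match is zero-length; only group 1 carries text.
foreach (Match m in Regex.Matches(input, @"(?=(10+1))"))
{
    found.Add($"{m.Index}:{m.Length}:{m.Groups[1].Value}");
    Console.WriteLine("index {0}, length {1}, group 1 = {2}",
        m.Index, m.Length, m.Groups[1].Value);
}
```

The matches start at indexes 0, 2 and 5 and all have length 0, so nothing is consumed. The 1 at index 2 closes the first group (101) and opens the second (1001).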
BTW: .NET allows variable-width lookbehinds, so you can also do:
(?<=(10+1))
But that is fairly unique to .NET. PCRE and most other regex flavors require fixed-length patterns in lookbehinds.
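A minimal sketch of the lookbehind variant, assuming .NET and the same input as before (the names found and m are illustrative). The match positions differ from the lookahead version, since each empty match sits just after a closing 1, but group 1 should capture the same three strings.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string input = "1010010001";
var found = new List<string>();

// Zero-width lookbehind: succeeds at the position right after each closing 1,
// and group 1 captures the text that precedes that position.
foreach (Match m in Regex.Matches(input, @"(?<=(10+1))"))
{
    found.Add(m.Groups[1].Value);
    Console.WriteLine("Found a match: {0}", m.Groups[1].Value);
}
```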