In my regex pattern, I would like to make sure a certain substring only occurs once in between two other substrings.
So, let's take for example these strings:
string_a = "this and that"
string_b = "this and and that"
I want to return a match for string_a but not for string_b, because 'and' occurs twice there between this/that. I would do that with a negative lookahead-tempered dot:
my_pattern = "this(?:(?!and.*and).)*that"
This matches string_a and not string_b, so far so good.
However, the following string is also not matched (just like string_b):
string_c = "this and that and"
Evidently, the .* inside the negative lookahead scans all the way to the end of the string, rather than only between "this" and "that" as I had anticipated and hoped.
How can I do this instead?
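For reference, a minimal reproduction of the behaviour described above. It assumes Python's re module (no regex flavor is named here, so that is an assumption), and the results dict is only an illustrative helper:

```python
import re

# Pattern from the question: the unrestricted .* inside the lookahead
# is free to scan past "that" up to the end of the line.
my_pattern = r"this(?:(?!and.*and).)*that"

string_a = "this and that"
string_b = "this and and that"
string_c = "this and that and"

# Map each test string to whether the pattern finds a match in it.
results = {s: bool(re.search(my_pattern, s)) for s in (string_a, string_b, string_c)}

for s, matched in results.items():
    print(f"{s!r}: {matched}")
# 'this and that': True
# 'this and and that': False
# 'this and that and': False   (the unwanted result)
```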
CodePudding user response:
You can use another tempered greedy token to temper the .* inside the lookahead:
this(?:(?!this|that|and(?:(?!that).)*?and).)*?that
See the regex demo.
Details:
- this - a fixed string
- (?:(?!this|that|and(?:(?!that).)*?and).)*? - any char other than line break chars, zero or more but as few as possible occurrences, that does not start a this or that char sequence, nor a pattern that matches and, then any chars other than line break chars (zero or more, as few as possible) that do not start a that char sequence, and then another and
- that - a fixed string.
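The pattern can be checked against the three strings from the question with a short script. This is only a sketch assuming Python's re module (the thread does not name a regex flavor), and the names fixed_pattern, samples and matches are made up for illustration:

```python
import re

# The inner (?:(?!that).)*? keeps the lookahead from searching for a
# second "and" beyond the closing "that".
fixed_pattern = re.compile(r"this(?:(?!this|that|and(?:(?!that).)*?and).)*?that")

samples = ["this and that", "this and and that", "this and that and"]

# Record the matched text for each sample, or None when there is no match.
matches = {}
for s in samples:
    m = fixed_pattern.search(s)
    matches[s] = m.group(0) if m else None
    print(f"{s!r}: {matches[s]!r}")
# 'this and that': 'this and that'
# 'this and and that': None
# 'this and that and': 'this and that'
```

Note that the dot does not match line breaks by default, so "this" and "that" must sit on the same line unless the DOTALL flag is added.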