I am running into an issue with a regex search in Python.
I have:
testVariable = re.findall(r'functest(.*?)1', 'functest exampleOne [2] functest exampleTwo [1] functest exampleOne throw [2] functest exampleThree [1]')
The current output is:
[' exampleOne [2] functest exampleTwo [', ' exampleOne throw [2] functest exampleThree [']
But what I want is to find all occurrences between 'functest' and '1' (or 2, or 3, depending on need), so the output should be:
['exampleTwo [', 'exampleThree [']
because both of these lie between functest and 1, which is what I need. Does anyone have an idea?
CodePudding user response:
I found a way by using the following. It still includes functest in each match, but at least it does the job:
testVariable = re.findall(r'functest(?:(?!functest).)*?239', 'functest exampleOne [2] functest exampleTwo [239] functest exampleOne throw [2] functest exampleThree [1] functest exampleFour [2] functest exampleFive [239]')
Output: ['functest exampleTwo [239', 'functest exampleFive [239']
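If the leading functest is unwanted, one possible variation (a sketch, not part of the original answer) is to wrap the tempered part in a capturing group. With exactly one capturing group in the pattern, re.findall returns only the group contents. The added \s* and the variable names s and matches are assumptions made for illustration.

```python
import re

s = ("functest exampleOne [2] functest exampleTwo [239] "
     "functest exampleOne throw [2] functest exampleThree [1] "
     "functest exampleFour [2] functest exampleFive [239]")

# Same tempered pattern as above, but the text after "functest" is captured.
# \s* consumes the space after "functest" so it is not part of the group.
pattern = r'functest\s*((?:(?!functest).)*?)239'

# With a single capturing group, findall returns the group, not the whole match.
matches = re.findall(pattern, s)
print(matches)  # ['exampleTwo [', 'exampleFive [']
```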
CodePudding user response:
If there cannot be any digits between functest and the first occurrence of 1 or 3, you can use:
\bfunctest\b\s*(\D*)[13]\b
The pattern matches:
\bfunctest\b\s* - Match the word functest followed by optional whitespace chars
(\D*) - Capture optional non-digits in group 1
[13] - Match either 1 or 3
\b - A word boundary
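As a quick check, the \D* pattern can be run against the string from the question. This is a minimal sketch; the variable names s and result are chosen here for illustration.

```python
import re

# Input string taken from the question.
s = ("functest exampleOne [2] functest exampleTwo [1] "
     "functest exampleOne throw [2] functest exampleThree [1]")

# (\D*) cannot cross a digit, so a "functest" followed by [2] never matches.
pattern = r"\bfunctest\b\s*(\D*)[13]\b"

result = re.findall(pattern, s)
print(result)  # ['exampleTwo [', 'exampleThree [']
```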
Or you can exclude matching the square brackets before matching a digit using a negated character class:
\bfunctest\b\s*([^][]*\[)[13]]
Example
import re
pattern = r"\bfunctest\b\s*([^][]*\[)239]"
s = "functest exampleOne [2] functest exampleTwo [239] functest exampleOne throw [2] functest exampleThree [1] functest exampleFour [2] functest exampleFive [239]"
print(re.findall(pattern, s))
Output
['exampleTwo [', 'exampleFive [']
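Since the question asks for the number to vary "based on need", the pattern could be built from a parameter. The sketch below assumes a hypothetical helper named find_before_number (not from the answer above) and uses re.escape so the number is inserted as literal text.

```python
import re

def find_before_number(text, number):
    """Return the text between 'functest' and '[number]', including the '['.

    find_before_number is a hypothetical helper name used for illustration.
    """
    # re.escape keeps the inserted number literal inside the pattern.
    pattern = r"\bfunctest\b\s*([^][]*\[)" + re.escape(str(number)) + r"]"
    return re.findall(pattern, text)

s = ("functest exampleOne [2] functest exampleTwo [239] "
     "functest exampleOne throw [2] functest exampleThree [1] "
     "functest exampleFour [2] functest exampleFive [239]")

print(find_before_number(s, 239))  # ['exampleTwo [', 'exampleFive [']
print(find_before_number(s, 2))    # ['exampleOne [', 'exampleOne throw [', 'exampleFour [']
```

Because the closing ] is part of the pattern, asking for 2 does not match [239].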