I'm trying to get the string between one or more pairs of strings. For example,
import re
string1 = 'oi sdfdsf a'
string2 = 'biu serdfd e'
pattern = '(oi|biu)(.*?)(a|e)'
substring = re.search(pattern, string1).group(1)
In this case I should get "sdfdsf" if I use string1 and "serdfd" if I use string2 in the search function. Instead I'm getting "oi" or "biu".
CodePudding user response:
If you put part of a pattern in parentheses, the regex captures whatever that part matches, and group(1) refers to the first such group. If you want to group some alternatives without capturing them, use a non-capturing group, '(?:...)'.
You can just change your pattern as below.
pattern = r'(?:oi|biu)[ \t]+(\w+)[ \t]+(?:a|e)'
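As a rough sketch of how that pattern could be used, assuming the intended form has `[ \t]+` (one or more spaces or tabs) around a single `(\w+)` capture group. The helper name `between` is made up for illustration and is not part of the question:

```python
import re

# Assumed corrected pattern: non-capturing groups for the delimiters,
# one capturing group for the word in the middle.
pattern = r'(?:oi|biu)[ \t]+(\w+)[ \t]+(?:a|e)'

def between(text):
    """Return the word between the delimiters, or None if nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(between('oi sdfdsf a'))   # sdfdsf
print(between('biu serdfd e'))  # serdfd
```

Because the delimiters are non-capturing, the middle word is the only group, so group(1) is what you want. Checking for None avoids an AttributeError when the string does not match.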
CodePudding user response:
You are placing capture groups around parts of your regex pattern which you don't really want to capture. Consider this version:
import re

inp = ['oi sdfdsf a', 'biu serdfd e']
for i in inp:
    # Only the middle (\S+) is a capturing group, so findall returns just that word.
    word = re.findall(r'\b(?:oi|biu) (\S+) (?:a|e)\b', i)[0]
    print(i + ' => ' + word)
Here we turn off the capture groups on the surrounding words on the left and right, and instead use a single capture group around the term you want to capture. This prints:
oi sdfdsf a => sdfdsf
biu serdfd e => serdfd
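To see why the pattern from the question returned "oi" or "biu", it may help to print every group it captures. This is only a small diagnostic sketch; the helper name `groups_of` is hypothetical:

```python
import re

# Pattern exactly as posted in the question: three capturing groups.
pattern = '(oi|biu)(.*?)(a|e)'

def groups_of(text):
    """Return all captured groups for the first match of the question's pattern."""
    return re.search(pattern, text).groups()

print(groups_of('oi sdfdsf a'))   # ('oi', ' sdfdsf ', 'a')
print(groups_of('biu serdfd e'))  # ('biu', ' s', 'e')
```

Group 1 is the left delimiter, which is what the question was printing. Switching to group(2) would not be enough either: the lazy `.*?` keeps the surrounding spaces and stops at the first "a" or "e" it can, which for the second string is the "e" inside "serdfd". That is why the versions above match the spaces explicitly and use `\S+` or `\w+` for the middle word.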